Create and deploy an agent¶
Agents start from a checkpoint on a finished (or checkpointed) experiment. The Create New Agent modal collects a name, a source experiment, and the checkpoint to freeze. Turning that record into a live endpoint is a separate step on the Agents table.
Before you open the modal¶
The New Agent button is enabled only when your org has at least one experiment with checkpoints in the active tab’s project type:
- Classic RL: non-advanced projects with checkpoints
- Advanced Training: advanced projects with checkpoints
If the button is greyed out, finish a run that writes checkpoints first, or switch tabs if your project is the other type.
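The enable rule above can be sketched as a small predicate. This is only an illustration: the `Experiment` fields and the tab identifiers `"classic_rl"` and `"advanced_training"` are hypothetical names, not part of the platform's API.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    is_advanced: bool      # True for Advanced Training projects (hypothetical field)
    checkpoint_count: int  # number of checkpoints written so far (hypothetical field)


def new_agent_enabled(experiments: list[Experiment], active_tab: str) -> bool:
    """Return True if some experiment with checkpoints matches the active tab."""
    want_advanced = active_tab == "advanced_training"
    return any(
        e.checkpoint_count > 0 and e.is_advanced == want_advanced
        for e in experiments
    )


experiments = [
    Experiment("cartpole-ppo", is_advanced=False, checkpoint_count=3),
    Experiment("sft-run", is_advanced=True, checkpoint_count=0),
]
print(new_agent_enabled(experiments, "classic_rl"))         # True
print(new_agent_enabled(experiments, "advanced_training"))  # False
```

In this example the Advanced Training tab stays disabled because its only experiment has not written a checkpoint yet.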
Step 1: Details¶
| Field | Required | Notes |
|---|---|---|
| Name | Yes | Trimmed on save |
| Description | No | Optional |
| Source experiment | Yes | Dropdown lists experiments that have checkpoints; Classic RL shows the gym environment name in parentheses, Advanced Training shows the dataset name |
You cannot go to step 2 without a name and experiment.
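As a rough sketch of the Step 1 rules (name required and trimmed, experiment required, description optional), the check might look like the following. The function name, the `experiment_id` parameter, and the returned dictionary shape are hypothetical, and treating a whitespace-only name as missing is an assumption.

```python
from typing import Optional


def validate_details(name: str, experiment_id: Optional[str], description: str = "") -> dict:
    """Validate Step 1 input and return the record that would be saved."""
    trimmed = name.strip()  # the name is trimmed on save
    errors = []
    if not trimmed:
        # Assumption: a name that is empty after trimming counts as missing.
        errors.append("name is required")
    if not experiment_id:
        errors.append("source experiment is required")
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "name": trimmed,
        "description": description,
        "experiment_id": experiment_id,
    }


print(validate_details("  my-agent  ", "exp-123"))
```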
The info panel on the right summarizes what an agent is: a chosen checkpoint you can deploy or (for Advanced Training) reuse as a training starting point.
Step 2: Checkpoint¶
Choose how to pick the weights:
| Type | UI label | Selection rule |
|---|---|---|
| best | Best checkpoint (highest evaluation score) | In the modal, the checkpoint with the highest evaluation score. |
| final | Final checkpoint (highest steps) | Checkpoint with the largest step count |
| specific | Specific checkpoint | You pick a row in the table (radio); table sorts by score or steps depending on type |
Training and evaluation score columns come from experiment metrics at each checkpoint step.
Two charts (training score and evaluation score) appear when metrics are available. They help compare steps before you commit.
For specific, the table is interactive. For best and final, the table is dimmed and the platform picks the row for you.
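The three selection rules can be written out as a short function. This is a minimal sketch, not the platform's implementation: the `Checkpoint` type and `select_checkpoint` are hypothetical, and skipping checkpoints that have no evaluation score for `best` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Checkpoint:
    step: int
    eval_score: Optional[float]  # None when no evaluation metric exists at this step


def select_checkpoint(checkpoints, kind, specific_step=None):
    """Pick a checkpoint by the 'best', 'final', or 'specific' rule."""
    if not checkpoints:
        raise ValueError("experiment has no checkpoints")
    if kind == "best":
        # Assumption: checkpoints without an evaluation score are ignored.
        scored = [c for c in checkpoints if c.eval_score is not None]
        if not scored:
            raise ValueError("no checkpoint has an evaluation score")
        return max(scored, key=lambda c: c.eval_score)
    if kind == "final":
        return max(checkpoints, key=lambda c: c.step)
    if kind == "specific":
        for c in checkpoints:
            if c.step == specific_step:
                return c
        raise ValueError(f"no checkpoint at step {specific_step}")
    raise ValueError(f"unknown checkpoint type: {kind}")


ckpts = [Checkpoint(1000, 0.42), Checkpoint(2000, 0.71), Checkpoint(3000, 0.65)]
print(select_checkpoint(ckpts, "best").step)            # 2000
print(select_checkpoint(ckpts, "final").step)           # 3000
print(select_checkpoint(ckpts, "specific", 1000).step)  # 1000
```

Here `best` and `final` disagree because the evaluation score peaks at step 2000, which is the kind of difference the two charts help you spot.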
What gets created¶
Behavior depends on which tab you were on when you opened the modal:
- Advanced Training
  - Creates a saved model with your name, description, source experiment, and checkpoint choice.
  - Does not start inference. Status stays Undeployed until you Connect on the Advanced Training table.
- Classic RL
  - Creates a deployment row for the chosen checkpoint.
  - The row exists immediately but stays Undeployed until you Connect.
Deploy (connect)¶
Creation and going live are separate steps.
You can also start the create flow from the Experiments tab: expand a finished experiment, then click Deploy checkpoint on a row in the Checkpoints table. That opens the same create-agent flow with that checkpoint's step pre-selected.
Classic RL¶
1. Create the agent (row appears, not connected).
2. Click Connect on the row.
3. Wait for status to move through Pending to Deployed (often around ninety seconds, longer for large checkpoints).
Advanced Training¶
1. Create the saved model.
2. Click Connect. If no deployment exists yet, the platform creates one and turns it on. If one exists, Connect enables it.
3. Wait for Deployed as above.
Disconnect turns inference off without deleting the agent record. Delete removes the agent (and linked deployment on Advanced Training).
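If you script around the "wait for Deployed" step, a polling loop with a timeout is one way to do it. The sketch below is an assumption-laden illustration: `wait_until_deployed` and the `get_status` callable are hypothetical, and how you actually read the status (UI or an API call) is not specified here. Only the status names Undeployed, Pending, and Deployed come from this page.

```python
import time
from typing import Callable


def wait_until_deployed(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns 'Deployed' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "Deployed":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Demo with a fake status source so the sketch runs without a network call.
fake_statuses = iter(["Undeployed", "Pending", "Pending", "Deployed"])
print(wait_until_deployed(lambda: next(fake_statuses), sleep=lambda s: None))  # Deployed
```

The default timeout is deliberately longer than the typical ninety seconds, since large checkpoints take longer to come up.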
Algorithms and inference paths¶
| Algorithm | Create saved model | Live inference (after Connect) |
|---|---|---|
| Supervised | Yes | Yes; Manual HTTP uses predict |
| LatentPPO | Yes | Yes; Manual HTTP uses predict |
| LLM family (GRPO, DPO, SFT, …) | Yes | Yes; snippets use generate; chat playground on LLM deployments |
| Classic RL (PPO, DQN, …) | N/A (Classic tab only) | Yes; snippets use get_action |
For path suffixes and statuses, see Inference contract.
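The table's mapping from algorithm to inference call can be captured in a small lookup, which may help when generating client snippets. The function and set names are hypothetical, and the LLM and Classic RL sets contain only the algorithms named in the table, which lists them as partial.

```python
# Partial sets: the table names these algorithms and indicates there are more.
LLM_FAMILY = {"GRPO", "DPO", "SFT"}
CLASSIC_RL = {"PPO", "DQN"}


def inference_method(algorithm: str) -> str:
    """Return the inference call name the table associates with an algorithm."""
    if algorithm in ("Supervised", "LatentPPO"):
        return "predict"
    if algorithm in LLM_FAMILY:
        return "generate"
    if algorithm in CLASSIC_RL:
        return "get_action"
    raise KeyError(f"algorithm not covered by this sketch: {algorithm}")


for algo in ("Supervised", "GRPO", "DQN"):
    print(algo, "->", inference_method(algo))
```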
Checkpoint prerequisites¶
Experiments without checkpoints never appear in the source dropdown. If you just finished training, refresh the Agents table or wait for checkpoints to appear before creating an agent.
Pipelines can auto-create saved models when a stage succeeds with auto-add enabled. Those rows still need Connect if you want HTTP inference.