Create and deploy an agent¶

An agent starts from a checkpoint of an experiment that has finished or has saved at least one checkpoint. The Create New Agent modal collects a name, a source experiment, and the checkpoint to freeze. Turning that record into a live endpoint is a separate step, done from the Agents table.

Step 1: Details¶

Field	Required	Notes
Name	Yes	Trimmed on save
Description	No	Optional
Source experiment	Yes	Dropdown lists experiments that have checkpoints; Classic RL shows the gym environment name in parentheses, Advanced Training shows the dataset name

You cannot proceed to step 2 without a name and a source experiment.
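The Step 1 rules can be sketched as a small validation function. This is an illustration only: `AgentDetails` and `validate_details` are hypothetical names, not part of the platform, and treating a whitespace-only name as missing is an assumption based on the "trimmed on save" note.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentDetails:
    """Hypothetical container for the Step 1 form fields."""
    name: str
    source_experiment: Optional[str]
    description: str = ""  # optional field


def validate_details(details: AgentDetails) -> List[str]:
    """Return the reasons (if any) that block moving on to step 2."""
    problems = []
    # The name is trimmed on save; assume a whitespace-only name counts as missing.
    if not details.name.strip():
        problems.append("name is required")
    if not details.source_experiment:
        problems.append("source experiment is required")
    return problems
```

An empty list means both required fields are present; the description never blocks progress.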

The info panel on the right summarizes what an agent is: a chosen checkpoint you can deploy or (for Advanced Training) reuse as a training starting point.

Step 2: Checkpoint¶

Choose how to pick the weights:

Type	UI label	Selection rule
best	Best checkpoint (highest evaluation score)	Checkpoint with the highest evaluation score
final	Final checkpoint (highest steps)	Checkpoint with the largest step count
specific	Specific checkpoint	You pick a row in the table (radio); table sorts by score or steps depending on type

Training and evaluation score columns come from experiment metrics at each checkpoint step.

Two charts (training score and evaluation score) appear when metrics are available. They help compare steps before you commit.

For specific, the table is interactive. For best and final, the table is dimmed and the platform picks the row for you.
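The three selection rules can be written out as a short function. This is a minimal sketch, not the platform's implementation: `Checkpoint` and `select_checkpoint` are hypothetical names, and tie-breaking (the first row wins when scores or steps are equal) is an assumption the source does not address.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Checkpoint:
    """Hypothetical checkpoint row: a step count and its evaluation score."""
    step: int
    eval_score: float


def select_checkpoint(checkpoints: List[Checkpoint], kind: str,
                      specific_step: Optional[int] = None) -> Checkpoint:
    """Pick a checkpoint following the best / final / specific rules."""
    if not checkpoints:
        raise ValueError("experiment has no checkpoints")
    if kind == "best":
        # Highest evaluation score.
        return max(checkpoints, key=lambda c: c.eval_score)
    if kind == "final":
        # Largest step count.
        return max(checkpoints, key=lambda c: c.step)
    if kind == "specific":
        # The user names the row; here it is identified by its step.
        for c in checkpoints:
            if c.step == specific_step:
                return c
        raise ValueError(f"no checkpoint at step {specific_step}")
    raise ValueError(f"unknown checkpoint type: {kind}")
```

Note that best and final can return different rows whenever the evaluation score peaks before the last step.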

What gets created¶

Behavior depends on which tab you were on when you opened the modal:

Advanced Training

Creates a saved model with your name, description, source experiment, and checkpoint choice.
Does not start inference. Status stays Undeployed until you Connect on the Advanced Training table.

Classic RL

Creates a deployment row for the chosen checkpoint.
The row exists immediately but stays Undeployed until you Connect.

Deploy (connect)¶

You can also start from the Experiments tab: expand a finished experiment, then click Deploy checkpoint on a row in the Checkpoints table. That opens the same create-agent flow with that checkpoint's step pre-selected.

Creation and going live are separate steps.

Classic RL¶

Create the agent (row appears, not connected).
Click Connect on the row.
Wait for the status to move from Pending to Deployed (often about 90 seconds, longer for large checkpoints).
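If you script the wait instead of watching the table, a polling loop with a timeout is one way to do it. This is a sketch under assumptions: `get_status` is a hypothetical callable you supply (the source does not describe how to read the status programmatically), and the timeout and polling interval are arbitrary defaults. Only the status names Pending and Deployed come from this page; any failure statuses would need their own handling.

```python
import time
from typing import Callable


def wait_until_deployed(get_status: Callable[[], str],
                        timeout_s: float = 300.0,
                        poll_interval_s: float = 5.0) -> None:
    """Poll until the status is Deployed, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()  # hypothetical: returns e.g. "Pending" or "Deployed"
        if status == "Deployed":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        time.sleep(poll_interval_s)
```

The default timeout is set well above the typical ninety seconds to leave room for large checkpoints.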

Advanced Training¶

Create the saved model.
Click Connect. If no deployment exists yet, the platform creates one and turns it on. If one exists, Connect enables it.
Wait for the status to reach Deployed, as in the Classic RL flow.

Disconnect turns inference off without deleting the agent record. Delete removes the agent and, on Advanced Training, its linked deployment.
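The Advanced Training lifecycle described above (create, Connect, Disconnect, Delete) can be modeled with a toy in-memory registry. Everything here is hypothetical and exists only to make the state changes explicit: `SavedModelRegistry` and `Deployment` are illustrative names, and the real platform tracks more state, including the Pending status.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Deployment:
    """Hypothetical deployment record linked to a saved model."""
    enabled: bool = False


class SavedModelRegistry:
    """Toy in-memory stand-in for the Advanced Training table."""

    def __init__(self) -> None:
        # Maps saved-model name -> linked deployment (None until first Connect).
        self.models: Dict[str, Optional[Deployment]] = {}

    def create(self, name: str) -> None:
        # Creating a saved model does not start inference.
        self.models[name] = None

    def connect(self, name: str) -> None:
        # First Connect creates the deployment; later ones re-enable it.
        if self.models[name] is None:
            self.models[name] = Deployment()
        self.models[name].enabled = True

    def disconnect(self, name: str) -> None:
        # Inference goes off; the record and its deployment are kept.
        deployment = self.models[name]
        if deployment is not None:
            deployment.enabled = False

    def delete(self, name: str) -> None:
        # Removes the saved model together with its linked deployment.
        del self.models[name]
```

Reconnecting after a Disconnect reuses the existing deployment rather than creating a second one, which matches the "If one exists, Connect enables it" rule.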

Algorithms and inference paths¶

Algorithm	Create saved model	Live inference (after Connect)
Supervised	Yes	Yes; Manual HTTP uses predict
LatentPPO	Yes	Yes; Manual HTTP uses predict
LLM family (GRPO, DPO, SFT, …)	Yes	Yes; snippets use generate; chat playground on LLM deployments
Classic RL (PPO, DQN, …)	N/A (Classic tab only)	Yes; snippets use get_action

For path suffixes and statuses, see Inference contract.
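When writing client code against several deployments, it can help to keep the table above as a lookup. The sketch below covers only the algorithms named explicitly in the table; the "…" entries are left out because this page does not list them. The dictionary and the `inference_method` helper are hypothetical conveniences, not a platform API.

```python
# Inference method per algorithm, copied from the table above.
INFERENCE_METHOD = {
    "Supervised": "predict",
    "LatentPPO": "predict",
    "GRPO": "generate",
    "DPO": "generate",
    "SFT": "generate",
    "PPO": "get_action",
    "DQN": "get_action",
}


def inference_method(algorithm: str) -> str:
    """Return the inference method name for a listed algorithm."""
    try:
        return INFERENCE_METHOD[algorithm]
    except KeyError:
        # Algorithms not named on this page are deliberately not guessed.
        raise ValueError(f"no inference method listed for {algorithm!r}") from None
```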

Checkpoint prerequisites¶

Experiments without checkpoints never appear in the source dropdown. If you just finished training, refresh the Agents table or wait for checkpoints to appear before creating an agent.

Pipelines can auto-create saved models when a stage succeeds with auto-add enabled. Those rows still need Connect if you want HTTP inference.