Create and deploy an agent

Agents start from a checkpoint on a finished (or checkpointed) experiment. Create New Agent collects a name, source experiment, and which checkpoint to freeze. Turning that record into a live endpoint is a separate step on the Agents table.

Before you open the modal

The New Agent button is enabled only when your org has at least one experiment with checkpoints in the active tab’s project type:

  • Classic RL: non-advanced projects with checkpoints

  • Advanced Training: advanced projects with checkpoints

If the button is greyed out, finish a run that writes checkpoints first, or switch tabs if your project is the other type.
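The enablement rule above can be sketched as a filter over the org's experiments. This is a minimal illustration, not the platform's code: the `Experiment` record, its field names, and the tab labels `"classic"` and `"advanced"` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    # Hypothetical record; field names are illustrative only.
    name: str
    advanced: bool         # True for an Advanced Training project
    checkpoint_count: int  # number of checkpoints written so far


def new_agent_enabled(experiments, active_tab):
    """Return True if at least one experiment of the tab's type has checkpoints."""
    want_advanced = active_tab == "advanced"
    return any(
        e.advanced == want_advanced and e.checkpoint_count > 0
        for e in experiments
    )
```

Under these assumptions, an org whose only checkpointed experiment is an Advanced Training one would see the button enabled on the Advanced Training tab and greyed out on Classic RL.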

Step 1: Details

| Field | Required | Notes |
| --- | --- | --- |
| Name | Yes | Trimmed on save |
| Description | No | Optional |
| Source experiment | Yes | Dropdown lists experiments that have checkpoints; Classic RL shows the gym environment name in parentheses, Advanced Training shows the dataset name |

You cannot go to step 2 without a name and experiment.
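The Details rules (name trimmed on save, name and source experiment required) could be expressed roughly as follows. The function name, dictionary keys, and return shape are assumptions made for illustration; the modal's real validation is not documented here.

```python
def validate_details(name, source_experiment, description=""):
    """Return (cleaned_fields, missing) for the Details step.

    Hypothetical helper: `missing` lists the required fields that are empty,
    so an empty list means the user may continue to step 2.
    """
    cleaned = {
        "name": name.strip(),  # the name is trimmed on save
        "description": description,
        "source_experiment": source_experiment,
    }
    missing = [
        field for field in ("name", "source_experiment") if not cleaned[field]
    ]
    return cleaned, missing
```

Note that in this sketch a name made only of whitespace counts as missing, because it is empty after trimming. Whether the modal behaves the same way is an assumption.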

The info panel on the right summarizes what an agent is: a chosen checkpoint you can deploy or (for Advanced Training) reuse as a training starting point.

Step 2: Checkpoint

Choose how to pick the weights:

| Type | UI label | Selection rule |
| --- | --- | --- |
| best | Best checkpoint (highest evaluation score) | In the modal, the checkpoint with the highest evaluation score |
| final | Final checkpoint (highest steps) | Checkpoint with the largest step count |
| specific | Specific checkpoint | You pick a row in the table (radio); table sorts by score or steps depending on type |

Training and evaluation score columns come from experiment metrics at each checkpoint step.

Two charts (training score and evaluation score) appear when metrics are available. They help compare steps before you commit.

For specific, the table is interactive. For best and final, the table is dimmed and the platform picks the row for you.
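The three selection rules can be summarized in a short sketch. The `Checkpoint` record and `pick_checkpoint` function are hypothetical names, and skipping checkpoints that have no evaluation score for `best` is an assumption, not documented platform behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Checkpoint:
    # Hypothetical record mirroring the columns in the checkpoint table.
    step: int
    eval_score: Optional[float]  # None when no evaluation metric exists at this step


def pick_checkpoint(checkpoints, kind, specific_step=None):
    """Resolve a checkpoint for kind 'best', 'final', or 'specific'."""
    if not checkpoints:
        raise ValueError("experiment has no checkpoints")
    if kind == "best":
        # Assumption: checkpoints without an evaluation score are ignored.
        scored = [c for c in checkpoints if c.eval_score is not None]
        if not scored:
            raise ValueError("no checkpoint has an evaluation score")
        return max(scored, key=lambda c: c.eval_score)
    if kind == "final":
        return max(checkpoints, key=lambda c: c.step)
    if kind == "specific":
        for c in checkpoints:
            if c.step == specific_step:
                return c
        raise ValueError(f"no checkpoint at step {specific_step}")
    raise ValueError(f"unknown checkpoint type: {kind}")
```

The sketch makes one practical point visible: `best` and `final` can resolve to different rows whenever the evaluation score peaks before the last step, which is what the two charts help you spot.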

What gets created

Behavior depends on which tab you were on when you opened the modal:

Advanced Training

  • Creates a saved model with your name, description, source experiment, and checkpoint choice.

  • Does not start inference. Status stays Undeployed until you Connect on the Advanced Training table.

Classic RL

  • Creates a deployment row for the chosen checkpoint.

  • The row exists immediately but stays Undeployed until you Connect.

Deploy (connect)

Creation and going live are separate steps.

You can also start from the Experiments tab: expand a finished experiment, then click Deploy checkpoint on a row in the Checkpoints table. That opens the same create-agent flow with that checkpoint's step pre-selected.

Classic RL

  1. Create the agent (row appears, not connected).

  2. Click Connect on the row.

  3. Wait for status to move through Pending to Deployed (often around ninety seconds, longer for large checkpoints).

Advanced Training

  1. Create the saved model.

  2. Click Connect. If no deployment exists yet, the platform creates one and turns it on. If one exists, Connect enables it.

  3. Wait for Deployed as above.

Disconnect turns inference off without deleting the agent record. Delete removes the agent (and linked deployment on Advanced Training).
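If you script around the Connect step, waiting for Deployed amounts to polling a status until it changes. The sketch below assumes a caller-supplied `get_status` function that returns the row's status as a string such as "Undeployed", "Pending", or "Deployed"; how you obtain that status is not specified here, and the timeout and poll interval are arbitrary illustrative defaults.

```python
import time


def wait_until_deployed(get_status, timeout_s=300.0, poll_s=5.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns 'Deployed' or the timeout elapses.

    get_status is a hypothetical callable supplied by the caller.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "Deployed":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(poll_s)
```

A timeout well above ninety seconds leaves room for large checkpoints, which the steps above note can take longer.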

Algorithms and inference paths

| Algorithm | Create saved model | Live inference (after Connect) |
| --- | --- | --- |
| Supervised | Yes | Yes; Manual HTTP uses predict |
| LatentPPO | Yes | Yes; Manual HTTP uses predict |
| LLM family (GRPO, DPO, SFT, …) | Yes | Yes; snippets use generate; chat playground on LLM deployments |
| Classic RL (PPO, DQN, …) | N/A (Classic tab only) | Yes; snippets use get_action |

For path suffixes and statuses, see Inference contract.
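As a quick reference, the table can be read as a lookup from algorithm to inference operation. The sketch below covers only the algorithm names listed explicitly in the table; the function name is hypothetical, and the families marked with "…" contain members that are not enumerated here, so unknown names return `None` rather than a guess.

```python
# Algorithm names taken from the table; the groupings are illustrative.
PREDICT_ALGORITHMS = {"Supervised", "LatentPPO"}
LLM_ALGORITHMS = {"GRPO", "DPO", "SFT"}      # the LLM family has more members
CLASSIC_RL_ALGORITHMS = {"PPO", "DQN"}       # the Classic RL family has more members


def inference_operation(algorithm):
    """Return 'predict', 'generate', or 'get_action', or None if not listed."""
    if algorithm in PREDICT_ALGORITHMS:
        return "predict"
    if algorithm in LLM_ALGORITHMS:
        return "generate"
    if algorithm in CLASSIC_RL_ALGORITHMS:
        return "get_action"
    return None
```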

Checkpoint prerequisites

Experiments without checkpoints never appear in the source dropdown. If you just finished training, refresh the Agents table or wait for checkpoints to appear before creating an agent.

Pipelines can auto-create saved models when a stage succeeds with auto-add enabled. Those rows still need Connect if you want HTTP inference.

See also