Agents

The Agents page in the sidebar turns a trained checkpoint into something you can call over HTTP. Two tabs split the list:

Tab                 Use for
Classic RL          Gym and PettingZoo checkpoints
Advanced Training   LLM and supervised deployments

Classic RL and Advanced Training list different kinds of agents. Checkpoints from one project type do not appear on the other tab.

Classic RL tab

Classic RL agents come from Gym (or PettingZoo) experiments. The table lists agents tied to those runs.

Column              Meaning
Agent               Deployment name
Description         Optional text
Source Experiment   Experiment the checkpoint came from
Status              Undeployed, Pending, Deployed, or Failed (see Inference contract)
Actions             Connect / Disconnect, Delete, expand chevron when deployed

Connect starts inference for that row. Disconnect turns the endpoint off without deleting the agent. Delete removes the agent from the table.

Expand a row when status is Deployed and the agent is connected. The panel shows API key, endpoint URL, code snippets, and (for supported algorithms) a chat playground.
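As a rough illustration of calling a deployed agent over HTTP, the Python sketch below builds (but does not send) a request from the two values the panel shows. The example URL, the `Authorization: Bearer` header, and the `observation` payload field are assumptions made for illustration. The snippets in the expanded panel remain the authoritative version for your agent.

```python
import json
import urllib.request


def build_inference_request(endpoint_url, api_key, payload):
    """Build a POST request for a deployed agent (nothing is sent here)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is passed as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Placeholder values; copy the real ones from the expanded row.
request = build_inference_request(
    "https://example.invalid/agents/my-agent",
    "YOUR_API_KEY",
    {"observation": [0.0, 0.1, -0.2, 0.3]},  # hypothetical payload shape
)
# To send it: urllib.request.urlopen(request, timeout=10)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before pointing the call at a live endpoint.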

Creating an agent from this tab opens Create New Agent in classic mode. The new row shows status Undeployed until you click Connect.
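The row behavior described in this section can be summarized in a small Python sketch. It is only a model of the rules stated above, not the application's code; the `Status` enum and `row_actions` function are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    UNDEPLOYED = "Undeployed"
    PENDING = "Pending"
    DEPLOYED = "Deployed"
    FAILED = "Failed"


def row_actions(status, connected):
    """Return the actions a Classic RL row would offer (illustrative only)."""
    # Connect and Disconnect share one slot; Delete is always listed.
    actions = ["Disconnect" if connected else "Connect", "Delete"]
    # The expand chevron needs both a Deployed status and a live connection.
    if status is Status.DEPLOYED and connected:
        actions.append("Expand")
    return actions


new_row = row_actions(Status.UNDEPLOYED, connected=False)
live_row = row_actions(Status.DEPLOYED, connected=True)
```

Here `new_row` models a freshly created agent and `live_row` models a connected, deployed one.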

Advanced Training tab

Advanced Training agents are saved models from dataset- or LLM-based experiments. The table adds a Base Model column when the saved model records a pretrained path.

Connect works in two cases:

  • No deployment yet: Connect creates the inference deployment and turns it on.

  • Deployment already exists: Connect / Disconnect toggles whether it is live.
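The two cases above can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Deployment` and `AdvancedAgent` classes and the `toggle_connection` function are hypothetical and do not reflect the product's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deployment:
    live: bool


@dataclass
class AdvancedAgent:
    name: str
    deployment: Optional[Deployment] = None


def toggle_connection(agent):
    """Mimic the Connect / Disconnect button on an Advanced Training row."""
    if agent.deployment is None:
        # Case 1: no deployment yet, so Connect creates one and turns it on.
        agent.deployment = Deployment(live=True)
    else:
        # Case 2: a deployment exists, so the button flips whether it is live.
        agent.deployment.live = not agent.deployment.live
    return agent.deployment.live


agent = AdvancedAgent("my-saved-model")
first = toggle_connection(agent)   # creates the deployment, now live
second = toggle_connection(agent)  # turns it off, the deployment is kept
```

Note that the second call only switches the deployment off; the deployment record itself stays attached to the agent.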

Expand a row when the linked deployment is Deployed. The panel shows API key, endpoint, and snippets.

Saved models from Advanced Training can also be picked as decoders when you chain LatentPPO in a pipeline.

Create New Agent

Use New Agent in the top bar. The button stays disabled until at least one experiment in the current tab’s project type has checkpoints. The tooltip names which project type is missing data.
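The enablement rule for the button might be modeled like this in Python. The dictionary keys, the project type labels, and the tooltip wording are all assumptions chosen for the example.

```python
def new_agent_enabled(experiments, project_type):
    """True if any experiment of the tab's project type has checkpoints."""
    return any(
        exp["project_type"] == project_type and bool(exp["checkpoints"])
        for exp in experiments
    )


def disabled_tooltip(project_type):
    # Hypothetical wording; the real tooltip text may differ.
    return f"No {project_type} experiments with checkpoints yet"


# Hypothetical data: one Classic RL run with a checkpoint,
# one Advanced Training run without any.
experiments = [
    {"project_type": "classic_rl", "checkpoints": ["step_1000"]},
    {"project_type": "advanced", "checkpoints": []},
]

classic_enabled = new_agent_enabled(experiments, "classic_rl")
advanced_enabled = new_agent_enabled(experiments, "advanced")
```

With this data, the button would be enabled on the Classic RL tab and disabled on the Advanced Training tab, where the tooltip would name the missing project type.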

The modal has two steps: Details, then Checkpoint. See Create and deploy an agent.

LatentPPO

LatentPPO appears on the Advanced Training tab. Create a saved model from a LatentPPO experiment, then Connect like other advanced agents. When status is Deployed, Manual HTTP snippets use the predict path (same family as Supervised). There is no chat playground.

In pipelines, a prior stage’s checkpoint can still feed a LatentPPO stage as the decoder. See Pipelines.

Inference contract

Deployment statuses, which snippet tab to use, and endpoint path suffixes are in Inference contract. Start there when a row stays Pending or moves to Failed.

See also