Agents

Agents in the sidebar turns a trained checkpoint into something you can call over HTTP. Two tabs split the list:

Tab	Use for
Classic RL	Gym and PettingZoo checkpoints
Advanced Training	LLM and supervised deployments

The two tabs list different kinds of agents: checkpoints from one project type do not appear on the other tab.

Classic RL tab

Classic RL agents come from Gym or PettingZoo experiments. The table lists the agents tied to those runs.

Column	Meaning
Agent	Deployment name
Description	Optional text
Source Experiment	Experiment the checkpoint came from
Status	Undeployed, Pending, Deployed, or Failed (see Inference contract)
Actions	Connect / Disconnect, Delete, expand chevron when deployed

Connect starts inference for that row. Disconnect turns the endpoint off without deleting the agent. Delete removes the agent from the table.

Expand a row when status is Deployed and the agent is connected. The panel shows API key, endpoint URL, code snippets, and (for supported algorithms) a chat playground.
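As a rough illustration of how the API key and endpoint URL from the expanded panel fit together, the sketch below builds a JSON POST request with Python's standard library. The URL, the bearer-token header, and the `observation` payload key are all assumptions made for this example; the snippets in the panel and the Inference contract are the authority on the real request shape.

```python
import json
import urllib.request

# Placeholders: copy the real values from the expanded row's panel.
ENDPOINT_URL = "https://example.invalid/agents/my-agent"  # hypothetical URL
API_KEY = "YOUR_API_KEY"                                  # hypothetical key


def build_inference_request(endpoint_url, api_key, payload):
    """Build (but do not send) a JSON POST request for a deployed agent."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. The real header name may differ.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Assumption: the agent accepts an "observation" list in the request body.
request = build_inference_request(
    ENDPOINT_URL, API_KEY, {"observation": [0.0, 0.1, 0.2, 0.3]}
)

# Sending is left commented out because the URL above is a placeholder:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes it easy to inspect the headers and body before pointing the call at a live endpoint.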

Creating an agent from this tab opens Create New Agent in classic mode. The new row shows status Undeployed until you click Connect.

Advanced Training tab

Advanced Training agents are saved models from dataset- or LLM-based experiments. The table adds a Base Model column when the saved model records a pretrained path.

Connect works in two cases:

No deployment yet: Connect creates the inference deployment and turns it on.
Deployment already exists: Connect / Disconnect toggles whether it is live.
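The two cases above can be modeled as a small toggle. This is only a conceptual sketch: the `SavedModel` and `Deployment` classes and the `toggle_connection` function are hypothetical names invented for illustration, not part of the product's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deployment:
    live: bool  # whether the endpoint is currently serving requests


@dataclass
class SavedModel:
    name: str
    deployment: Optional[Deployment] = None  # None means never deployed


def toggle_connection(model: SavedModel) -> str:
    """Mimic the Connect / Disconnect button for one Advanced Training row."""
    if model.deployment is None:
        # Case 1: no deployment yet, so Connect creates it and turns it on.
        model.deployment = Deployment(live=True)
        return "created"
    # Case 2: a deployment exists, so the button only flips whether it is live.
    model.deployment.live = not model.deployment.live
    return "connected" if model.deployment.live else "disconnected"


model = SavedModel(name="my-saved-model")
first = toggle_connection(model)   # creates the deployment
second = toggle_connection(model)  # turns it off, deployment is kept
third = toggle_connection(model)   # turns it back on
```

In this model, disconnecting never discards the deployment record, which matches the description of Connect / Disconnect as a toggle on an existing deployment.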

Expand a row when the linked deployment is Deployed. The panel shows API key, endpoint, and snippets.

Saved models from Advanced Training can also be picked as decoders when you chain LatentPPO in a pipeline.

Create New Agent

Use New Agent in the top bar. The button stays disabled until at least one experiment in the current tab’s project type has checkpoints. The tooltip names which project type is missing data.
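The enablement rule can be read as a simple predicate over the experiments in the current tab's project type. The sketch below is illustrative only: the `Experiment` class, the project-type labels, and the tooltip wording are assumptions, not the product's actual data model or text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Experiment:
    project_type: str      # hypothetical labels, e.g. "classic_rl"
    checkpoint_count: int


def new_agent_button_state(
    experiments: List[Experiment], tab_project_type: str
) -> Tuple[bool, Optional[str]]:
    """Return (enabled, tooltip) for the New Agent button on one tab."""
    enabled = any(
        e.project_type == tab_project_type and e.checkpoint_count > 0
        for e in experiments
    )
    # Assumption: the tooltip only appears while the button is disabled.
    tooltip = (
        None
        if enabled
        else f"No {tab_project_type} experiments with checkpoints yet"
    )
    return enabled, tooltip


experiments = [
    Experiment(project_type="classic_rl", checkpoint_count=3),
    Experiment(project_type="advanced_training", checkpoint_count=0),
]
classic_state = new_agent_button_state(experiments, "classic_rl")
advanced_state = new_agent_button_state(experiments, "advanced_training")
```

Here the Classic RL tab would enable the button, while the Advanced Training tab would keep it disabled because its only experiment has no checkpoints.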

The modal has two steps: Details, then Checkpoint. See Create and deploy an agent.

LatentPPO

LatentPPO appears on the Advanced Training tab. Create a saved model from a LatentPPO experiment, then Connect like other advanced agents. When status is Deployed, Manual HTTP snippets use the predict path (same family as Supervised). There is no chat playground.

In pipelines, a prior stage’s checkpoint can still feed a LatentPPO stage as the decoder. See Pipelines.

Inference contract

Deployment statuses, which snippet tab to use, and endpoint path suffixes are in Inference contract. Start there when a row stays Pending or moves to Failed.