Supervised training
Some Advanced Training experiments minimize a supervised loss on labeled data instead of running a full RL loop. For these experiments, the Agent step exposes the algorithms listed below.
Enterprise only
Supervised and LatentPPO run on tabular and non-tabular datasets, which require an Enterprise plan on the organization. SFT runs on language datasets and only needs a plan that includes Advanced Training. See Plan permissions.
Algorithms on Agent

| Name | Typical use |
|---|---|
| Supervised | Tabular or non-tabular datasets with input and target columns |
| SFT | Language SFT datasets with prompt and target columns |
| LatentPPO | Latent module trained between pretrained blocks; shares much of the supervised wizard flow |
Which dataset unlocks which algorithm

| Dataset on Environment | Algorithms you usually see |
|---|---|
| Tabular | Supervised only |
| Non-tabular object detection (with a prior Supervised saved model) | Supervised and LatentPPO |
| Other non-tabular | Supervised |
| SFT | SFT |
| Reasoning or preference | RL-style options in LLM algorithms, not this page |
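The dataset-to-algorithm mapping above can be read as a simple lookup. The sketch below is illustrative only: the function name, the dataset labels (`"tabular"`, `"object_detection"`, and so on), and the `has_supervised_saved_model` flag are hypothetical and not part of the product's API. It also assumes that an object detection dataset without a prior Supervised saved model offers Supervised only, which the table implies but does not state.

```python
from typing import List


def usual_algorithms(dataset_kind: str, has_supervised_saved_model: bool = False) -> List[str]:
    """Return the algorithms the table above lists for a dataset kind.

    All labels here are hypothetical names chosen for this sketch.
    """
    if dataset_kind == "tabular":
        return ["Supervised"]
    if dataset_kind == "object_detection":
        # Assumption: without a prior Supervised saved model, only Supervised is offered.
        if has_supervised_saved_model:
            return ["Supervised", "LatentPPO"]
        return ["Supervised"]
    if dataset_kind == "other_non_tabular":
        return ["Supervised"]
    if dataset_kind == "sft":
        return ["SFT"]
    if dataset_kind in ("reasoning", "preference"):
        # RL-style options are covered under LLM algorithms, not on this page.
        return []
    raise ValueError(f"unknown dataset kind: {dataset_kind!r}")
```

For example, `usual_algorithms("object_detection", has_supervised_saved_model=True)` returns both Supervised and LatentPPO, matching the second table row.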
Supervised and SFT still use the full wizard: Agent (algorithm and network), Environment (dataset binding), Training (steps, batch size, optimizer-related fields), and Resources when you need specific compute.
Metrics on Results
Default charts depend on dataset task type and algorithm:
| Algorithm | Typical default training charts |
|---|---|
| Supervised (classification) | Training score (accuracy), evaluation fitness, loss, validation loss, steps |
| Supervised (object detection) | Training score (mean IoU), pixel accuracy, validation pixel accuracy, loss, validation loss, steps |
| LatentPPO (object detection) | Composite reward (training score), evaluation mean IoU, loss decomposition ( |
Score vs fitness: for Supervised, the score chart tracks the training fitness metric (accuracy, mean IoU, MSE). For LatentPPO, score is the weighted composite reward during training; use fitness for validation mean IoU.
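To make the score and fitness quantities concrete, here is a minimal sketch of the metrics named above using their textbook definitions. It is not the platform's implementation: the function names, the box format `(x_min, y_min, x_max, y_max)`, the one-to-one pairing of predicted and target boxes, and the reward term names and weights are all assumptions made for illustration. The product may compute IoU on masks or weight its reward terms differently.

```python
from typing import Dict, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def accuracy(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of predictions that equal their target label."""
    if len(predicted) != len(target) or not target:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(p == t for p, t in zip(predicted, target)) / len(target)


def mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Average squared difference between predictions and targets."""
    if len(predicted) != len(target) or not target:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(predicted: Sequence[Box], target: Sequence[Box]) -> float:
    """Mean IoU over boxes paired by position (pairing is an assumption)."""
    if len(predicted) != len(target) or not target:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(box_iou(p, t) for p, t in zip(predicted, target)) / len(target)


def composite_reward(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of reward terms; term names and weights are hypothetical."""
    return sum(weights[name] * value for name, value in terms.items())
```

Under these definitions, a Supervised score chart would plot something like `accuracy`, `mean_iou`, or `mean_squared_error` on training batches, while a LatentPPO score chart would plot a value like `composite_reward`, with validation mean IoU reported separately as fitness.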
After training, deploy Supervised or LatentPPO checkpoints from Agents → Advanced Training (Connect, then predict snippets). See Create and deploy an agent.