Training
Training settings come from the experiment wizard and show up again on the project Results tab when you compare runs. What you can configure depends on whether the project is Classic RL or Advanced Training.
Classic RL vs Advanced Training
| | Classic RL | Advanced Training |
|---|---|---|
| Environment step | Gym environment (built-in or custom) | Dataset or LLM simulation environment |
| Agent step | Classic algorithms (PPO, DQN, …) | LLM, supervised, or simulation algorithms |
| Training step | Hyperparameters for the chosen algorithm | Fields vary by algorithm and dataset |
| Optional HPO step | Evolution and tournament options | Same when enabled |
Classic projects use Classic RL algorithms. Advanced projects use LLM algorithms or Supervised training depending on the dataset or environment you picked.
Wizard steps (in order)
1. Resources: compute class and node count (Advanced Training often needs a GPU before the Agent step unlocks).
2. Environment: gym env or dataset attachment.
3. Agent: algorithm, network architecture, algorithm-specific options.
4. Training: steps, batch sizes, learning rates, and related knobs.
5. HPO: optional evolutionary search and mutation ranges. See How evolutionary hyperparameter optimization works for the population diagram and selection loop.
Changing the algorithm on Agent reloads defaults for that algorithm and environment combination.
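The reload behavior can be pictured as a lookup keyed by the algorithm and environment pair. The sketch below is only an illustration, not the product's implementation: `DEFAULTS`, `WizardState`, `set_algorithm`, the environment name, and every numeric value are hypothetical placeholders, and it assumes that reloading replaces any edits made under the previous algorithm.

```python
from dataclasses import dataclass, field

# Hypothetical defaults table keyed by (algorithm, environment).
# The values are placeholders for illustration, not the product's real defaults.
DEFAULTS = {
    ("PPO", "CartPole-v1"): {"learning_rate": 3e-4, "batch_size": 64},
    ("DQN", "CartPole-v1"): {"learning_rate": 1e-3, "batch_size": 32},
}


@dataclass
class WizardState:
    """Hypothetical model of the Environment, Agent, and Training steps."""

    environment: str
    algorithm: str
    training: dict = field(default_factory=dict)

    def set_algorithm(self, algorithm: str) -> None:
        """Switch algorithm and reload the Training defaults for the new pair."""
        self.algorithm = algorithm
        # Copy the entry so later edits do not mutate the shared defaults table.
        self.training = dict(DEFAULTS[(algorithm, self.environment)])


state = WizardState(environment="CartPole-v1", algorithm="PPO")
state.set_algorithm("PPO")
state.training["learning_rate"] = 1e-5  # manual edit on the Training step
state.set_algorithm("DQN")              # edit is replaced by the DQN defaults
```

Under this assumption, a value tuned for one algorithm does not carry over when the algorithm changes, so it is worth re-checking the Training step after any change on Agent.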
Compare settings across runs
On the project page, open the Results tab and select two or more experiments with the eye icon. Then scroll to the comparison tables, which are grouped as Resources, Environment, Agent, Training, and Hyperparameter Optimization (shown only when HPO is on). Switch from Display All to Display Differences to hide rows where every selected run agrees.
Details: Training settings.
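The Display Differences filter described above amounts to keeping only the rows whose value is not identical across every selected run. A minimal sketch of that rule follows, assuming each run's settings are available as a flat dictionary; `rows_to_display` and the sample runs are hypothetical and do not reflect an actual export format or API.

```python
from typing import Any

_MISSING = object()  # sentinel for a setting that a run does not define


def rows_to_display(runs: dict[str, dict[str, Any]], differences_only: bool) -> list[str]:
    """Return the setting names to show for the selected runs.

    With differences_only=False every setting is listed (Display All).
    With differences_only=True, settings on which all runs agree are hidden.
    """
    # Collect every setting name, preserving first-seen order.
    names: list[str] = []
    for settings in runs.values():
        for name in settings:
            if name not in names:
                names.append(name)

    if not differences_only:
        return names

    shown = []
    for name in names:
        values = [settings.get(name, _MISSING) for settings in runs.values()]
        # Keep the row if any run disagrees with the first one.
        if any(value != values[0] for value in values[1:]):
            shown.append(name)
    return shown


# Hypothetical settings for two selected experiments.
runs = {
    "run-a": {"algorithm": "PPO", "learning_rate": 3e-4, "batch_size": 64},
    "run-b": {"algorithm": "PPO", "learning_rate": 1e-4, "batch_size": 64},
}
differing = rows_to_display(runs, differences_only=True)
```

In this sketch a setting that is present in one run but absent in another counts as a difference, which is one reasonable reading of "every selected run agrees".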