Training

Training settings come from the experiment wizard and show up again on the project Results tab when you compare runs. What you can configure depends on whether the project is Classic RL or Advanced Training.

Classic RL vs Advanced Training

| Step | Classic RL | Advanced Training |
| --- | --- | --- |
| Environment step | Gym environment (built-in or custom) | Dataset or LLM simulation environment |
| Agent step | Classic algorithms (PPO, DQN, …) | LLM, supervised, or simulation algorithms |
| Training step | Hyperparameters for the chosen algorithm | Fields vary by algorithm and dataset |
| Optional HPO step | Evolution and tournament options | Same when enabled |

Classic projects use Classic RL algorithms. Advanced projects use LLM algorithms or Supervised training depending on the dataset or environment you picked.

Wizard steps (in order)

  1. Resources: compute class and node count (Advanced Training often requires a GPU before the Agent step unlocks).

  2. Environment: Gym environment or dataset attachment.

  3. Agent: algorithm, network architecture, algorithm-specific options.

  4. Training: steps, batch sizes, learning rates, and related knobs.

  5. HPO: optional evolutionary search and mutation ranges. See "How evolutionary hyperparameter optimization works" for the population diagram and selection loop.
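As a rough illustration of what the optional HPO step configures, the sketch below runs one generation of tournament selection followed by mutation inside a configured range. It is a generic textbook version, not the product's implementation: the function names, the `lr` range, the fitness values, and the choice to carry the best individual over unchanged are all assumptions made for the example.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Sample k individuals at random and return a copy of the fittest one."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitness[i])
    return dict(population[winner])

def mutate(individual, ranges, rng):
    """Resample one hyperparameter uniformly inside its allowed range."""
    child = dict(individual)
    name = rng.choice(sorted(ranges))
    low, high = ranges[name]
    child[name] = rng.uniform(low, high)
    return child

def next_generation(population, fitness, ranges, k=2, seed=0):
    """Build a new population of the same size (assumed scheme, with elitism)."""
    rng = random.Random(seed)
    # Assumption: the best individual survives unchanged.
    best = max(range(len(population)), key=lambda i: fitness[i])
    new_population = [dict(population[best])]
    while len(new_population) < len(population):
        parent = tournament_select(population, fitness, k, rng)
        new_population.append(mutate(parent, ranges, rng))
    return new_population

# Hypothetical population of four agents that differ only in learning rate.
population = [{"lr": 1e-4}, {"lr": 3e-4}, {"lr": 1e-3}, {"lr": 5e-4}]
fitness = [10.0, 42.0, 17.0, 30.0]   # made-up evaluation scores
ranges = {"lr": (1e-5, 1e-2)}        # made-up mutation range
new_pop = next_generation(population, fitness, ranges)
```

Here the mutation range plays the role of the "mutation ranges" field: mutated values can never leave the interval you configure.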

Changing the algorithm on the Agent step reloads the defaults for that algorithm and environment combination.
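One way to picture this reload is a lookup keyed on the (algorithm, environment) pair. The sketch below is illustrative only: the `DEFAULTS` table, its values, the config layout, and `select_algorithm` are hypothetical, and it assumes the reload overwrites values edited earlier, which this page does not spell out.

```python
from copy import deepcopy

# Hypothetical defaults keyed by (algorithm, environment); the values are made up.
DEFAULTS = {
    ("PPO", "CartPole-v1"): {"learning_rate": 3e-4, "batch_size": 64},
    ("DQN", "CartPole-v1"): {"learning_rate": 1e-3, "batch_size": 32},
}

def select_algorithm(config, algorithm):
    """Return a new config with the algorithm switched and defaults reloaded."""
    new_config = deepcopy(config)
    new_config["agent"]["algorithm"] = algorithm
    key = (algorithm, new_config["environment"]["name"])
    # Assumption: previously edited values are replaced by the defaults.
    new_config["training"] = deepcopy(DEFAULTS[key])
    return new_config

config = {
    "environment": {"name": "CartPole-v1"},
    "agent": {"algorithm": "PPO"},
    "training": {"learning_rate": 1e-5, "batch_size": 256},  # user-edited values
}
switched = select_algorithm(config, "DQN")
```

In this sketch the original `config` is left untouched, so the edited values can still be compared against the reloaded defaults.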

Compare settings across runs

On the project page, open the Results tab. Select two or more experiments with the eye icon, then scroll to the comparison tables, which are grouped as Resources, Environment, Agent, Training, and Hyperparameter Optimization (shown only when HPO is on). Switch from Display All to Display Differences to hide rows where every selected run agrees.
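The Display Differences filter amounts to dropping every row whose values are identical across the selected runs. A minimal sketch of that logic follows, assuming each run's settings are available as a flat dictionary; `comparison_rows` and the sample settings are hypothetical and not part of the product.

```python
def comparison_rows(runs, differences_only=False):
    """Build {setting: [value per run]}, optionally dropping rows where all runs agree."""
    # Collect setting names in first-seen order so the table layout is stable.
    settings = []
    for run in runs:
        for name in run:
            if name not in settings:
                settings.append(name)

    rows = {}
    for name in settings:
        values = [run.get(name) for run in runs]  # None marks a missing setting
        if differences_only and all(v == values[0] for v in values):
            continue  # every selected run agrees, so hide the row
        rows[name] = values
    return rows

# Two made-up runs that differ only in learning rate.
runs = [
    {"algorithm": "PPO", "learning_rate": 3e-4, "batch_size": 64},
    {"algorithm": "PPO", "learning_rate": 1e-3, "batch_size": 64},
]
all_rows = comparison_rows(runs)                          # Display All
diff_rows = comparison_rows(runs, differences_only=True)  # Display Differences
```

With these two runs, `all_rows` keeps all three settings while `diff_rows` keeps only `learning_rate`.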

Details: Training settings.