Training settings
Arena builds each run from the experiment wizard. You fill in the Resources, Environment, Agent, Training, and optional HPO steps; Arena stores that bundle on the experiment and uses it when you train.
What each wizard block covers
| Wizard step | Comparison section on Results | What it holds |
|---|---|---|
| Resources | Resources | Compute class, node count, resource identifiers |
| Environment | Environment | Gym name, dataset link, simulation or prompting fields |
| Agent | Agent | Algorithm name, network architecture, algorithm-specific hyperparameters |
| Training | Training | Steps, batch sizes, learning rates, replay buffer when the algorithm needs it |
| HPO | Hyperparameter Optimization | Mutation probabilities, tournament selection, evolution ranges on training fields |
DQN runs can surface epsilon-related training rows in the comparison table. Off-policy classic algorithms include replay buffer rows; on-policy algorithms omit them.
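The rule above about which training rows appear can be sketched in a few lines of Python. This is an illustrative sketch only, not Arena's implementation: the row names (`replay_buffer_size`, `epsilon_start`, and so on), the `training_rows` helper, and the algorithm groupings are assumptions made for the example.

```python
# Hypothetical sketch: which Training rows a comparison table might show.
# Row names and groupings are assumptions, not Arena's actual schema.

OFF_POLICY = {"DQN", "DDPG", "TD3"}  # algorithms that learn from a replay buffer
BASE_ROWS = ["max_steps", "batch_size", "learning_rate"]


def training_rows(algorithm: str) -> list[str]:
    """Return the Training rows to display for the given algorithm name."""
    rows = list(BASE_ROWS)
    if algorithm in OFF_POLICY:
        # Off-policy algorithms include replay buffer rows.
        rows.append("replay_buffer_size")
    if algorithm == "DQN":
        # DQN can additionally surface epsilon-related rows.
        rows += ["epsilon_start", "epsilon_end", "epsilon_decay"]
    return rows


if __name__ == "__main__":
    for name in ("DQN", "TD3", "PPO"):
        print(name, training_rows(name))
```

Under these assumptions an on-policy algorithm such as PPO gets only the base rows, while DQN gets both the replay buffer row and the epsilon rows.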
Defaults when you change selections
Picking a new algorithm, environment, or dataset in the wizard reloads the defaults for that combination. Switching the algorithm overwrites any edits you have already made, so settle these selections before you tune training numbers.
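To see why the order of choices matters, here is a minimal sketch of the reload behavior. The `WizardState` class, the default values, and the environment name are all hypothetical and chosen only for illustration; Arena's real defaults and internals may differ.

```python
# Hypothetical sketch: a new selection reloads defaults and drops earlier edits.
from copy import deepcopy

# Made-up defaults keyed by (algorithm, environment); not Arena's real values.
HYPOTHETICAL_DEFAULTS = {
    ("DQN", "CartPole-v1"): {"learning_rate": 1e-3, "batch_size": 64},
    ("PPO", "CartPole-v1"): {"learning_rate": 3e-4, "batch_size": 128},
}


class WizardState:
    def __init__(self, algorithm: str, environment: str) -> None:
        self.select(algorithm, environment)

    def select(self, algorithm: str, environment: str) -> None:
        """Change the selection and reload defaults for the new combination."""
        self.algorithm = algorithm
        self.environment = environment
        # Copy so later edits never mutate the shared defaults table.
        self.training = deepcopy(HYPOTHETICAL_DEFAULTS[(algorithm, environment)])

    def edit(self, field: str, value: float) -> None:
        """Override a single training field."""
        self.training[field] = value


if __name__ == "__main__":
    state = WizardState("DQN", "CartPole-v1")
    state.edit("batch_size", 256)        # manual tuning
    state.select("PPO", "CartPole-v1")   # switching algorithm reloads defaults
    print(state.training)                # the edit to batch_size is gone
```

In this sketch the manual `batch_size` edit is lost as soon as the algorithm changes, which is the situation the advice above helps you avoid.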
Compare runs on Results
1. Open the project Results tab.
2. Click the eye icon on at least two experiments in Running or Completed.
3. Scroll below the charts to the comparison card.
4. Use Display All or Display Differences. Differences mode hides rows where every selected run matches; if nothing differs, you see No differences between selected runs.
Select at least two experiments before you flip to Display Differences; otherwise the UI warns you to pick more runs.
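The Display Differences behavior described above can be approximated as follows. This is a sketch under stated assumptions rather than Arena's code: the `comparison_rows` function, the flat setting-to-value dictionaries, and the wording of the too-few-runs warning are all hypothetical. A setting missing from a run (for example, a replay buffer row on an on-policy run) is treated as `None`.

```python
# Hypothetical sketch of Display All versus Display Differences.
from typing import Any, Optional

NO_DIFFERENCES = "No differences between selected runs"
TOO_FEW_RUNS = "Select at least two experiments to compare"  # placeholder wording


def comparison_rows(
    runs: dict[str, dict[str, Any]], differences_only: bool = False
) -> tuple[dict[str, dict[str, Any]], Optional[str]]:
    """Return (rows, message); rows maps each setting to {run name: value}."""
    if differences_only and len(runs) < 2:
        return {}, TOO_FEW_RUNS

    # Collect every setting name, preserving first-seen order.
    keys: list[str] = []
    for settings in runs.values():
        for key in settings:
            if key not in keys:
                keys.append(key)

    rows: dict[str, dict[str, Any]] = {}
    for key in keys:
        values = {name: settings.get(key) for name, settings in runs.items()}
        first = next(iter(values.values()))
        all_match = all(value == first for value in values.values())
        if differences_only and all_match:
            continue  # hide rows where every selected run matches
        rows[key] = values

    if differences_only and not rows:
        return {}, NO_DIFFERENCES
    return rows, None


if __name__ == "__main__":
    selected = {
        "run-a": {"learning_rate": 0.001, "batch_size": 64},
        "run-b": {"learning_rate": 0.001, "batch_size": 128},
    }
    print(comparison_rows(selected, differences_only=True))
```

With the two example runs, only `batch_size` survives the filter because `learning_rate` is identical in both.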
What the UI does not check for you
Training from the wizard does not, by itself, confirm credit balance, queue capacity, or that every file in a custom environment upload is present. Problems in those areas surface as errors when you start the job or while it runs. Check the validation messages on each wizard step before you submit.