Experiment wizard overview

New and draft experiments use a six-step wizard. The header shows the title New experiment and a horizontal stepper; the body changes to show the current step as you move through the flow.

Steps (in order)

  1. Resources: nodes and resource class.

  2. Environment: a gym environment (Classic RL), or a gym environment or dataset (Advanced Training).

  3. Agent: algorithm and model or network.

  4. Training: rollout length, batch sizes, and related fields.

  5. HPO: hyperparameter optimization.

  6. Summary: read-only review; Train or Resume here.
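
The step order above can be sketched as an ordered enumeration. This is only an illustrative model of the stepper, not Arena's implementation; the names WizardStep and next_step are hypothetical and do not come from the product.

```python
from enum import IntEnum
from typing import Optional


class WizardStep(IntEnum):
    """Hypothetical model of the six wizard steps, in stepper order."""
    RESOURCES = 1
    ENVIRONMENT = 2
    AGENT = 3
    TRAINING = 4
    HPO = 5
    SUMMARY = 6


def next_step(step: WizardStep) -> Optional[WizardStep]:
    """Return the step that follows `step`, or None when already on Summary."""
    if step is WizardStep.SUMMARY:
        return None
    return WizardStep(step + 1)
```

For example, next_step(WizardStep.TRAINING) gives WizardStep.HPO, and Summary has no following step.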

Classic vs Advanced Training

When you created the project, the Training switcher set the type:

  • Classic RL: gym environment grid, standard Agent form, classic Training accordions.

  • Advanced Training: combined environment grid (datasets and RL Environment rows), Pipelines tab on the project, and LLM-oriented agent and training UI where the flow needs it.

See Projects.

Summary step

On the Summary step for a draft, Train schedules the run. The tooltip notes that training begins shortly once resources are ready, usually within about ten minutes. The status first becomes Pending; when the cluster is busy, allow up to roughly ten minutes for the job to move to Running. A success toast confirms: Experiment {name} scheduled for training…

If the experiment is no longer a Draft, Resume replaces Train, unless the status is Running or Pending. Resume opens a dialog where you pick a checkpoint and name a new experiment.
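
The Train and Resume rule described above can be summarized as a small decision function. This is a minimal sketch, not Arena's code: the function name summary_action is hypothetical, status values are modeled as plain strings, and returning None for Running or Pending is an assumption, since this page does not say what the Summary step shows in those states.

```python
from typing import Optional


def summary_action(status: str) -> Optional[str]:
    """Return the action offered on the Summary step for a given status.

    Hypothetical helper. Only Draft, Pending, and Running are named in
    this guide; every other status is treated as "no longer a Draft".
    """
    if status == "Draft":
        return "Train"
    if status in ("Running", "Pending"):
        # Assumption: neither Train nor Resume is offered here.
        return None
    return "Resume"
```

Under these assumptions, a draft yields Train, a Pending or Running experiment yields no action, and any other status yields Resume.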

After a successful Train from the wizard, Arena returns you to the project Experiments tab and the stepper resets to Resources.

Step guides

  Wizard step    Guide

  Resources      Choosing compute

  Environment    Environment step

  Agent          Agent step

  Training       Training step

  HPO            Hyperparameter search

After training, see Experiment statuses.