Results
Open the Results tab on a project to compare experiments in that project. Select runs on the left, then read charts and setting comparisons on the right. Saved checkpoints stay on the Experiments tab (expand a row). See Checkpoints.
Tab order for Advanced Training projects: Experiments, Pipelines, Results. Classic RL projects have no Pipelines tab.
Layout
A horizontal split:
| Panel | Contents |
|---|---|
| Left (~30%) | All experiments header, refresh control, Running and Completed lists, pagination when needed |
| Right (~70%) | Metric charts and training-setting comparison for selected runs |
Until you select at least one run with the eye icon, the right side shows the message "Make a selection by clicking on the eye icon to view Experiment Metrics."
Running vs completed
| Section | Statuses |
|---|---|
| Running | Running, Stopping, Pending |
| Completed | Succeeded, Stopped, Failed |
When a run you had selected under Running finishes, the UI moves that selection to Completed so charts do not drop it.
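The grouping and the selection hand-off described above can be sketched as follows. This is only an illustration of the behavior, not Arena's implementation: the status strings come from the table, while `section_for`, `on_status_change`, and the `selected` dictionary are hypothetical names.

```python
# Status strings are taken from the table above; every other name is hypothetical.
RUNNING_STATUSES = {"Running", "Stopping", "Pending"}
COMPLETED_STATUSES = {"Succeeded", "Stopped", "Failed"}


def section_for(status: str) -> str:
    """Return the list ("Running" or "Completed") that a status belongs to."""
    if status in RUNNING_STATUSES:
        return "Running"
    if status in COMPLETED_STATUSES:
        return "Completed"
    raise ValueError(f"unknown status: {status!r}")


def on_status_change(selected: dict[str, set[str]], run_id: str, new_status: str) -> None:
    """Carry a chart selection from Running to Completed when the run finishes."""
    if section_for(new_status) == "Completed" and run_id in selected["Running"]:
        selected["Running"].remove(run_id)
        selected["Completed"].add(run_id)


# Example: exp-1 was selected while running and has now succeeded.
selected = {"Running": {"exp-1"}, "Completed": set()}
on_status_change(selected, "exp-1", "Succeeded")
```

After the call, `exp-1` is selected under Completed, so its series stay on the charts.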
Row actions:
- Running: Stop training, Update mutation parameters (when HPO applies), View logs
- Completed: View logs (eye icon still toggles chart selection)
View logs opens the experiment log view for that run.
Selecting runs for charts
Click the eye icon to add or remove an experiment from the chart set. Selections are remembered per project separately for Running and Completed.
Charts use algorithm defaults plus network-specific series. Supervised runs split fitness charts by configured fitness metric (accuracy, mean IoU, MSE, and similar). Object detection runs chart pixel accuracy and validation pixel accuracy (not a generic accuracy series). LatentPPO charts each configured reward component (mean_iou, dice, ce, boundary) and labels training score as composite reward. When selected runs need different fitness groupings, Arena shows more than one fitness chart.
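One way to picture the "more than one fitness chart" case is to group the selected runs by their configured fitness metric and draw one chart per group. The sketch below assumes that model. `fitness_chart_groups` and the run IDs are hypothetical, and the real grouping rules may be more involved.

```python
from collections import defaultdict


def fitness_chart_groups(run_metrics: dict[str, str]) -> dict[str, list[str]]:
    """Group run IDs by configured fitness metric (hypothetical helper).

    Each key in the result would correspond to one fitness chart.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for run_id, metric in run_metrics.items():
        groups[metric].append(run_id)
    return dict(groups)


# Two runs report accuracy and one reports mean IoU, so two charts are needed.
charts = fitness_chart_groups(
    {"exp-1": "accuracy", "exp-2": "mean_iou", "exp-3": "accuracy"}
)
```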
The right-hand chart area has accordions:
| Section | Contents |
|---|---|
| Custom Charts | Build your own plots (pick metrics, line vs other types, Steps or Time on the X axis) |
| Default Training Charts | Score, fitness, losses, episode time, steps (presets per algorithm) |
| Hyperparameter and Network Optimization Charts | How mutations and architectures evolve when HPO is on (see How evolutionary hyperparameter optimization works) |
Toggle Steps vs Time on the X axis for default training charts (top of that section). Custom charts use the same choice in the chart builder.
One selected run: charts can show each agent in the population (legend entries per agent). Click a legend label to hide or show a series.
Several selected runs: charts aggregate to population level (averages, best fitness, and similar) so you can compare algorithms or environments side by side.
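To make the population-level view concrete, the sketch below collapses per-agent fitness series into a per-step mean and best value. It illustrates the idea and is not Arena's aggregation code: `aggregate_population` is a hypothetical name, higher fitness is assumed to be better, and series are assumed to be aligned by step.

```python
from statistics import mean


def aggregate_population(agent_series: dict[str, list[float]]) -> dict[str, list[float]]:
    """Reduce per-agent series to population mean and best value per step.

    Assumes higher fitness is better and that index i is the same step for
    every agent; zip() truncates to the shortest series.
    """
    per_step = list(zip(*agent_series.values()))
    return {
        "mean": [mean(values) for values in per_step],
        "best": [max(values) for values in per_step],
    }


# Two agents over two evaluation steps.
summary = aggregate_population({"agent_0": [1.0, 2.0], "agent_1": [3.0, 2.0]})
```

With one run selected, the chart would plot the `agent_0` and `agent_1` series directly. With several runs selected, each run contributes summary series like these instead.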
Plots appear only after a run has produced metrics. Brand-new experiments with no data yet stay empty until training progresses.
While at least one run is selected under Running, the tab refreshes experiment data about every five seconds. Polling pauses if you open the mutation-parameter dialog or clear all Running selections.
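The refresh rule amounts to a small predicate. The sketch below restates the two pause conditions from the text. The function and constant names are hypothetical, and the five-second value is the approximate interval mentioned above.

```python
from typing import Optional

POLL_INTERVAL_SECONDS = 5  # "about every five seconds" per the text


def should_poll(running_selected: set[str], mutation_dialog_open: bool) -> bool:
    """Poll only while a Running run is selected and the dialog is closed."""
    return bool(running_selected) and not mutation_dialog_open


def next_poll_delay(running_selected: set[str], mutation_dialog_open: bool) -> Optional[int]:
    """Return seconds until the next refresh, or None while polling is paused."""
    if should_poll(running_selected, mutation_dialog_open):
        return POLL_INTERVAL_SECONDS
    return None
```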
Training setting comparison
Below the charts, grouped tables mirror the wizard: Resources, Environment, Agent, Training, and Hyperparameter Optimization when any selected run used HPO. Toggle Display All vs Display Differences to focus on mismatches. Full detail: Training settings.
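Display Differences can be thought of as a filter that keeps only the settings whose values are not identical across the selected runs. The sketch below shows that idea under simple assumptions: settings are flat key-value pairs, and a setting missing from a run counts as a difference. `differing_settings`, `MISSING`, and the sample setting names are hypothetical.

```python
MISSING = "<not set>"  # hypothetical placeholder for a setting a run lacks


def differing_settings(runs: dict[str, dict[str, object]]) -> dict[str, dict[str, object]]:
    """Return {setting: {run_id: value}} for settings that differ between runs."""
    all_keys = sorted(set().union(*(settings.keys() for settings in runs.values())))
    differences: dict[str, dict[str, object]] = {}
    for key in all_keys:
        per_run = {run_id: settings.get(key, MISSING) for run_id, settings in runs.items()}
        values = list(per_run.values())
        # Keep the row only if at least one run disagrees with the first.
        if any(value != values[0] for value in values[1:]):
            differences[key] = per_run
    return differences


# Only batch_size differs, so it is the only row that would remain visible.
diff = differing_settings(
    {
        "exp-1": {"batch_size": 64, "learning_rate": 0.001},
        "exp-2": {"batch_size": 128, "learning_rate": 0.001},
    }
)
```

Display All corresponds to skipping the filter and showing every key.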
Focusing one experiment
You can land on Results with one experiment highlighted (for example after Train, halt, or from a pipeline stage’s metrics control). Arena selects that row, then clears the highlight after about a second.