On-prem resource classes

Enterprise only

On-prem resource classes require an Enterprise plan and an enabled on-prem provider. See Plan permissions.

A resource class names a compute shape: CPU, GPU, and memory per worker, plus how many workers the generated install bundle should run. On-prem classes are yours to define; Arena does not look up cloud instance types for them.

Classes hang off your org’s on-prem provider. Add and edit them on the Training cluster page after the provider is enabled.

Multiple classes for different hardware

An org has a single on-prem provider, but you can add many resource classes. Each class targets its own machines: you create it with its own Add resource class action, install it from its own Config panel, and select it as its own entry on the experiment Resources step.

Arena does not offer a GPU model field for on-prem. Use Name to label the hardware so your team can tell pools apart in the table and experiment picker, for example nvidia-l4-east and nvidia-a100-west. The optional Description can hold extra detail. Set CPUs, GPUs (count per worker), and Memory to match the workers in that pool, and install each class only on hosts that fit those numbers.
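As a rough illustration of "install each class only on hosts that fit those numbers", the sketch below checks a host against two class shapes before you roll out a bundle. Everything here is hypothetical: the `Shape` type, the `host_fits` helper, and the hardware numbers are not part of Arena. It also assumes that "fit" means the host has at least the declared resources, and that memory is tracked as a number of GB rather than the free text the form accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Per-worker compute shape (hypothetical type, not an Arena API)."""
    cpus: int
    gpus: int
    memory_gb: int


def host_fits(host: Shape, required: Shape) -> bool:
    # Assumption: a host "fits" when it has at least the declared resources.
    return (
        host.cpus >= required.cpus
        and host.gpus >= required.gpus
        and host.memory_gb >= required.memory_gb
    )


# Illustrative numbers only; use the values you entered for each class.
classes = {
    "nvidia-l4-east": Shape(cpus=16, gpus=1, memory_gb=64),
    "nvidia-a100-west": Shape(cpus=64, gpus=8, memory_gb=512),
}

host = Shape(cpus=32, gpus=1, memory_gb=128)
eligible = [name for name, shape in classes.items() if host_fits(host, shape)]
print(eligible)  # ['nvidia-l4-east']
```

A check like this can run in your own provisioning scripts before you install a class's bundle on a machine.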

Fields when adding or editing

| Field | Required | Notes |
| --- | --- | --- |
| Name | Yes | Shown in the class table and experiment picker; use this to label the hardware pool (GPU model, site, rack) |
| Enabled | No (default on) | Disabled classes stay listed but are not offered on new runs |
| Description | No | Optional short text (e.g. rack location or GPU SKU) |
| Number of nodes | Yes (≥ 1) | Drives worker replica count in the generated stack file or Helm chart |
| CPUs | No | Per worker node |
| GPUs | No | Per worker node |
| Memory | No | Free text, e.g. 32 GB (per worker) |

The form groups CPU, GPU, and memory under Compute resource (per worker node). These values are what each worker requests in the deployment bundle; they do not describe the head node.

Validation in the UI: node count must be at least 1; name cannot be empty on save.
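If you prepare class definitions in a script before entering them, the two UI rules above can be mirrored with a small pre-check. This is a minimal sketch, not Arena code: the function name and error messages are made up, and treating a whitespace-only name as empty is an assumption.

```python
from typing import List


def validate_resource_class(name: str, num_nodes: int) -> List[str]:
    """Return a list of problems; an empty list means the input passes.

    Hypothetical helper that mirrors the two documented UI rules.
    """
    errors = []
    # Assumption: a name made only of whitespace counts as empty.
    if not name.strip():
        errors.append("Name cannot be empty")
    if num_nodes < 1:
        errors.append("Number of nodes must be at least 1")
    return errors


print(validate_resource_class("gpu-pool", 2))  # []
print(validate_resource_class("", 0))
```

The second call reports both problems at once, which is usually friendlier than stopping at the first failure.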

Table columns and actions

Each row shows Name, Number of nodes, Number of CPU of worker node, Number of GPU of worker node, Memory of worker node, Status, Setup bundle, Deployment (Config), and Actions.

  • Enable / Disable — in the Status column; flips whether the class is available for experiments without deleting it.

  • Edit — opens the same fields as add.

  • Delete — permanent; the confirmation dialog shows the class Name.

  • Config — expands the deployment panel (README, stack files, download). See Training cluster.

Setup bundle column

| UI label | Meaning |
| --- | --- |
| Current | Generated deployment bundle matches the platform’s current on-prem image |
| Update recommended | Platform image changed; download a fresh .tar and roll out on your cluster |

When the column shows Update recommended, re-download the bundle or re-run the install so that workers run the current Arena train image.

Using a class in an experiment

On the experiment wizard Resources step (first step):

  1. Choose Number of nodes (1, 2, 4, 8, 16, or 32). This is the run’s node count for orchestration and (for cloud classes) cost display.

  2. Pick a Resource class radio entry.

On-prem classes appear when:

  • Your org’s on-prem provider is enabled, and

  • The class itself is enabled.

They sort ahead of cloud classes in the list. Price shows as zero credits.

Advanced Training projects and dataset experiments list only classes with at least one GPU per worker. CPU-only on-prem classes appear on classic RL projects but not on Advanced Training.
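The visibility rules above can be summarized as a filter. The sketch below is an illustration under stated assumptions, not how Arena implements the picker: `OnPremClass`, `offered_classes`, and the `cpu-pool` name are hypothetical, and a class with no GPU value entered is treated as having zero GPUs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OnPremClass:
    """Minimal stand-in for a resource class row (hypothetical type)."""
    name: str
    enabled: bool = True
    gpus: Optional[int] = None  # GPUs per worker; optional in the form


def offered_classes(
    classes: List[OnPremClass],
    provider_enabled: bool,
    advanced_training: bool,
) -> List[OnPremClass]:
    # No on-prem class is offered while the provider is disabled.
    if not provider_enabled:
        return []
    offered = [c for c in classes if c.enabled]
    if advanced_training:
        # Assumption: a missing GPU count is treated as zero.
        offered = [c for c in offered if (c.gpus or 0) >= 1]
    return offered


pool = [
    OnPremClass("nvidia-a100-west", enabled=True, gpus=8),
    OnPremClass("cpu-pool", enabled=True, gpus=0),
    OnPremClass("nvidia-l4-east", enabled=False, gpus=1),
]

print([c.name for c in offered_classes(pool, True, advanced_training=True)])
# ['nvidia-a100-west']
print([c.name for c in offered_classes(pool, True, advanced_training=False)])
# ['nvidia-a100-west', 'cpu-pool']
```

Here the disabled class never appears, and the CPU-only class drops out only when the project is Advanced Training.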

Class node count vs wizard node count

Example: the class gpu-pool has Number of nodes set to 2 (two workers in the install bundle). On the Resources step, you might still choose 4 nodes for the run’s orchestration. Size the class to match the physical GPU machines in your swarm or cluster, and ask your Manager whether the wizard node count should match the class.

If nothing on-prem appears, check that the provider is enabled, at least one class exists and is enabled, the org is on Enterprise, and (for Advanced Training) the class has GPUs.