On-prem compute

Enterprise only

On-prem compute is available to organizations on an Enterprise plan. Without that plan, the On-prem training cluster menu item is hidden. See Plan permissions.

Enterprise organizations can run training on hardware they operate: a private data centre, a GPU rack, or a Kubernetes cluster in their own VPC. Arena still hosts the UI, experiment state, and scheduling; jobs execute on your machines once you connect them using a setup bundle downloaded from Arena.

On-prem classes appear on the same experiment Resources step as cloud classes.

Who can use it

| Who | Requirement | What they can do |
| --- | --- | --- |
| Manager on an Enterprise org | Profile menu → On-prem training cluster | Enable the provider, add and edit resource classes, download setup bundles, copy install commands |
| Any org member on Enterprise | Provider enabled and at least one class enabled | Pick on-prem classes when building or editing experiments |

Organizations that are not on Enterprise do not see On-prem training cluster in the profile menu. Org members without the Manager role do not have access to the training cluster page; they should ask a Manager to configure on-prem for the org.

Moving an org to Enterprise is a contract or billing change (see Credits and plans), not a self-serve toggle.
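The access rules in this section can be summarised in a few lines of Python. This is only an illustrative sketch of the logic described above, not Arena's implementation or API: the `Org` type, the plan and role strings, and both function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Org:
    """Hypothetical model of an organization; not an Arena data type."""
    plan: str                       # assumed plan label, e.g. "enterprise"
    provider_enabled: bool = False  # the org-wide on-prem provider switch
    enabled_classes: List[str] = field(default_factory=list)


def can_configure_onprem(org: Org, role: str) -> bool:
    """Only a Manager on an Enterprise org sees On-prem training cluster."""
    return org.plan == "enterprise" and role == "manager"


def selectable_onprem_classes(org: Org) -> List[str]:
    """Classes any member can pick on the experiment Resources step.

    Requires Enterprise, the provider enabled, and at least one enabled class;
    otherwise nothing on-prem is offered.
    """
    if org.plan != "enterprise" or not org.provider_enabled:
        return []
    return list(org.enabled_classes)
```

For example, an Enterprise org with the provider enabled and one enabled class named `nvidia-l4` would offer that class to every member, while only a Manager could change its configuration.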

Prerequisites

Before you connect hardware:

  • An Enterprise organization with a Manager who can open On-prem training cluster

  • Arena CLI installed if you plan to run install from your laptop (pip install agilerl, then arena login)

  • Outbound network from your cluster to Arena (for the private tunnel Arena sets up in the install bundle)

  • Hardware that matches the class you define: for Advanced Training projects, worker classes need at least one GPU per worker node

You install with either Docker Swarm or Helm on Kubernetes. Pick one path per cluster, not both, and meet only the requirements for the path you choose:

| Path | You need |
| --- | --- |
| Docker Swarm | A Swarm manager host and worker hosts reachable by SSH from where you run install |
| Helm | A Kubernetes cluster and a working local kubectl context |
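Before running install, it can help to confirm that the machine you install from has the command-line tools your chosen path relies on. The sketch below is a hedged example, not part of Arena: the tool list per path is an assumption (the requirements above only name SSH access for Swarm and a working kubectl context for Helm), and `missing_tools` is a hypothetical helper.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Assumed local tools per install path. Adjust to whatever your install
# actually invokes; this mapping is a guess, not taken from Arena.
REQUIRED_TOOLS: Dict[str, List[str]] = {
    "swarm": ["ssh"],
    "helm": ["kubectl", "helm"],
}


def missing_tools(
    path: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the assumed tools for `path` that are not found on PATH."""
    if path not in REQUIRED_TOOLS:
        raise ValueError(f"unknown install path: {path!r}")
    return [tool for tool in REQUIRED_TOOLS[path] if which(tool) is None]
```

Calling `missing_tools("helm")` returns an empty list when both tools are on PATH. It only checks that the binaries exist; it does not verify that the kubectl context points at the right cluster or that the Swarm hosts accept your SSH key.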

Multi-worker pools often need shared storage (typically NFS) so that the Ray head and workers see the same paths. The download bundle does not set this up for you; see Shared storage in the install guide.

Configure in Arena

  1. Open profile menu → On-prem training cluster.

  2. Click Enable on-prem (one provider per organization).

  3. Click Add resource class and set Name, worker CPU/GPU/memory, and Number of nodes (see Resource classes).

You can add many resource classes on the same org. Each one is a separate worker pool with its own install bundle and its own Config panel, and each is installed separately on your hardware. For example, you might define one class for an NVIDIA L4 rack and another for A100 machines, using Name (and the optional Description) to label the hardware (nvidia-l4, nvidia-a100). Arena has no GPU model picker for on-prem; set CPUs, GPUs (count), and Memory per worker to match those machines, then run install separately for each class row.
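To make the two-class example concrete, here is a small Python sketch of how you might record the planned classes and sanity-check them before entering them in the UI. Everything here is hypothetical: `ResourceClass` and `validate` are illustration-only names, not an Arena API, and the CPU, GPU, memory, and node figures are placeholders rather than real specifications for those machines.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ResourceClass:
    """Hypothetical record mirroring the fields of the Add resource class form."""
    name: str
    cpus: int        # CPUs per worker
    gpus: int        # GPU count per worker (there is no GPU model field)
    memory_gb: int   # memory per worker
    nodes: int       # number of nodes
    description: str = ""


def validate(rc: ResourceClass, advanced_training: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the class looks sane."""
    problems = []
    if rc.cpus < 1:
        problems.append("cpus must be at least 1")
    if rc.memory_gb < 1:
        problems.append("memory_gb must be at least 1")
    if rc.nodes < 1:
        problems.append("nodes must be at least 1")
    if rc.gpus < 0:
        problems.append("gpus cannot be negative")
    # Advanced Training projects need at least one GPU per worker node.
    if advanced_training and rc.gpus < 1:
        problems.append("Advanced Training needs at least one GPU per worker")
    return problems


# Placeholder numbers only; set them to match your own machines.
l4 = ResourceClass("nvidia-l4", cpus=8, gpus=1, memory_gb=32, nodes=2)
a100 = ResourceClass("nvidia-a100", cpus=32, gpus=4, memory_gb=256, nodes=1)
```

Because the GPU model lives only in the name and description, keeping a record like this alongside your hardware inventory makes it easier to check that each class row matches the machines you run install on.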

Disabling the provider hides on-prem classes from new experiment runs until you enable it again. Existing class definitions stay in the table.

Install on your hardware

Arena generates the connection settings in each class’s setup bundle, so you do not configure AgileRL’s gateway manually.

Pick one path:

| Path | When to use | Prerequisite |
| --- | --- | --- |
| UI + Config panel | You created the class in the table and want copy-paste commands | Provider enabled, class saved; expand Config on the row |
| Arena CLI from terminal | Automate install from your laptop | CLI installed (pip install agilerl), arena login, class name in the command (the CLI can enable the provider and create the class if missing; see Install a cluster) |
| Download setup (.tar) | Air-gapped hosts or no Arena CLI on your machine | Same as UI path: class must exist; extract on a bastion and run ./setup.sh |

Re-run install or re-download when the class row shows Update recommended.

Full steps: Install a cluster.

After install

  1. Confirm Setup bundle shows Current on the class row.

  2. Start or edit an experiment and pick your on-prem class on the Resources step.

More detail