Train an LLM

Create an experiment on a preference dataset in an Advanced Training project, configure DPO (Direct Preference Optimization), and start training.

Prerequisites

1. Open the project

On Experiments, open your Advanced Training project.

Figure: The Advanced Training project page, showing the experiments table.

2. New experiment

Click New experiment, name it, and save to open the wizard.

Figure: Creating a new experiment from the New experiment dialog.

3. Environment

On the wizard’s Environment step, select your preference dataset. The preview shows Prompt, Chosen, and Rejected columns.

Figure: Selecting a preference dataset in the wizard.
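Before uploading, it can help to check that every row has the three fields the preview displays. The following is a minimal sketch, assuming rows are available as Python dictionaries with lowercase keys `prompt`, `chosen`, and `rejected`; the key names and the helper `validate_preference_rows` are illustrative assumptions, not part of the product.

```python
# Assumed column keys; the wizard preview labels them Prompt, Chosen, Rejected.
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def validate_preference_rows(rows):
    """Return (row_index, problem) tuples for rows that look unusable."""
    problems = []
    for i, row in enumerate(rows):
        # Every required column must be a non-empty string.
        for col in REQUIRED_COLUMNS:
            value = row.get(col)
            if not isinstance(value, str) or not value.strip():
                problems.append((i, f"missing or empty '{col}'"))
        # A pair with identical responses carries no preference signal.
        if row.get("chosen") is not None and row.get("chosen") == row.get("rejected"):
            problems.append((i, "chosen and rejected are identical"))
    return problems


# Hypothetical example rows, for illustration only.
rows = [
    {"prompt": "Summarize: The cat sat on the mat.",
     "chosen": "A cat sat on a mat.",
     "rejected": "Dogs are great."},
    {"prompt": "Translate 'hello' to French.",
     "chosen": "bonjour",
     "rejected": "bonjour"},
    {"prompt": "What is 2 + 2?", "chosen": "", "rejected": "5"},
]

problems = validate_preference_rows(rows)
for index, problem in problems:
    print(f"row {index}: {problem}")
```

Here the first row passes, the second is flagged because both responses are the same, and the third is flagged for an empty chosen response.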

4. Agent

Choose DPO and tune hyperparameters if needed. Click Next.

Figure: The Agent step with DPO selected for preference training.
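To give a sense of what DPO optimizes, here is a minimal sketch of the standard per-example DPO loss in plain Python. It assumes you already have summed log-probabilities of the chosen and rejected responses under the model being trained and under a frozen reference model. The function name and the `beta` default are illustrative assumptions; the hyperparameter names and defaults shown in the wizard may differ.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair: -log(sigmoid(margin))."""
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # beta scales how strongly deviations from the reference are weighted.
    margin = beta * (chosen_logratio - rejected_logratio)
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss is log(2).
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)
# Policy shifted toward the chosen response: the loss drops.
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)
print(baseline, improved)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model. A larger `beta` makes the loss more sensitive to that gap.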

5. Training

Set the number of training steps and related options, then continue to the HPO (hyperparameter optimization) or Summary step, as shown in the wizard.

Figure: The Training step of the wizard.

6. Summary

Review the configuration on Summary.

Figure: The Summary step before starting the run.

7. Running experiment

Click Train. The experiment's row in the experiments table shows its status while the job runs.

Figure: The experiments table with an experiment in an active training status.