Train an LLM

Create an experiment on a preference dataset in an Advanced Training project, configure DPO (Direct Preference Optimization), and start training.

Prerequisites

An Advanced Training project with a preference dataset (prompt, chosen, and rejected columns).
If you still need to upload data, see Preference datasets.
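
Before uploading, it can help to check locally that every row has the three expected fields. The following is a minimal sketch, not part of the product: it assumes the data is a CSV file with lowercase `prompt`, `chosen`, and `rejected` headers, and the helper name `validate_preference_rows` is hypothetical. Adjust the column names and file format to whatever your upload actually uses.

```python
import csv
import io

# Assumed column names; the product may expect different casing or a different format.
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def validate_preference_rows(rows):
    """Return a list of (row_index, problem) tuples; an empty list means no issues were found."""
    problems = []
    for i, row in enumerate(rows):
        # Every required column must be present and non-blank.
        for col in REQUIRED_COLUMNS:
            value = row.get(col)
            if value is None or not str(value).strip():
                problems.append((i, f"missing or empty '{col}'"))
        # A preference pair carries no signal if both responses are the same.
        if row.get("chosen") is not None and row.get("chosen") == row.get("rejected"):
            problems.append((i, "chosen and rejected are identical"))
    return problems


# Illustrative data only; the second row has an empty 'rejected' field.
SAMPLE_CSV = """prompt,chosen,rejected
What is 2 + 2?,4,5
Name a primary color.,Red,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
problems = validate_preference_rows(rows)
for index, problem in problems:
    print(f"row {index}: {problem}")
```

To check a real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle.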

On Experiments, open your Advanced Training project.

Click New experiment, name it, and save to open the wizard.

On the wizard’s Environment step, select your preference dataset. The preview shows Prompt, Chosen, and Rejected columns.

Choose DPO and tune hyperparameters if needed. Click Next.

Set the number of training steps and related options, then continue to the HPO (hyperparameter optimization) step or to Summary, whichever the wizard shows next.

Review the configuration on Summary.

Click Train. The experiment row shows status while the job runs.
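
As background for the DPO step above, the sketch below computes the standard DPO loss for a single preference pair, as defined in the original DPO formulation. It is an illustration only: this page does not describe how the product implements DPO or which hyperparameters the wizard exposes, and the function name `dpo_loss` and the default `beta=0.1` are assumptions chosen for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (prompt, chosen, rejected) triple.

    Each argument is the summed log-probability of a response under either
    the policy being trained or the frozen reference model. beta scales how
    strongly the policy is pushed away from the reference (assumed default).
    """
    # How much more the policy prefers each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) equals log(1 + exp(-x)); computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the policy matches the reference, the loss is log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)

# When the policy favors the chosen response more than the reference does, the loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)

print(baseline, improved)
```

The loss falls as the policy raises the likelihood of chosen responses relative to rejected ones, measured against the reference model, which is why the dataset needs both a chosen and a rejected response for every prompt.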