Now let's train the model with TRL's `SFTTrainer`! More docs here: TRL SFT docs. We do 60 steps to speed things up, but you can set `num_train_epochs=1` for a full run and disable the step limit with `max_steps=None`. We also support TRL's `DPOTrainer`!
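For concreteness, here is a minimal training sketch. It assumes `model`, `tokenizer`, and a `dataset` with a `"text"` column already exist from earlier steps; the hyperparameters are illustrative, and the keyword arguments follow older TRL releases (newer ones move `dataset_text_field` and `max_seq_length` into `SFTConfig`):

```python
import torch
from transformers import TrainingArguments
from trl import SFTTrainer

# Assumes `model`, `tokenizer`, and `dataset` (with a "text" column)
# were created in earlier cells.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=60,  # quick demo; set num_train_epochs=1 and max_steps=None for a full run
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        optim="adamw_8bit",  # 8-bit AdamW via bitsandbytes
        output_dir="outputs",
    ),
)
trainer.train()
```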
To save the finetuned model, use `push_to_hub` for an online save or `save_pretrained` for a local save.
[NOTE] This ONLY saves the LoRA adapters, and not the full model.
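As a sketch, assuming the trained `model` and `tokenizer` from above (`"lora_model"` and the Hub repo name are placeholders):

```python
# Local save: writes only the LoRA adapter weights and config.
model.save_pretrained("lora_model")
tokenizer.save_pretrained("lora_model")

# Online save: pushes the adapters to the Hugging Face Hub
# (requires being logged in; the repo name is a placeholder).
model.push_to_hub("your-username/lora_model")
tokenizer.push_to_hub("your-username/lora_model")
```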
To load the LoRA adapters we just saved for inference, change `False` to `True`:
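One way to do this with PEFT, as a sketch (the 4-bit flag is optional and depends on your transformers/bitsandbytes versions):

```python
if False:  # change False to True to run this cell
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # Loads the base model and applies the saved LoRA adapters on top.
    model = AutoPeftModelForCausalLM.from_pretrained(
        "lora_model",       # the local directory or Hub repo saved above
        load_in_4bit=True,  # optional 4-bit loading to reduce memory
    )
    tokenizer = AutoTokenizer.from_pretrained("lora_model")
```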