Scripts · Mar 31, 2026 · 2 min read

TRL — Post-Training LLMs with RLHF & DPO

TRL is a Hugging Face library for post-training foundation models with SFT, GRPO, DPO, and reward modeling. It scales from a single GPU to multi-node clusters. 17.9K+ GitHub stars, Apache 2.0 licensed.

Introduction

TRL (Transformers Reinforcement Learning) is a Hugging Face library for post-training foundation models using techniques like Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), and reward modeling. With 17,900+ GitHub stars and an Apache 2.0 license, TRL scales from a single GPU to multi-node clusters via Accelerate and DeepSpeed. It includes a CLI for quick fine-tuning without code, and integrates with PEFT for efficient training on large models.

Best for: ML engineers fine-tuning LLMs with human preference data or custom datasets
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Ecosystem: Hugging Face Transformers, Accelerate, DeepSpeed, PEFT


Key Features

  • Multiple trainers: SFTTrainer, GRPOTrainer, DPOTrainer, RewardTrainer, PPOTrainer
  • CLI interface: Fine-tune models without writing code
  • Scalable: Single GPU to multi-node clusters via Accelerate and DeepSpeed
  • PEFT integration: LoRA and QLoRA for efficient training on large models
  • Built on Transformers: Full compatibility with Hugging Face ecosystem
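
Each trainer expects a particular dataset shape. As a minimal sketch, the two most common formats look like this (the column names `messages`, `prompt`, `chosen`, and `rejected` follow TRL's documented dataset formats; the example values are illustrative):

```python
# SFTTrainer accepts conversational data: a "messages" column of chat turns.
def to_sft_example(user_msg: str, assistant_msg: str) -> dict:
    return {
        "messages": [
            {"role": "user", "content": user_msg},
            {"role": "assistant", "content": assistant_msg},
        ]
    }

# DPOTrainer expects preference pairs: a prompt plus a chosen and a
# rejected completion, from which the preference signal is derived.
def to_dpo_example(prompt: str, chosen: str, rejected: str) -> dict:
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

sft_row = to_sft_example("What is TRL?", "A library for post-training LLMs.")
dpo_row = to_dpo_example("Summarize TRL.", "Concise summary.", "Off-topic reply.")
```

Datasets on the Hugging Face Hub that already use these column names can be passed to the corresponding trainer without extra preprocessing.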

FAQ

Q: What is TRL? A: TRL is a Hugging Face library with 17.9K+ stars for post-training LLMs using SFT, DPO, GRPO, and reward modeling. It scales from single GPU to multi-node and includes a no-code CLI. Apache 2.0.

Q: How do I install TRL? A: Run pip install trl. Use the CLI with trl sft --model_name_or_path <model> --dataset_name <dataset> or the Python API with SFTTrainer/DPOTrainer classes.
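
Beyond the CLI, the same fine-tuning run can be wired up in a few lines of Python. A hedged sketch, assuming `trl` and `datasets` are installed (the model and dataset IDs are illustrative; imports are deferred inside the function so the file loads even without `trl` present):

```python
def finetune(model_id: str = "Qwen/Qwen2.5-0.5B",
             dataset_id: str = "trl-lib/Capybara") -> None:
    """Supervised fine-tuning sketch using TRL's SFTTrainer."""
    # Deferred imports: requires `pip install trl datasets`.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset(dataset_id, split="train")
    trainer = SFTTrainer(
        model=model_id,  # a model name string is resolved via Transformers
        train_dataset=dataset,
        args=SFTConfig(output_dir="sft-output"),
    )
    trainer.train()  # launches training; a GPU is needed for practical runs
```

Calling `finetune()` downloads the model and dataset from the Hub and starts training; passing a `peft_config` (e.g. a LoRA config) to the trainer enables memory-efficient runs on large models.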



Source and acknowledgements

Created by Hugging Face. Licensed under Apache 2.0. huggingface/trl — 17,900+ GitHub stars

