Scripts · Mar 31, 2026 · 2 min read

TRL — Post-Training LLMs with RLHF & DPO

TRL is a Hugging Face library for post-training foundation models with SFT, GRPO, DPO, and reward modeling. It scales from a single GPU to multi-node clusters. 17.9K+ GitHub stars, Apache 2.0 licensed.

Introduction

TRL (Transformers Reinforcement Learning) is a Hugging Face library for post-training foundation models using techniques like Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), and reward modeling. With 17,900+ GitHub stars and an Apache 2.0 license, TRL scales from a single GPU to multi-node clusters via Accelerate and DeepSpeed. It includes a CLI for quick fine-tuning without code, and integrates with PEFT for efficient training on large models.

Best for: ML engineers fine-tuning LLMs with human preference data or custom datasets
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Ecosystem: Hugging Face Transformers, Accelerate, DeepSpeed, PEFT


Key Features

  • Multiple trainers: SFTTrainer, GRPOTrainer, DPOTrainer, RewardTrainer, PPOTrainer
  • CLI interface: Fine-tune models without writing code
  • Scalable: Single GPU to multi-node clusters via Accelerate and DeepSpeed
  • PEFT integration: LoRA and QLoRA for efficient training on large models
  • Built on Transformers: Full compatibility with Hugging Face ecosystem
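
Each trainer expects a particular dataset shape. As a minimal sketch, the two most common formats look like this (the column names `messages`, `prompt`, `chosen`, and `rejected` follow TRL's documented dataset formats; the example values are illustrative):

```python
# SFTTrainer accepts conversational data: a "messages" column of chat turns.
def to_sft_example(user_msg: str, assistant_msg: str) -> dict:
    return {
        "messages": [
            {"role": "user", "content": user_msg},
            {"role": "assistant", "content": assistant_msg},
        ]
    }

# DPOTrainer expects preference pairs: a prompt plus a chosen and a
# rejected completion, from which the preference signal is derived.
def to_dpo_example(prompt: str, chosen: str, rejected: str) -> dict:
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

sft_row = to_sft_example("What is TRL?", "A library for post-training LLMs.")
dpo_row = to_dpo_example("Summarize TRL.", "Concise summary.", "Off-topic reply.")
```

Datasets on the Hugging Face Hub that already use these column names can be passed to the corresponding trainer without extra preprocessing.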

FAQ

Q: What is TRL? A: TRL is a Hugging Face library with 17.9K+ stars for post-training LLMs using SFT, DPO, GRPO, and reward modeling. It scales from single GPU to multi-node and includes a no-code CLI. Apache 2.0.

Q: How do I install TRL? A: Run pip install trl. Use the CLI with trl sft --model_name_or_path <model> --dataset_name <dataset> or the Python API with SFTTrainer/DPOTrainer classes.
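
Beyond the CLI, the same fine-tuning run can be wired up in a few lines of Python. A hedged sketch, assuming `trl` and `datasets` are installed (the model and dataset IDs are illustrative; imports are deferred inside the function so the file loads even without `trl` present):

```python
def finetune(model_id: str = "Qwen/Qwen2.5-0.5B",
             dataset_id: str = "trl-lib/Capybara") -> None:
    """Supervised fine-tuning sketch using TRL's SFTTrainer."""
    # Deferred imports: requires `pip install trl datasets`.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset(dataset_id, split="train")
    trainer = SFTTrainer(
        model=model_id,  # a model name string is resolved via Transformers
        train_dataset=dataset,
        args=SFTConfig(output_dir="sft-output"),
    )
    trainer.train()  # launches training; a GPU is needed for practical runs
```

Calling `finetune()` downloads the model and dataset from the Hub and starts training; passing a `peft_config` (e.g. a LoRA config) to the trainer enables memory-efficient runs on large models.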



Source and acknowledgements

Created by Hugging Face. Licensed under Apache 2.0. huggingface/trl — 17,900+ GitHub stars

