# TRL — Post-Training LLMs with RLHF & DPO

> TRL is a Hugging Face library for post-training foundation models. 17.9K+ GitHub stars. SFT, GRPO, DPO, reward modeling. Scales from single GPU to multi-node. Apache 2.0.

## Install

```bash
pip install trl
```

## Quick Use

```bash
# Fine-tune with the CLI (no code needed)
trl sft --model_name_or_path Qwen/Qwen2.5-0.5B \
  --dataset_name trl-lib/Capybara \
  --output_dir sft-output
```

Or in Python:

```python
from datasets import load_dataset
from trl import SFTTrainer
from transformers import AutoModelForCausalLM, AutoTokenizer

dataset = load_dataset("trl-lib/Capybara", split="train")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")

# Recent TRL/Transformers versions use `processing_class` instead of `tokenizer`
trainer = SFTTrainer(model=model, processing_class=tokenizer, train_dataset=dataset)
trainer.train()
```

---

## Intro

TRL (Transformers Reinforcement Learning) is a Hugging Face library for post-training foundation models using techniques such as Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), and reward modeling. With 17,900+ GitHub stars and an Apache 2.0 license, TRL scales from a single GPU to multi-node clusters via Accelerate and DeepSpeed. It includes a CLI for quick fine-tuning without code and integrates with PEFT for efficient training on large models.
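To make the preference-tuning idea concrete, the DPO objective that a trainer like `DPOTrainer` optimizes can be sketched for a single preference pair in plain Python. The log-probability values below are illustrative numbers, not real model outputs; this is a minimal sketch of the loss, not TRL's implementation.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin compares how much more the policy (vs. the frozen
    reference model) favors the chosen response over the rejected one."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log sigmoid(x) == log(1 + exp(-x)), computed stably with log1p
    return math.log1p(math.exp(-logits))

# When the policy favors the chosen answer more than the reference does,
# the margin is positive and the loss drops below log(2) (= loss at zero margin).
loss = dpo_loss(-10.0, -20.0, -12.0, -18.0)
```

Driving this loss down pushes the policy to increase the likelihood gap between chosen and rejected responses relative to the reference model, which is what keeps DPO stable without an explicit reward model.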
**Best for**: ML engineers fine-tuning LLMs with human preference data or custom datasets
**Works with**: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
**Ecosystem**: Hugging Face Transformers, Accelerate, DeepSpeed, PEFT

---

## Key Features

- **Multiple trainers**: `SFTTrainer`, `GRPOTrainer`, `DPOTrainer`, `RewardTrainer`, `PPOTrainer`
- **CLI interface**: fine-tune models without writing code
- **Scalable**: single GPU to multi-node clusters via Accelerate and DeepSpeed
- **PEFT integration**: LoRA and QLoRA for efficient training on large models
- **Built on Transformers**: full compatibility with the Hugging Face ecosystem

---

## FAQ

**Q: What is TRL?**
A: TRL is a Hugging Face library with 17.9K+ stars for post-training LLMs using SFT, DPO, GRPO, and reward modeling. It scales from single GPU to multi-node and includes a no-code CLI. Apache 2.0.

**Q: How do I install TRL?**
A: Run `pip install trl`. Use the CLI with `trl sft --model_name_or_path <model> --dataset_name <dataset>` or the Python API with the `SFTTrainer`/`DPOTrainer` classes.

---

## Source & Thanks

> Created by [Hugging Face](https://github.com/huggingface). Licensed under Apache 2.0.
> [huggingface/trl](https://github.com/huggingface/trl) — 17,900+ GitHub stars

---

Source: https://tokrepo.com/en/workflows/7989ba1c-3daf-4fd1-bf5b-f1b16b6ca990
Author: Script Depot