Skills · April 26, 2026 · 1 min read

LlamaFactory — Unified Fine-Tuning for 100+ LLMs

An open-source framework that unifies efficient fine-tuning methods for over 100 large language models including LLaMA, Qwen, Mistral, and more, with a web UI and CLI.

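Agent ready

Agents can install this asset directly: select the current runtime, review the install plan, then run the matching command.

Native · 98/100 · Policy: allowed
Agent entry: any MCP/CLI agent
Type: Skill
Install: Single
Trust level: Established
Entry point: LlamaFactory

Direct install command:
npx -y tokrepo@latest install 2eb5ce3d-416b-11f1-9bc6-00163e2b0d79 --target codex

Confirm the install plan with a dry run before running this command.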

Introduction

LlamaFactory provides a unified interface for fine-tuning over 100 large language models using methods such as LoRA, QLoRA, full-parameter tuning, and RLHF. It removes the need to write a custom training script for each model architecture: training runs are configured through a web UI or YAML files.

What LlamaFactory Does

  • Supports 100+ LLM architectures including LLaMA, Qwen, Mistral, Gemma, ChatGLM, and Phi
  • Implements multiple fine-tuning methods: full, freeze, LoRA, QLoRA, DoRA, and LongLoRA
  • Provides preference alignment via PPO-based RLHF as well as the DPO, KTO, and ORPO preference-optimization algorithms
  • Includes a built-in Gradio web UI (LLaMA Board) for no-code training configuration
  • Handles dataset preprocessing, tokenization, and multi-GPU distributed training automatically
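To make the appeal of the adapter methods listed above concrete, here is a back-of-the-envelope sketch of how many parameters LoRA trains for a single weight matrix compared with full fine-tuning. The 4096 x 4096 shape and rank 8 are illustrative assumptions, not values taken from LlamaFactory or any particular model.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the dense weight matrix that full fine-tuning would update."""
    return d_in * d_out


# Illustrative shape: one square projection matrix (assumed, not model-specific).
d_in, d_out, rank = 4096, 4096, 8
lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
print(f"LoRA: {lora:,} trainable vs full: {full:,} ({lora / full:.2%})")
```

Under these assumptions the adapter trains well under one percent of the matrix's parameters, which is the main reason LoRA-style methods fit on smaller GPUs.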

Architecture Overview

LlamaFactory wraps Hugging Face Transformers and PEFT libraries into a unified training pipeline. A YAML-based configuration system maps to model loaders, adapter injectors, and trainer classes. The framework dynamically selects the right tokenizer, chat template, and training strategy based on the model type and chosen fine-tuning method.
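As a rough illustration of the YAML-driven configuration described above, the sketch below assembles a LoRA supervised fine-tuning config and renders it as YAML. The keys follow the style of LlamaFactory's published example configs, but the model ID, dataset name, template, and output path are placeholder assumptions, and the small to_yaml helper is hypothetical (it exists only to avoid a third-party YAML dependency).

```python
# Placeholder values: swap in your own model, dataset, and output directory.
config = {
    "model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",
    "stage": "sft",                # supervised fine-tuning
    "do_train": True,
    "finetuning_type": "lora",     # adapter method to inject
    "lora_target": "all",
    "dataset": "my_dataset",       # hypothetical registered dataset name
    "template": "llama3",          # chat template matching the model family
    "cutoff_len": 2048,
    "output_dir": "saves/llama3-8b/lora/sft",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "learning_rate": 0.0001,
    "num_train_epochs": 3.0,
    "bf16": True,
}


def to_yaml(flat: dict) -> str:
    """Serialize a flat dict of scalars to YAML without third-party packages."""
    lines = []
    for key, value in flat.items():
        if isinstance(value, bool):
            text = "true" if value else "false"  # YAML booleans are lowercase
        else:
            text = str(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines) + "\n"


yaml_text = to_yaml(config)
print(yaml_text)
```

Saved to a file such as llama3_lora_sft.yaml (a hypothetical name), a config like this would typically be passed to the command-line entry point, for example llamafactory-cli train llama3_lora_sft.yaml.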

Self-Hosting & Configuration

  • Install via pip or use the official Docker image with CUDA support
  • Configure training runs through YAML files or the web UI
  • Supports multi-GPU training via DeepSpeed ZeRO stages 2 and 3
  • Integrates with Weights and Biases, MLflow, and TensorBoard for experiment tracking
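  • Export merged models in Hugging Face format, either locally or to the Hugging Face Hub; producing GGUF or ONNX files generally requires a separate conversion tool (for example, llama.cpp for GGUF)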
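For the DeepSpeed option in the list above, a minimal sketch of a ZeRO configuration file might look like the following. It assumes the Hugging Face Trainer integration, where "auto" values are filled in from the training arguments; the zero_config helper is hypothetical, and real setups usually add further stage-specific options.

```python
import json


def zero_config(stage: int) -> dict:
    """Build a minimal DeepSpeed config for ZeRO stage 2 or 3.

    "auto" values are placeholders that the Hugging Face Trainer integration
    replaces with values from its own training arguments.
    """
    if stage not in (2, 3):
        raise ValueError("this sketch only covers ZeRO stages 2 and 3")
    return {
        "train_batch_size": "auto",
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
        "bf16": {"enabled": "auto"},
        "zero_optimization": {"stage": stage},
    }


print(json.dumps(zero_config(2), indent=2))
```

The resulting JSON would be written to a file and referenced from the training config through the deepspeed key, which takes the path to that file.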

Key Features

  • Single framework covering supervised fine-tuning, reward modeling, and RLHF/DPO alignment
  • Quantized training with 4-bit and 8-bit precision to reduce GPU memory requirements
  • Built-in evaluation with BLEU, ROUGE, and custom benchmark support
  • Flash Attention 2 and gradient checkpointing for memory-efficient training
  • Dataset mixing and streaming for handling large-scale instruction datasets
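The memory benefit of quantized training mentioned above can be estimated with simple arithmetic. The sketch below counts weight storage only; it deliberately ignores activations, gradients, optimizer state, and quantization metadata, so real usage will be higher. The 7e9 parameter count is a round-number stand-in for a "7B" model.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Compare 16-bit, 8-bit, and 4-bit storage for an assumed 7e9-parameter model.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(7e9, bits):.1f} GB")
```

Going from 16-bit to 4-bit weights cuts this component by a factor of four, which is what makes single-GPU fine-tuning of 7B-class models practical.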

Comparison with Similar Tools

  • Axolotl: similar scope, but LlamaFactory covers more model architectures and alignment methods
  • Unsloth: focuses on training speed and memory optimization; LlamaFactory offers broader method support
  • TRL: a lower-level Hugging Face library; LlamaFactory provides a higher-level, UI-driven workflow
  • torchtune: PyTorch-native fine-tuning that supports fewer model architectures
  • Ludwig: a general-purpose declarative ML framework; LlamaFactory specializes in LLM fine-tuning

FAQ

Q: What GPU do I need to fine-tune a 7B model?
A: With 4-bit QLoRA, a 7B model can be fine-tuned on a single GPU with 16 GB of VRAM. Full fine-tuning requires significantly more memory, because gradients and optimizer state must be held for every parameter.

Q: Can I fine-tune multimodal (vision-language) models?
A: Yes. LlamaFactory supports fine-tuning VLMs such as LLaVA, Qwen-VL, and InternVL with image-text datasets.

Q: Does it support multi-node distributed training?
A: Yes, via DeepSpeed and Hugging Face Accelerate for multi-node, multi-GPU setups.

Q: How do I use my own dataset?
A: Save your dataset in JSON or JSONL format and register it in the dataset configuration file (dataset_info.json in the data directory). The web UI also allows uploading datasets directly.
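A minimal sketch of that dataset registration step is shown below. It writes two toy records in the Alpaca-style instruction/input/output layout plus a matching registry entry. The dataset name, file name, and temporary directory are hypothetical; in a real project you would add the entry to the existing dataset_info.json rather than create a new file.

```python
import json
import tempfile
from pathlib import Path

# Two toy records in the Alpaca-style instruction/input/output layout.
records = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"},
]

# Registry entry: "my_dataset" and its file name are hypothetical.
dataset_info = {
    "my_dataset": {
        "file_name": "my_dataset.json",
        "columns": {"prompt": "instruction", "query": "input", "response": "output"},
    }
}

data_dir = Path(tempfile.mkdtemp())  # stand-in for the project's data directory
(data_dir / "my_dataset.json").write_text(
    json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
)
(data_dir / "dataset_info.json").write_text(
    json.dumps(dataset_info, indent=2), encoding="utf-8"
)
print(sorted(p.name for p in data_dir.iterdir()))
```

Once registered, the dataset would be selected by name in the training config (for example, dataset: my_dataset).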
