
Axolotl — Streamlined LLM Fine-Tuning

Axolotl streamlines post-training and fine-tuning for LLMs. 11.6K+ GitHub stars. LoRA, QLoRA, DPO, GRPO, multimodal training. Single YAML config. Flash Attention, multi-GPU. Apache 2.0.

TL;DR
Axolotl simplifies LLM fine-tuning with LoRA, QLoRA, DPO, and GRPO via one YAML config.
§01

What it is

Axolotl is a framework that streamlines post-training and fine-tuning for large language models. It supports LoRA, QLoRA, full fine-tuning, DPO, GRPO, and multimodal training. You define your entire training run in a single YAML configuration file, and Axolotl handles data loading, tokenization, training, and evaluation.

Axolotl targets ML engineers who want to fine-tune foundation models without writing custom training scripts for each experiment.

§02

How it saves time or tokens

Axolotl eliminates boilerplate training code. Instead of writing data loaders, configuring optimizers, and managing device placement, you specify parameters in YAML and run one command. Switching from LoRA to QLoRA to full fine-tuning is a config change, not a code rewrite.
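
For instance, a minimal sketch of what that swap looks like in practice (the key names mirror the example config below; treat the exact values as illustrative rather than canonical):

# QLoRA: quantize the base model to 4-bit and train low-rank adapters
adapter: qlora
load_in_4bit: true

# LoRA: same adapter settings, base model kept in higher precision
adapter: lora
load_in_4bit: false

# Full fine-tuning: omit the adapter and quantization keys entirely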

Flash Attention integration and multi-GPU support via DeepSpeed mean training runs faster without manual optimization.

§03

How to use

  1. Install Axolotl: pip install axolotl
  2. Copy an example YAML config for your model and training method
  3. Customize the config with your dataset, model, and hyperparameters
  4. Run training: accelerate launch -m axolotl.cli.train config.yml
§04

Example

# config.yml
base_model: meta-llama/Meta-Llama-3-8B   # Hugging Face repo ID of the base model
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

# QLoRA: load the base model in 4-bit and train low-rank adapters
load_in_4bit: true
adapter: qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true   # apply adapters to all linear projection layers

datasets:
  - path: tatsu-lab/alpaca   # Hugging Face dataset in Alpaca instruction format
    type: alpaca

output_dir: ./outputs/llama3-qlora

sequence_len: 4096
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 0.0002
num_epochs: 3
optimizer: adamw_bnb_8bit   # 8-bit AdamW from bitsandbytes

flash_attention: true       # requires an Ampere or newer NVIDIA GPU
wandb_project: my-finetune  # optional Weights & Biases project for logging

Run with: accelerate launch -m axolotl.cli.train config.yml

§05

Common pitfalls

  • QLoRA with 4-bit quantization requires bitsandbytes, which only works on NVIDIA GPUs; Apple Silicon and AMD users need different quantization methods
  • Incorrect sequence_len causes OOM or truncated training data; check your dataset's token length distribution first
  • Flash Attention requires a compatible GPU (Ampere or newer); disable it on older hardware to avoid crashes; see the config sketch below
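
A minimal sketch of the corresponding config adjustments (all keys appear in the example above; the values are illustrative):

# Pre-Ampere GPUs: disable Flash Attention
flash_attention: false

# If long sequences cause OOM, shrink sequence_len and trade batch size for accumulation
sequence_len: 2048
micro_batch_size: 1
gradient_accumulation_steps: 8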

Frequently Asked Questions

How does Axolotl compare to TRL?

TRL provides individual trainer classes (SFTTrainer, DPOTrainer) that you compose in Python code. Axolotl wraps the entire training pipeline in a YAML config with opinionated defaults. Axolotl is faster to get started; TRL gives more programmatic control.

Which models does Axolotl support?

Axolotl supports LLaMA, Mistral, Gemma, Phi, Qwen, and most models available on Hugging Face. Any model compatible with Hugging Face Transformers can be used by specifying the correct model_type in the config.
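
As an illustration, a sketch of the model-related lines for a Mistral run (the repo ID and class name follow Hugging Face conventions; verify them against the model card):

base_model: mistralai/Mistral-7B-v0.1
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer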

Can I train multimodal models with Axolotl?

Yes. Axolotl supports multimodal training for vision-language models. Configure the multimodal dataset format and model type in the YAML config. This feature supports LLaVA-style architectures.

What hardware do I need?

For QLoRA fine-tuning of an 8B model, a single GPU with 24 GB VRAM (RTX 3090, A5000) is sufficient. Full fine-tuning of larger models requires multi-GPU setups. Axolotl integrates with DeepSpeed and FSDP for distributed training.
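
A hedged sketch of how DeepSpeed is typically wired in (the zero2.json path is an assumption; Axolotl's repository ships example DeepSpeed configs you can point to):

# Point Axolotl at a DeepSpeed ZeRO config (path is illustrative)
deepspeed: deepspeed_configs/zero2.json
# Then launch as usual: accelerate launch -m axolotl.cli.train config.yml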

Does Axolotl support experiment tracking?

Yes. Axolotl integrates with Weights and Biases (wandb) and MLflow. Set the wandb_project or mlflow_tracking_uri in your YAML config to log metrics, hyperparameters, and artifacts automatically.
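
A sketch of the relevant config lines (wandb_project and mlflow_tracking_uri come from the answer above; wandb_entity and wandb_name are assumed optional fields, so confirm them in the docs):

# Weights & Biases
wandb_project: my-finetune
wandb_entity: my-team          # assumed optional field
wandb_name: llama3-qlora-run   # assumed optional field

# MLflow (alternative)
mlflow_tracking_uri: http://localhost:5000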

Citations (3)
  • Axolotl GitHub — Axolotl streamlines LLM fine-tuning with 11.6K+ GitHub stars
  • arXiv — QLoRA: Efficient Finetuning of Quantized Language Models
  • arXiv — Flash Attention for efficient transformer training

Source & Thanks

Created by Axolotl AI. Licensed under Apache 2.0. axolotl-ai-cloud/axolotl — 11,600+ GitHub stars
