LitGPT — Fine-Tune and Deploy AI Models Simply
Lightning AI's framework for fine-tuning and serving 20+ LLM families. LitGPT supports LoRA, QLoRA, and full fine-tuning, with one-command training on consumer hardware.
What it is
LitGPT is Lightning AI's framework for fine-tuning and serving large language models. It supports 20+ model families including Llama, Mistral, Gemma, and Phi, with training methods including LoRA, QLoRA, and full fine-tuning. The framework is designed for one-command operations: download a model, chat with it, or fine-tune it with a single CLI command.
LitGPT targets ML engineers and developers who want to fine-tune open-source LLMs without building custom training infrastructure. It handles quantization, memory optimization, and distributed training transparently.
How it saves time or tokens
LitGPT reduces fine-tuning from a multi-day infrastructure project to a single CLI command. QLoRA support means you can fine-tune a 7B parameter model on a single consumer GPU (RTX 3090 or above). The framework handles gradient checkpointing, mixed precision, and data loading automatically. For serving, the same framework deploys the fine-tuned model with optimized inference.
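For illustration, a minimal single-GPU run might look like the sketch below. The --precision and --train.micro_batch_size flags are assumptions about the current CLI (verify with litgpt finetune_lora --help); the --quantize flag also appears in the full example further down.
# Minimal sketch of the memory knobs: 4-bit base weights (QLoRA), mixed precision, small micro-batches
litgpt finetune_lora meta-llama/Llama-3.1-8B-Instruct \
  --quantize bnb.nf4 --precision bf16-true --train.micro_batch_size 1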
How to use
- Install LitGPT:
pip install litgpt
- Download and chat with a model:
litgpt download meta-llama/Llama-3.1-8B-Instruct
litgpt chat meta-llama/Llama-3.1-8B-Instruct
- Fine-tune with LoRA:
litgpt finetune_lora meta-llama/Llama-3.1-8B-Instruct \
--data JSON --data.json_path training_data.json
Example
Complete fine-tuning workflow:
# Prepare training data as JSON
cat training_data.json
# [{"instruction": "Summarize this article", "input": "...", "output": "..."}]
# Fine-tune with QLoRA (fits on 24GB GPU)
litgpt finetune_lora meta-llama/Llama-3.1-8B-Instruct \
--data JSON \
--data.json_path training_data.json \
--quantize bnb.nf4 \
--train.epochs 3
# Chat with the fine-tuned model
litgpt chat checkpoints/fine-tuned/
# Serve the model
litgpt serve checkpoints/fine-tuned/ --port 8000
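Once the server is running, a quick smoke test from another terminal might look like this. The /predict route and the prompt payload shape are assumptions based on LitGPT's default LitServe setup; check the serve documentation for the exact endpoint and the OpenAI-compatible option.
# Send a test prompt to the local server (endpoint path assumed)
curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize this article: ..."}'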
Related on TokRepo
- Local LLM Providers — Tools for running LLMs locally, including Ollama, LM Studio, and vLLM
- AI Tools for Coding — Development tools for AI model training and deployment
Common pitfalls
- QLoRA requires sufficient VRAM for the quantized model plus LoRA adapters. A 7B model needs roughly 12GB VRAM minimum with QLoRA.
- Training data format must match LitGPT's expected schema (instruction/input/output JSON). Misformatted data can fail silently, so validate the file before training; a quick check is sketched after this list.
- Fine-tuned models may overfit on small datasets. Use validation splits and monitor training loss to detect overfitting early.
- Always check the official documentation for the latest version-specific changes and migration guides before upgrading in production environments.
- For team deployments, establish clear guidelines on configuration and usage patterns to ensure consistency across developers.
- Model downloads can be large (several GB for 7B+ models). Ensure adequate disk space and a stable internet connection before starting the download process.
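Because schema mistakes tend to fail quietly, a cheap pre-flight check before launching a run is to confirm the file is valid JSON and that every record carries the expected keys. This is a minimal sketch using standard Python tooling; the key names come from the example schema above.
# Pre-flight check: valid JSON, and instruction/output keys present in every record
python -m json.tool training_data.json > /dev/null && echo "JSON OK"
python -c "import json; rows = json.load(open('training_data.json')); print(sum('instruction' in r and 'output' in r for r in rows), 'of', len(rows), 'records look complete')"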
Frequently Asked Questions
Which model families does LitGPT support?
LitGPT supports 20+ families including Llama, Mistral, Mixtral, Gemma, Phi, Falcon, StableLM, and others. New models are added as they are released. Check the documentation for the current list.
Can I fine-tune on a consumer GPU?
Yes. QLoRA support lets you fine-tune 7B-13B parameter models on GPUs with 24GB VRAM (RTX 3090, RTX 4090). Larger models require multi-GPU setups or more VRAM.
What is the difference between LoRA and QLoRA?
LoRA adds small trainable adapter layers to the model while keeping the base weights frozen. QLoRA adds 4-bit quantization on top of LoRA, reducing VRAM requirements by roughly 4x. QLoRA is recommended for consumer hardware.
Can I serve a fine-tuned model as an API?
Yes. LitGPT includes a serve command that starts an OpenAI-compatible API server for your fine-tuned model. This lets you integrate the model into applications using standard API calls.
Does LitGPT support multi-GPU training?
Yes. LitGPT leverages Lightning's distributed training capabilities. Multi-GPU and multi-node training work transparently for both full fine-tuning and LoRA methods.
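In practice, a multi-GPU run is usually the same fine-tuning command pointed at more devices. The --devices flag below is an assumption based on LitGPT's Lightning-backed CLI; confirm the exact option with litgpt finetune_lora --help.
# Hedged sketch: the same LoRA run spread across 4 local GPUs
litgpt finetune_lora meta-llama/Llama-3.1-8B-Instruct \
  --data JSON --data.json_path training_data.json \
  --devices 4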
Citations (3)
- LitGPT GitHub — LitGPT supports fine-tuning 20+ LLM families
- LoRA Paper — LoRA and QLoRA for parameter-efficient fine-tuning
- Lightning AI — Lightning AI distributed training framework
Source & Thanks
Created by Lightning AI. Licensed under Apache 2.0.
Lightning-AI/litgpt — 10k+ stars