Prompts · Apr 8, 2026 · 2 min read

LitGPT — Fine-Tune and Deploy AI Models Simply

Lightning AI's framework for fine-tuning and serving 20+ LLM families. LitGPT supports LoRA, QLoRA, and full fine-tuning, with one-command training on consumer hardware.

Prompt Lab · Community
Quick Use

Use it first, then decide how deep to go

The commands below cover everything needed to get started: install, download a model, chat, fine-tune, and serve.

# Install
pip install litgpt
# Download a model
litgpt download meta-llama/Llama-3.1-8B-Instruct

# Chat locally
litgpt chat meta-llama/Llama-3.1-8B-Instruct

# Fine-tune with LoRA
litgpt finetune_lora meta-llama/Llama-3.1-8B-Instruct \
  --data JSON --data.json_path training_data.json

# Serve as API
litgpt serve meta-llama/Llama-3.1-8B-Instruct

What is LitGPT?

LitGPT is Lightning AI's framework for fine-tuning, pretraining, and deploying large language models. It supports 20+ model families (Llama, Mistral, Phi, Gemma, etc.) with multiple fine-tuning methods (LoRA, QLoRA, full). One-command workflows make it accessible on consumer GPUs. Built on PyTorch Lightning for scalable training.

Answer-Ready: LitGPT is Lightning AI's LLM fine-tuning and serving framework. Supports 20+ model families, LoRA/QLoRA/full fine-tuning, one-command training on consumer GPUs. Built on PyTorch Lightning. Download, fine-tune, and serve with simple CLI commands. 10k+ GitHub stars.

Best for: ML engineers fine-tuning open-source models on their own hardware. Works with: Llama, Mistral, Phi, Gemma, Falcon, and 15+ other model families. Setup time: Under 5 minutes.

Core Features

1. One-Command Workflows

litgpt download <model>        # Download model weights
litgpt chat <model>            # Interactive chat
litgpt finetune_lora <model>   # LoRA fine-tuning
litgpt finetune <model>        # Full fine-tuning
litgpt serve <model>           # API server
litgpt evaluate <model>        # Benchmark evaluation
litgpt pretrain <model>        # Pretrain from scratch

2. Supported Model Families (20+)

Family        Models
Meta          Llama 3.1, Llama 3.2, CodeLlama
Mistral       Mistral 7B, Mixtral
Google        Gemma, Gemma 2
Microsoft     Phi-3, Phi-3.5
Alibaba       Qwen 2.5
TII           Falcon
StabilityAI   StableLM

3. Fine-Tuning Methods

Method           VRAM Needed   Quality
QLoRA (4-bit)    6 GB          Good
LoRA             12 GB         Very good
Full fine-tune   40 GB+        Best
Adapter          8 GB          Good
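The VRAM numbers in the table roughly track the memory the model weights alone occupy at each precision. As a back-of-envelope sketch (the `weight_memory_gb` helper is illustrative, not part of LitGPT; real usage adds activations, optimizer state, and adapter weights on top, which is why the table's figures are higher than weights alone):

```python
def weight_memory_gb(n_params_billion, bits_per_param):
    """Weight-only memory estimate. Actual VRAM use adds
    activations, optimizer state, and the KV cache on top."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # gigabytes

# A 7B model's weights alone:
print(f"4-bit (QLoRA): {weight_memory_gb(7, 4):.1f} GB")    # 3.5 GB
print(f"16-bit (LoRA): {weight_memory_gb(7, 16):.1f} GB")   # 14.0 GB
```

This is why a 7B model fits in QLoRA's 6 GB budget: the 4-bit weights take about 3.5 GB, leaving headroom for the small LoRA adapters and activations.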

4. Training Data Format

[
  {"instruction": "Summarize this text", "input": "Long article...", "output": "Brief summary..."},
  {"instruction": "Translate to French", "input": "Hello world", "output": "Bonjour le monde"}
]
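A dataset in this shape can be generated programmatically. A minimal sketch that writes the instruction/input/output records to `training_data.json`, matching the filename used in the Quick Use fine-tuning command:

```python
import json

# Each record: an instruction, optional input context, and the target output.
examples = [
    {"instruction": "Summarize this text",
     "input": "Long article...",
     "output": "Brief summary..."},
    {"instruction": "Translate to French",
     "input": "Hello world",
     "output": "Bonjour le monde"},
]

with open("training_data.json", "w") as f:
    json.dump(examples, f, indent=2)
```

Point `--data JSON --data.json_path training_data.json` at the resulting file, as in the Quick Use block.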

5. Multi-GPU Training

# 4 GPU training with FSDP
litgpt finetune_lora meta-llama/Llama-3.1-8B-Instruct \
  --devices 4 --strategy fsdp

FAQ

Q: Can I fine-tune on a single consumer GPU? A: Yes, QLoRA needs only 6GB VRAM. A 7B model fine-tunes on an RTX 3060.

Q: How does it compare to Unsloth? A: Unsloth is faster for single-GPU LoRA. LitGPT offers more model families, multi-GPU, and full pretraining support.

Q: Can I serve the fine-tuned model? A: Yes, litgpt serve starts an inference server with a REST endpoint you can query over HTTP.
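A minimal client sketch for a locally running server. The `/predict` route on port 8000 and the `{"prompt": ...}` payload follow LitGPT's documented defaults, but verify both against your installed version; the helper names here are illustrative:

```python
import json
import urllib.request

def build_payload(prompt):
    # The default `litgpt serve` route accepts a JSON body with a "prompt" key.
    return json.dumps({"prompt": prompt}).encode()

def query(prompt, url="http://127.0.0.1:8000/predict"):
    """POST a prompt to a running `litgpt serve` instance."""
    req = urllib.request.Request(
        url, data=build_payload(prompt),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Requires `litgpt serve <model>` running locally first:
# print(query("What is LoRA?"))
```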


Source & Thanks

Created by Lightning AI. Licensed under Apache 2.0.

Lightning-AI/litgpt — 10k+ stars
