# nanochat — Affordable Open-Source ChatGPT by Karpathy

> An open-source project by Andrej Karpathy demonstrating how to build a capable chatbot for under $100 in compute, using efficient training techniques on small models.

## Quick Use

Clone the repository, install dependencies, then train and chat:

```bash
git clone https://github.com/karpathy/nanochat.git
cd nanochat
pip install -r requirements.txt
python train.py --config configs/small.yaml
python chat.py --model checkpoints/latest
```

## Introduction

nanochat is an open-source project by Andrej Karpathy that demonstrates building a functional chatbot for under $100 in compute costs. It serves as both an educational resource and a practical starting point for training small language models with modern techniques.

## What nanochat Does

- Trains a capable chatbot model from scratch on consumer hardware
- Implements efficient training techniques that minimize compute requirements
- Provides a complete pipeline from data preparation to interactive chat inference
- Includes instruction tuning and RLHF-style alignment on a budget
- Offers a reference implementation for understanding LLM training internals

## Architecture Overview

nanochat implements a transformer-based language model with a streamlined training pipeline. It uses a custom data loading system optimized for small-scale training, mixed-precision training with gradient accumulation (a minimal sketch of this pattern appears in the examples below), and a multi-stage pipeline covering pretraining, supervised fine-tuning, and preference optimization. The codebase is intentionally minimal to serve as a readable reference.

## Self-Hosting & Configuration

- Requires Python 3.10+ with PyTorch and a CUDA-capable GPU (RTX 3090 or better recommended)
- Training configs are YAML files specifying model size, data paths, and hyperparameters (a hypothetical config sketch appears in the examples below)
- Pretrained checkpoints are available for skipping the pretraining phase
- Inference runs on consumer GPUs or CPUs (more slowly) for interactive chat
- No cloud dependencies; the entire pipeline runs on a single machine

## Key Features

- Complete LLM training pipeline in a minimal, readable codebase
- Budget-friendly: full training from scratch costs under $100 in GPU compute
- Multi-stage training covering pretraining, SFT, and preference optimization
- Educational code with clear documentation explaining each component
- Checkpoint compatibility with common inference frameworks for deployment

## Comparison with Similar Tools

- **minimind** — a similar educational LLM trainer; nanochat adds alignment and chat-specific training stages
- **nanoGPT** — Karpathy's earlier project, covering pretraining only; nanochat extends to full chat-model training
- **llama.cpp** — inference-focused; nanochat covers the training side of the pipeline
- **Axolotl** — a fine-tuning toolkit; nanochat provides the full training stack from scratch

## FAQ

**Q: What GPU is needed for training?**
A: An RTX 3090 or 4090 is sufficient for the default model configuration.

**Q: Can I use my own training data?**
A: Yes. The data pipeline accepts JSONL-formatted conversation data (see the JSONL sketch in the examples below).

**Q: How does the output quality compare to commercial models?**
A: nanochat produces a capable conversational model, though it does not match frontier models trained on much larger budgets.

**Q: Is this suitable for production deployment?**
A: nanochat is primarily educational. For production, consider fine-tuning a larger pretrained model.
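## Example: Mixed-Precision Training with Gradient Accumulation

The Architecture Overview mentions mixed-precision training with gradient accumulation. The sketch below shows that general pattern in PyTorch; it is not nanochat's actual training loop, and the tiny linear model, batch shapes, and hyperparameters are placeholders chosen only to make the snippet self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda"  # assumes a CUDA GPU is available
accum_steps = 4  # effective batch size = accum_steps * micro-batch size

# Tiny stand-in model; nanochat's real transformer lives in its repo.
model = nn.Linear(128, 10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
# Device-aware GradScaler (PyTorch 2.3+); older versions use torch.cuda.amp.GradScaler().
scaler = torch.amp.GradScaler(device)

for step in range(100):
    optimizer.zero_grad(set_to_none=True)
    for micro in range(accum_steps):
        x = torch.randn(8, 128, device=device)            # placeholder inputs
        y = torch.randint(0, 10, (8,), device=device)     # placeholder targets
        # Mixed precision: run the forward pass in float16 where it is safe.
        with torch.autocast(device_type=device, dtype=torch.float16):
            loss = F.cross_entropy(model(x), y) / accum_steps  # average over micro-batches
        scaler.scale(loss).backward()  # accumulate scaled gradients
    scaler.step(optimizer)  # unscale and apply once per accumulation window
    scaler.update()
```

Dividing the loss by `accum_steps` keeps the accumulated gradient equal to what a single large batch would produce, which is what makes accumulation a drop-in substitute for more GPU memory.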
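## Example: Training Config Structure

Training configs are described above as YAML files specifying model size, data paths, and hyperparameters. The snippet below shows how such a config might be structured and loaded with PyYAML; the field names here are illustrative assumptions, not nanochat's actual schema, so consult `configs/small.yaml` in the repo for the real layout.

```python
import yaml  # pip install pyyaml

# Hypothetical config in the spirit of configs/small.yaml.
config_text = """
model:
  n_layer: 12
  n_head: 12
  n_embd: 768
data:
  train_path: data/train.jsonl
training:
  lr: 3.0e-4
  batch_size: 32
"""

config = yaml.safe_load(config_text)
print(config["model"]["n_embd"], config["training"]["lr"])
```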
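## Example: JSONL Conversation Data

The FAQ notes that the data pipeline accepts JSONL-formatted conversation data, meaning one JSON object per line. The sketch below writes and reads a file in that shape; the exact field names nanochat expects (`messages`, `role`, `content` here) are assumptions, so check the repo's data documentation before preparing a dataset.

```python
import json

# Hypothetical conversation record; one conversation per JSONL line.
examples = [
    {"messages": [
        {"role": "user", "content": "What is nanochat?"},
        {"role": "assistant", "content": "A small open-source chatbot trainer."},
    ]},
]

with open("my_data.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

with open("my_data.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(len(rows), "conversations loaded")
```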
## Sources

- https://github.com/karpathy/nanochat
- https://karpathy.ai/