# nanochat — Affordable Open-Source ChatGPT by Karpathy

> An open-source project by Andrej Karpathy demonstrating how to build a capable chatbot for under $100 in compute, using efficient training techniques on small models.

## Quick Use

Clone the repository, install dependencies, then train and chat:

```bash
git clone https://github.com/karpathy/nanochat.git
cd nanochat
pip install -r requirements.txt
python train.py --config configs/small.yaml
python chat.py --model checkpoints/latest
```

## Introduction

nanochat is an open-source project by Andrej Karpathy that demonstrates building a functional chatbot for under $100 in compute costs. It serves as both an educational resource and a practical starting point for training small language models with modern techniques.

## What nanochat Does

- Trains a capable chatbot model from scratch on consumer hardware
- Implements efficient training techniques that minimize compute requirements
- Provides a complete pipeline from data preparation to interactive chat inference
- Includes instruction tuning and RLHF-style alignment on a budget
- Offers a reference implementation for understanding LLM training internals

## Architecture Overview

nanochat implements a transformer-based language model with a streamlined training pipeline. It uses a custom data loading system optimized for small-scale training, mixed-precision training with gradient accumulation (a minimal sketch of this pattern appears in the examples below), and a multi-stage pipeline covering pretraining, supervised fine-tuning, and preference optimization. The codebase is intentionally minimal to serve as a readable reference.

## Self-Hosting & Configuration

- Requires Python 3.10+ with PyTorch and a CUDA-capable GPU (RTX 3090 or better recommended)
- Training configs are YAML files specifying model size, data paths, and hyperparameters (a hypothetical config sketch appears in the examples below)
- Pretrained checkpoints are available for skipping the pretraining phase
- Inference runs on consumer GPUs or CPUs (more slowly) for interactive chat
- No cloud dependencies; the entire pipeline runs on a single machine

## Key Features

- Complete LLM training pipeline in a minimal, readable codebase
- Budget-friendly: full training from scratch costs under $100 in GPU compute
- Multi-stage training covering pretraining, SFT, and preference optimization
- Educational code with clear documentation explaining each component
- Checkpoint compatibility with common inference frameworks for deployment

## Comparison with Similar Tools

- **minimind** — a similar educational LLM trainer; nanochat adds alignment and chat-specific training stages
- **nanoGPT** — Karpathy's earlier project, covering pretraining only; nanochat extends to full chat-model training
- **llama.cpp** — inference-focused; nanochat covers the training side of the pipeline
- **Axolotl** — a fine-tuning toolkit; nanochat provides the full training stack from scratch

## FAQ

**Q: What GPU is needed for training?**
A: An RTX 3090 or 4090 is sufficient for the default model configuration.

**Q: Can I use my own training data?**
A: Yes. The data pipeline accepts JSONL-formatted conversation data (see the JSONL sketch in the examples below).

**Q: How does the output quality compare to commercial models?**
A: nanochat produces a capable conversational model, though it does not match frontier models trained on much larger budgets.

**Q: Is this suitable for production deployment?**
A: nanochat is primarily educational. For production, consider fine-tuning a larger pretrained model.
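## Example: Mixed-Precision Training with Gradient Accumulation

The Architecture Overview mentions mixed-precision training with gradient accumulation. The sketch below shows that general pattern in PyTorch; it is not nanochat's actual training loop, and the tiny linear model, batch shapes, and hyperparameters are placeholders chosen only to make the snippet self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda"  # assumes a CUDA GPU is available
accum_steps = 4  # effective batch size = accum_steps * micro-batch size

# Tiny stand-in model; nanochat's real transformer lives in its repo.
model = nn.Linear(128, 10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
# Device-aware GradScaler (PyTorch 2.3+); older versions use torch.cuda.amp.GradScaler().
scaler = torch.amp.GradScaler(device)

for step in range(100):
    optimizer.zero_grad(set_to_none=True)
    for micro in range(accum_steps):
        x = torch.randn(8, 128, device=device)            # placeholder inputs
        y = torch.randint(0, 10, (8,), device=device)     # placeholder targets
        # Mixed precision: run the forward pass in float16 where it is safe.
        with torch.autocast(device_type=device, dtype=torch.float16):
            loss = F.cross_entropy(model(x), y) / accum_steps  # average over micro-batches
        scaler.scale(loss).backward()  # accumulate scaled gradients
    scaler.step(optimizer)  # unscale and apply once per accumulation window
    scaler.update()
```

Dividing the loss by `accum_steps` keeps the accumulated gradient equal to what a single large batch would produce, which is what makes accumulation a drop-in substitute for more GPU memory.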
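## Example: Training Config Structure

Training configs are described above as YAML files specifying model size, data paths, and hyperparameters. The snippet below shows how such a config might be structured and loaded with PyYAML; the field names here are illustrative assumptions, not nanochat's actual schema, so consult `configs/small.yaml` in the repo for the real layout.

```python
import yaml  # pip install pyyaml

# Hypothetical config in the spirit of configs/small.yaml.
config_text = """
model:
  n_layer: 12
  n_head: 12
  n_embd: 768
data:
  train_path: data/train.jsonl
training:
  lr: 3.0e-4
  batch_size: 32
"""

config = yaml.safe_load(config_text)
print(config["model"]["n_embd"], config["training"]["lr"])
```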
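## Example: JSONL Conversation Data

The FAQ notes that the data pipeline accepts JSONL-formatted conversation data, meaning one JSON object per line. The sketch below writes and reads a file in that shape; the exact field names nanochat expects (`messages`, `role`, `content` here) are assumptions, so check the repo's data documentation before preparing a dataset.

```python
import json

# Hypothetical conversation record; one conversation per JSONL line.
examples = [
    {"messages": [
        {"role": "user", "content": "What is nanochat?"},
        {"role": "assistant", "content": "A small open-source chatbot trainer."},
    ]},
]

with open("my_data.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

with open("my_data.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(len(rows), "conversations loaded")
```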
## Sources

- https://github.com/karpathy/nanochat
- https://karpathy.ai/