Quick Use
- Local: ollama pull deepseek-coder:6.7b, then point Continue / Aider / Cursor at the local model
- Hosted: call the DeepSeek API with model="deepseek-coder" (minimal sketch below)
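A minimal hosted-API sketch, assuming the OpenAI-compatible endpoint at https://api.deepseek.com, the openai Python SDK, and a DEEPSEEK_API_KEY in your environment; the model id follows this doc's model="deepseek-coder":

```python
import os
from openai import OpenAI

# DeepSeek's hosted API speaks the OpenAI wire protocol, so the stock SDK
# works; only the base_url and API key change.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-coder",
    messages=[{"role": "user",
               "content": "Write a Rust function that returns the nth "
                          "Fibonacci number with memoization."}],
)
print(resp.choices[0].message.content)
```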
Intro
DeepSeek Coder is a family of code-specialized open-weight models, trained on 2T tokens (87% code) across 100+ languages, with native fill-in-the-middle (FIM) support for tab autocomplete. The V2 MoE outperforms Codestral and approaches GPT-4o on HumanEval and MBPP at a fraction of the cost (see the table below). Best for: local tab autocomplete via Continue or Cursor's local mode, and code-heavy production agents that need cheap inference. Works with: Ollama, vLLM, llama.cpp, DeepSeek API, Continue, Aider. Setup time: about 2 minutes.
Local with Ollama
ollama pull deepseek-coder:6.7b # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b # ~20GB, M3 Pro / 4090 territory
# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the nth Fibonacci number with memoization.
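To hit the local model from code instead of the CLI, Ollama exposes an HTTP API on port 11434. A quick sketch using requests, with the model tag as pulled above:

```python
import requests

# Chat with the local model via Ollama's /api/chat endpoint.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "deepseek-coder:6.7b",
        "messages": [{"role": "user",
                      "content": "Explain Rust's borrow checker in two sentences."}],
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["message"]["content"])
```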
Use as tab autocomplete in Continue
// Continue's config.json (typically ~/.continue/config.json)
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}
Use with Aider
# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder
# Local (Ollama, no API key needed)
aider --model ollama/deepseek-coder:33b
Fill-in-middle (FIM) format
DeepSeek Coder's tab-completion uses a specific FIM format:
<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>
Continue / Aider / Cursor handle this automatically. If you're integrating manually, use the FIM special tokens exactly as above (note the fullwidth ｜ and ▁ characters); completions are 10-30% better than naive prompting.
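A manual FIM sketch against a local Ollama server, assuming the base (completion-trained) model tag and Ollama's raw mode, which passes the special tokens through untemplated; the prefix/suffix strings are illustrative:

```python
import requests

prefix = "def fib(n: int, memo: dict | None = None) -> int:\n    "
suffix = "\n    return memo[n]"

# Assemble the FIM prompt; the model generates the "middle" that sits
# between prefix and suffix.
prompt = f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-coder:6.7b",
        "prompt": prompt,
        "raw": True,           # skip Ollama's prompt template
        "stream": False,
        "options": {"num_predict": 128},
    },
    timeout=120,
)
print(resp.json()["response"])  # the completion that fills the hole
```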
Pricing & versions
| Variant | Params | RAM (4-bit) | HumanEval Pass@1 |
|---|---|---|---|
| deepseek-coder:1.3b | 1.3B | ~1GB | ~38% |
| deepseek-coder:6.7b | 6.7B | ~4GB | ~58% |
| deepseek-coder:33b | 33B | ~20GB | ~76% |
| deepseek-coder-v2:236b (MoE) | 236B (21B active) | API only | ~86% |
| GPT-4o (compare) | — | API only | ~90% |
Hosted API: $0.14 per 1M input tokens, among the cheapest production-quality coder models.
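For budgeting, a back-of-envelope cost sketch. The input price comes from the line above; the output price used here is an assumption, so check current pricing:

```python
# Input price from above; the $0.28/1M output price is an ASSUMPTION.
IN_PER_M, OUT_PER_M = 0.14, 0.28

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * IN_PER_M + output_tokens / 1e6 * OUT_PER_M

# e.g. one agent session: 200k prompt tokens, 20k completion tokens
print(f"${cost_usd(200_000, 20_000):.4f}")  # $0.0336
```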
FAQ
Q: Coder vs full DeepSeek-V3 for coding? A: Coder is smaller, faster, cheaper, FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, better at long-context reasoning across files. For tab autocomplete: Coder. For 'understand my whole repo and refactor': V3.
Q: Can I fine-tune DeepSeek Coder? A: Yes. Open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works, and QLoRA fine-tunes of the 6.7B variant are practical on a single 24GB GPU (sketch below).
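A QLoRA sketch with transformers + peft + trl. The model id is the real Hugging Face repo, but the dataset path, LoRA ranks, and batch settings are illustrative, and trl's API shifts between versions:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# Load the 6.7B base model in 4-bit (QLoRA); fits a single 24GB card.
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-6.7b-base",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16),
    device_map="auto",
)

trainer = SFTTrainer(
    model=model,
    # JSONL with a "text" field per example (path is illustrative)
    train_dataset=load_dataset("json", data_files="my_code_sft.jsonl")["train"],
    peft_config=LoraConfig(r=16, lora_alpha=32,
                           target_modules=["q_proj", "k_proj", "v_proj", "o_proj"]),
    args=SFTConfig(output_dir="deepseek-coder-lora",
                   per_device_train_batch_size=1,
                   gradient_accumulation_steps=8),
)
trainer.train()
```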
Q: Is the V2 MoE coder available locally? A: The V2 236B MoE has open weights, but at that size it's impractical to run on a single machine. Use it via the DeepSeek API or rent GPU time on Together / Fireworks. The 33B dense model is the local-friendly sweet spot.
Source & Thanks
Built by DeepSeek. The code is MIT-licensed; the model weights ship under the DeepSeek Model License, which permits commercial use.
deepseek-ai/DeepSeek-Coder — ⭐ 23,000+