Knowledge · May 8, 2026 · 4 min read

DeepSeek Coder — Code-Specialized Model for Local Inference

DeepSeek Coder is a code-specialized open-weight model with fill-in-middle (FIM) support. Beats Codestral on HumanEval. Drops straight into Continue and Aider.

DeepSeek · Community
Agent-ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, an install contract, the JSON metadata, a per-adapter plan, and the raw content to help agents judge fit, risk, and next actions.

Stage only · 15/100
Agent surface
Any MCP/CLI agent
Type
Knowledge
Installation
Stage only
Trust
New
Entry point
Asset
Universal CLI command
npx tokrepo install 08acf3a7-b56b-40d2-9c94-9a8eb773eca4
Introduction

DeepSeek Coder is a code-specialized open-weight model, trained on 2T tokens of code across 100+ languages, with native fill-in-middle (FIM) support for tab autocomplete. It outperforms Codestral and approaches GPT-4o on HumanEval and MBPP at a fraction of the cost. Best for: local tab autocomplete via Continue or Cursor's local mode, and code-heavy production agents that need cheap inference. Works with: Ollama, vLLM, llama.cpp, the DeepSeek API, Continue, Aider. Setup time: 2 minutes.


Local with Ollama

ollama pull deepseek-coder:6.7b   # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b    # ~20GB, M3 Pro / 4090 territory

# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the Nth Fibonacci with memoization.
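
Ollama also exposes a local HTTP API (port 11434 by default), so you can hit the same model programmatically instead of through the REPL. A minimal sketch; the prompt is just an illustration:

# One-shot completion via Ollama's local HTTP API
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder:6.7b",
  "prompt": "Write a Rust function that returns the nth Fibonacci number with memoization.",
  "stream": false
}'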

Use as tab autocomplete in Continue

// Continue's config.json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}
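
Continue typically reads this from ~/.continue/config.json; reload the extension after editing. The split above is deliberate: the 6.7B model keeps tab-autocomplete latency low, while the 33B handles chat, where a slower but stronger model is worth it.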

Use with Aider

# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder

# Local (BYOK Ollama)
aider --model ollama/deepseek-coder:33b
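
If your Ollama server isn't on localhost, Aider's LiteLLM backend reads the OLLAMA_API_BASE environment variable. A sketch with a hypothetical remote host:

# Point Aider at a remote Ollama endpoint (gpu-box is a placeholder hostname)
export OLLAMA_API_BASE=http://gpu-box:11434
aider --model ollama/deepseek-coder:33b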

Fill-in-middle (FIM) format

DeepSeek Coder's tab-completion uses a specific FIM format:

<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>

Continue, Aider, and Cursor handle this automatically. If you're integrating manually, use the exact FIM tokens (note the fullwidth ｜ bars and ▁ separators, not ASCII pipes and underscores); completions are 10-30% better than with naive prompting.
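
To sanity-check a manual integration, you can send a raw FIM prompt through Ollama with "raw": true, which bypasses the chat template so the tokens reach the model verbatim. A sketch; the Python snippet being completed is arbitrary:

# Raw FIM request: the model fills the hole between prefix and suffix
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder:6.7b",
  "prompt": "<｜fim▁begin｜>def fib(n, memo={}):\n    if n in memo:\n<｜fim▁hole｜>\n    return memo[n]<｜fim▁end｜>",
  "raw": true,
  "stream": false
}'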

Pricing & versions

Variant                        Params              RAM (4-bit)   HumanEval Pass@1
deepseek-coder:1.3b            1.3B                ~1 GB         ~38%
deepseek-coder:6.7b            6.7B                ~4 GB         ~58%
deepseek-coder:33b             33B                 ~20 GB        ~76%
deepseek-coder-v2:236b (MoE)   236B (21B active)   API only      ~86%
GPT-4o (for comparison)        n/a                 API only      ~90%

Hosted API: $0.14 / 1M input tokens — cheapest production-quality coder model.
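
For scale: at $0.14 per million input tokens, an agent that pushes 10M input tokens a day costs roughly $1.40/day on the input side. Output tokens are billed separately (typically at a higher per-token rate), so budget for both.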


FAQ

Q: Coder vs full DeepSeek-V3 for coding? A: Coder is smaller, faster, cheaper, FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, better at long-context reasoning across files. For tab autocomplete: Coder. For 'understand my whole repo and refactor': V3.

Q: Can I fine-tune DeepSeek Coder? A: Yes: open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works. LoRA fine-tunes of the 6.7B variant are practical on a single 24 GB GPU.

Q: Is the V2 MoE coder available locally? A: The V2 236B MoE has open weights but the size makes it impractical for single-machine local. Use it via DeepSeek API or rent GPU time on Together / Fireworks. The 33B dense version is the local-friendly sweet spot.


Quick Use

  1. Local: ollama pull deepseek-coder:6.7b
  2. Configure Continue / Aider / Cursor to use the local model
  3. Or use the hosted API with model="deepseek-coder" (see the sketch below)
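
If you take the hosted route, the API is OpenAI-compatible. A minimal sketch, assuming the endpoint and model id below are still current (check DeepSeek's docs, since model ids get renamed across versions):

# OpenAI-compatible chat completion against the hosted DeepSeek API
curl -s https://api.deepseek.com/chat/completions \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-coder",
    "messages": [{"role": "user", "content": "Write a binary search function in Go."}]
  }'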


Source & Thanks

Built by DeepSeek. The code is MIT-licensed; the model weights ship under DeepSeek's model license, which permits commercial use.

deepseek-ai/DeepSeek-Coder — ⭐ 23,000+

