Esta página se muestra en inglés. Una traducción al español está en curso.
KnowledgeMay 8, 2026·4 min de lectura

DeepSeek Coder — Code-Specialized Model for Local Inference

DeepSeek Coder is the code-specialized open-weight model with FIM (fill-in-middle) support. Beats Codestral on HumanEval. Drops into Continue, Aider.

Listo para agents

Staging seguro para este activo

Este activo primero queda en staging. El prompt copiado pide inspeccionar los archivos staged antes de activar scripts, config MCP o config global.

Stage only · 27/100Política: staging
Superficie agent
Cualquier agent MCP/CLI
Tipo
Knowledge
Instalación
Stage only
Confianza
Confianza: Community
Entrada
Asset
Comando de staging seguro
npx -y tokrepo@latest install 08acf3a7-b56b-40d2-9c94-9a8eb773eca4 --target codex

Primero deja archivos en staging; la activación requiere revisar el README y el plan staged.

Introducción

DeepSeek Coder is the code-specialized open-weight model — trained on 2T tokens of code across 100+ languages, with native fill-in-middle (FIM) support for tab autocomplete. Outperforms Codestral and matches GPT-4o on HumanEval and MBPP at a fraction of the cost. Best for: local tab autocomplete via Continue / Cursor's local mode, and code-heavy production agents that need cheap inference. Works with: Ollama, vLLM, llama.cpp, DeepSeek API, Continue, Aider. Setup time: 2 minutes.


Local with Ollama

ollama pull deepseek-coder:6.7b   # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b    # ~20GB, M3 Pro / 4090 territory

# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the Nth Fibonacci with memoization.

Use as tab autocomplete in Continue

// Continue's config.json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}

Use with Aider

# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder

# Local (BYOK Ollama)
aider --model ollama/deepseek-coder:33b

Fill-in-middle (FIM) format

DeepSeek Coder's tab-completion uses a specific FIM format:

<|fim_begin|>{prefix}<|fim_hole|>{suffix}<|fim_end|>

Continue / Aider / Cursor handle this automatically. If you're integrating manually, use the FIM tokens — completions are 10-30% better than naive prompting.

Pricing & versions

Variant Params RAM (4-bit) HumanEval Pass@1
deepseek-coder:1.3b 1.3B ~1GB ~38%
deepseek-coder:6.7b 6.7B ~4GB ~58%
deepseek-coder:33b 33B ~20GB ~76%
deepseek-coder-v2:236b (MoE) 236B (21B active) API only ~86%
GPT-4o (compare) API only ~90%

Hosted API: $0.14 / 1M input tokens — cheapest production-quality coder model.


FAQ

Q: Coder vs full DeepSeek-V3 for coding? A: Coder is smaller, faster, cheaper, FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, better at long-context reasoning across files. For tab autocomplete: Coder. For 'understand my whole repo and refactor': V3.

Q: Can I fine-tune DeepSeek Coder? A: Yes — open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works. The 6.7B variant LoRAs are practical on a single 24GB GPU.

Q: Is the V2 MoE coder available locally? A: The V2 236B MoE has open weights but the size makes it impractical for single-machine local. Use it via DeepSeek API or rent GPU time on Together / Fireworks. The 33B dense version is the local-friendly sweet spot.


Quick Use

  1. Local: ollama pull deepseek-coder:6.7b
  2. Configure Continue / Aider / Cursor to use the local model
  3. Or use hosted API with model="deepseek-coder"

Intro

DeepSeek Coder is the code-specialized open-weight model — trained on 2T tokens of code across 100+ languages, with native fill-in-middle (FIM) support for tab autocomplete. Outperforms Codestral and matches GPT-4o on HumanEval and MBPP at a fraction of the cost. Best for: local tab autocomplete via Continue / Cursor's local mode, and code-heavy production agents that need cheap inference. Works with: Ollama, vLLM, llama.cpp, DeepSeek API, Continue, Aider. Setup time: 2 minutes.


Local with Ollama

ollama pull deepseek-coder:6.7b   # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b    # ~20GB, M3 Pro / 4090 territory

# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the Nth Fibonacci with memoization.

Use as tab autocomplete in Continue

// Continue's config.json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}

Use with Aider

# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder

# Local (BYOK Ollama)
aider --model ollama/deepseek-coder:33b

Fill-in-middle (FIM) format

DeepSeek Coder's tab-completion uses a specific FIM format:

<|fim_begin|>{prefix}<|fim_hole|>{suffix}<|fim_end|>

Continue / Aider / Cursor handle this automatically. If you're integrating manually, use the FIM tokens — completions are 10-30% better than naive prompting.

Pricing & versions

Variant Params RAM (4-bit) HumanEval Pass@1
deepseek-coder:1.3b 1.3B ~1GB ~38%
deepseek-coder:6.7b 6.7B ~4GB ~58%
deepseek-coder:33b 33B ~20GB ~76%
deepseek-coder-v2:236b (MoE) 236B (21B active) API only ~86%
GPT-4o (compare) API only ~90%

Hosted API: $0.14 / 1M input tokens — cheapest production-quality coder model.


FAQ

Q: Coder vs full DeepSeek-V3 for coding? A: Coder is smaller, faster, cheaper, FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, better at long-context reasoning across files. For tab autocomplete: Coder. For 'understand my whole repo and refactor': V3.

Q: Can I fine-tune DeepSeek Coder? A: Yes — open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works. The 6.7B variant LoRAs are practical on a single 24GB GPU.

Q: Is the V2 MoE coder available locally? A: The V2 236B MoE has open weights but the size makes it impractical for single-machine local. Use it via DeepSeek API or rent GPU time on Together / Fireworks. The 33B dense version is the local-friendly sweet spot.


Source & Thanks

Built by DeepSeek. Weights MIT-licensed.

deepseek-ai/DeepSeek-Coder — ⭐ 23,000+

🙏

Fuente y agradecimientos

Built by DeepSeek. Weights MIT-licensed.

deepseek-ai/DeepSeek-Coder — ⭐ 23,000+

Discusión

Inicia sesión para unirte a la discusión.
Aún no hay comentarios. Sé el primero en compartir tus ideas.

Activos relacionados