Knowledge · May 8, 2026 · 4 min read

DeepSeek Coder — Code-Specialized Model for Local Inference

DeepSeek Coder is a code-specialized open-weight model with fill-in-the-middle (FIM) support. Beats Codestral on HumanEval. Drops into Continue and Aider.

Agent ready

This asset can be read and installed directly by agents. TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Agent surface: Any MCP/CLI agent
Kind: Knowledge
Install: Stage only (15/100)
Trust: New
Entrypoint: Asset

Universal CLI install command:

npx tokrepo install 08acf3a7-b56b-40d2-9c94-9a8eb773eca4
Intro

DeepSeek Coder is a code-specialized open-weight model — trained on 2T tokens of code across 100+ languages, with native fill-in-the-middle (FIM) support for tab autocomplete. It outperforms Codestral and approaches GPT-4o on HumanEval and MBPP at a fraction of the cost. Best for: local tab autocomplete via Continue / Cursor's local mode, and code-heavy production agents that need cheap inference. Works with: Ollama, vLLM, llama.cpp, DeepSeek API, Continue, Aider. Setup time: 2 minutes.


Local with Ollama

ollama pull deepseek-coder:6.7b   # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b    # ~20GB, M3 Pro / 4090 territory

# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the Nth Fibonacci with memoization.
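Beyond the interactive REPL, a pulled model is also reachable over Ollama's local REST API, which is handy for scripting. A minimal sketch with only the standard library (assumes the Ollama daemon is running on its default port, 11434):

```python
# Minimal sketch: query a pulled model through Ollama's local REST API.
# Assumes the Ollama daemon is running on its default port (11434).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str, model: str = "deepseek-coder:6.7b") -> bytes:
    """Encode a single non-streaming generation request for Ollama."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def generate(prompt: str, model: str = "deepseek-coder:6.7b") -> str:
    """POST the request and return the model's completion text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (with Ollama running):
#   print(generate("Write a Rust function that returns the Nth Fibonacci."))
```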

Use as tab autocomplete in Continue

// Continue's config.json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}

Use with Aider

# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder

# Local (BYOK Ollama)
aider --model ollama/deepseek-coder:33b

Fill-in-middle (FIM) format

DeepSeek Coder's tab-completion uses a specific FIM format:

<|fim_begin|>{prefix}<|fim_hole|>{suffix}<|fim_end|>

Continue / Aider / Cursor handle this automatically. If you're integrating manually, use the FIM tokens — completions are 10-30% better than naive prompting.
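If you are wiring FIM yourself (e.g. against a raw vLLM or llama.cpp endpoint), building the prompt is plain string assembly. A sketch using the token spelling shown above — verify the exact token strings against your tokenizer's config, since spellings vary between releases:

```python
# Assemble a FIM prompt from the code before and after the cursor.
# Token spellings follow the format shown above; verify against your
# model's tokenizer config before relying on them.
FIM_BEGIN, FIM_HOLE, FIM_END = "<|fim_begin|>", "<|fim_hole|>", "<|fim_end|>"

def fim_prompt(prefix: str, suffix: str) -> str:
    """The model generates the text that belongs at the hole."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

prompt = fim_prompt(
    prefix="def fib(n, memo={}):\n    ",
    suffix="\n    return memo[n]",
)
```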

Pricing & versions

Variant                       Params             RAM (4-bit)   HumanEval Pass@1
deepseek-coder:1.3b           1.3B               ~1 GB         ~38%
deepseek-coder:6.7b           6.7B               ~4 GB         ~58%
deepseek-coder:33b            33B                ~20 GB        ~76%
deepseek-coder-v2:236b (MoE)  236B (21B active)  API only      ~86%
GPT-4o (for comparison)       n/a                API only      ~90%

Hosted API: $0.14 / 1M input tokens — cheapest production-quality coder model.
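At that rate the budgeting arithmetic is trivial. A back-of-envelope sketch using the figure quoted above (check current pricing before relying on it):

```python
# Back-of-envelope input cost at $0.14 per 1M input tokens (rate above).
PRICE_PER_M_INPUT = 0.14

def input_cost_usd(input_tokens: int) -> float:
    """Cost in USD for a given number of input tokens."""
    return input_tokens / 1_000_000 * PRICE_PER_M_INPUT

# e.g. a heavy day of autocomplete, ~5M input tokens:
daily = input_cost_usd(5_000_000)   # ≈ $0.70
```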


FAQ

Q: Coder vs full DeepSeek-V3 for coding? A: Coder is smaller, faster, cheaper, FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, better at long-context reasoning across files. For tab autocomplete: Coder. For 'understand my whole repo and refactor': V3.

Q: Can I fine-tune DeepSeek Coder? A: Yes — open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works. The 6.7B variant LoRAs are practical on a single 24GB GPU.

Q: Is the V2 MoE coder available locally? A: The V2 236B MoE has open weights but the size makes it impractical for single-machine local. Use it via DeepSeek API or rent GPU time on Together / Fireworks. The 33B dense version is the local-friendly sweet spot.


Quick Use

  1. Local: ollama pull deepseek-coder:6.7b
  2. Configure Continue / Aider / Cursor to use the local model
  3. Or use hosted API with model="deepseek-coder"
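Step 3 above can be sketched with nothing but the standard library, since the hosted API speaks the OpenAI chat-completions dialect. Endpoint path and model name are assumed from the steps above — confirm them against DeepSeek's API documentation, and set `DEEPSEEK_API_KEY` before sending:

```python
# Sketch of a hosted chat-completions call (OpenAI-compatible dialect).
# Endpoint and model name assumed from the steps above; verify against
# DeepSeek's API documentation before relying on them.
import json
import os
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"

def build_request(prompt: str, model: str = "deepseek-coder") -> urllib.request.Request:
    """Build an authenticated OpenAI-style chat request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,   # keep code output as deterministic as possible
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('DEEPSEEK_API_KEY', '')}",
        },
    )

# To send: urllib.request.urlopen(build_request("Write a binary search."))
```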



Source & Thanks

Built by DeepSeek. Weights MIT-licensed.

deepseek-ai/DeepSeek-Coder — ⭐ 23,000+

🙏

