How do I install DeepSeek Coder — Code-Specialized Model for Local Inference?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

DeepSeek Coder — Code-Specialized Model for Local Inference

ollama pull deepseek-coder:6.7b # ~4GB, fits on most laptops ollama pull deepseek-coder:33b # ~20GB, M3 Pro / 4090 territory # Quick test ollama run deepseek-coder:6.7b > Write a Rust function that returns the Nth Fibonacci with memoization.

// Continue's config.json { "tabAutocompleteModel": { "title": "DeepSeek Coder", "provider": "ollama", "model": "deepseek-coder:6.7b", "apiBase": "http://localhost:11434" }, "models": [ { "title": "DeepSeek Coder Chat", "provider": "ollama", "model": "deepseek-coder:33b" } ] }

Variant

Params

RAM (4-bit)

HumanEval Pass@1

deepseek-coder:1.3b

1.3B

~1GB

~38%

deepseek-coder:6.7b

6.7B

~4GB

~58%

deepseek-coder:33b

33B

~20GB

~76%

deepseek-coder-v2:236b (MoE)

236B (21B active)

API only

~86%

GPT-4o (compare)

—

API only

~90%

Quick Use

Local: ollama pull deepseek-coder:6.7b
Configure Continue / Aider / Cursor to use the local model
Or use hosted API with model="deepseek-coder"

Intro

DeepSeek Coder is the code-specialized open-weight model — trained on 2T tokens of code across 100+ languages, with native fill-in-middle (FIM) support for tab autocomplete. Outperforms Codestral and matches GPT-4o on HumanEval and MBPP at a fraction of the cost. Best for: local tab autocomplete via Continue / Cursor's local mode, and code-heavy production agents that need cheap inference. Works with: Ollama, vLLM, llama.cpp, DeepSeek API, Continue, Aider. Setup time: 2 minutes.

Local with Ollama

ollama pull deepseek-coder:6.7b   # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b    # ~20GB, M3 Pro / 4090 territory

# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the Nth Fibonacci with memoization.

Use as tab autocomplete in Continue

// Continue's config.json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}

Use with Aider

# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder

# Local (BYOK Ollama)
aider --model ollama/deepseek-coder:33b

Fill-in-middle (FIM) format

DeepSeek Coder's tab-completion uses a specific FIM format:

<|fim_begin|>{prefix}<|fim_hole|>{suffix}<|fim_end|>

Continue / Aider / Cursor handle this automatically. If you're integrating manually, use the FIM tokens — completions are 10-30% better than naive prompting.

Pricing & versions

Variant	Params	RAM (4-bit)	HumanEval Pass@1
deepseek-coder:1.3b	1.3B	~1GB	~38%
deepseek-coder:6.7b	6.7B	~4GB	~58%
deepseek-coder:33b	33B	~20GB	~76%
deepseek-coder-v2:236b (MoE)	236B (21B active)	API only	~86%
GPT-4o (compare)	—	API only	~90%

Hosted API: $0.14 / 1M input tokens — cheapest production-quality coder model.

FAQ

Q: Coder vs full DeepSeek-V3 for coding? A: Coder is smaller, faster, cheaper, FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, better at long-context reasoning across files. For tab autocomplete: Coder. For 'understand my whole repo and refactor': V3.

Q: Can I fine-tune DeepSeek Coder? A: Yes — open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works. The 6.7B variant LoRAs are practical on a single 24GB GPU.

Q: Is the V2 MoE coder available locally? A: The V2 236B MoE has open weights but the size makes it impractical for single-machine local. Use it via DeepSeek API or rent GPU time on Together / Fireworks. The 33B dense version is the local-friendly sweet spot.

Source & Thanks

Built by DeepSeek. Weights MIT-licensed.

deepseek-ai/DeepSeek-Coder — ⭐ 23,000+

DeepSeek Coder — Code-Specialized Model for Local Inference

Staging seguro para este activo

Local with Ollama

Use as tab autocomplete in Continue

Use with Aider

Fill-in-middle (FIM) format

Pricing & versions

FAQ

Quick Use

Intro

Local with Ollama

Use as tab autocomplete in Continue

Use with Aider

Fill-in-middle (FIM) format

Pricing & versions

FAQ

Source & Thanks

Fuente y agradecimientos

Discusión

Activos relacionados

DeepSeek-R1 — Open-Weight Reasoning Model Rivaling OpenAI o1

DeepSeek-V3 — Open-Weight 671B MoE Model with GPT-4o Quality

Fireworks Inference — 100+ Open Models on OpenAI-Compat API

Memorix — Cross-Agent Memory Control Plane