# DeepSeek Coder — Code-Specialized Model for Local Inference

> DeepSeek Coder is a code-specialized open-weight model with fill-in-middle (FIM) support. Beats Codestral on HumanEval and drops into Continue and Aider.

## Quick Use

1. Local: `ollama pull deepseek-coder:6.7b`
2. Configure Continue / Aider / Cursor to use the local model
3. Or use the hosted API with `model="deepseek-coder"`

---

## Intro

DeepSeek Coder is a code-specialized open-weight model, trained on 2T tokens of code across 100+ languages, with native fill-in-middle (FIM) support for tab autocomplete. It outperforms Codestral and approaches GPT-4o on HumanEval and MBPP at a fraction of the cost.

Best for: local tab autocomplete via Continue / Cursor's local mode, and code-heavy production agents that need cheap inference.

Works with: Ollama, vLLM, llama.cpp, DeepSeek API, Continue, Aider.

Setup time: 2 minutes.

---

### Local with Ollama

```bash
ollama pull deepseek-coder:6.7b   # ~4GB, fits on most laptops
ollama pull deepseek-coder:33b    # ~20GB, M3 Pro / 4090 territory

# Quick test
ollama run deepseek-coder:6.7b
> Write a Rust function that returns the Nth Fibonacci with memoization.
```

### Use as tab autocomplete in Continue

```jsonc
// Continue's config.json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  },
  "models": [
    {
      "title": "DeepSeek Coder Chat",
      "provider": "ollama",
      "model": "deepseek-coder:33b"
    }
  ]
}
```

### Use with Aider

```bash
# Hosted
export DEEPSEEK_API_KEY=sk-...
aider --model deepseek/deepseek-coder

# Local (BYOK Ollama)
aider --model ollama/deepseek-coder:33b
```

### Fill-in-middle (FIM) format

DeepSeek Coder's tab completion uses a specific FIM format (note the fullwidth bars `｜` and `▁` in the special tokens, not ASCII `|` and `_`):

```
<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>
```

Continue / Aider / Cursor handle this automatically.
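The format above can also be assembled by hand — a minimal sketch in Python, assuming a local Ollama server on the default port; `build_fim_prompt` and `complete` are hypothetical helper names, and the special tokens are the ones documented in the DeepSeek-Coder README:

```python
import json
import urllib.request

# DeepSeek Coder's FIM special tokens (fullwidth bars and "▁",
# not ASCII "|" and "_").
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before/after the cursor in FIM tokens."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


def complete(prefix: str, suffix: str, model: str = "deepseek-coder:6.7b") -> str:
    """Ask a local Ollama server to fill the hole.

    "raw": True bypasses the chat template so the FIM tokens
    reach the model verbatim.
    """
    body = json.dumps({
        "model": model,
        "prompt": build_fim_prompt(prefix, suffix),
        "raw": True,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `complete("def fib(n):\n    ", "\n    return memo[n]")` would ask the model to fill in the memoization body between the two fragments. Without `"raw": True`, Ollama would wrap the prompt in a chat template and the FIM tokens would not work as intended.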
If you're integrating manually, use the FIM tokens — completions are 10–30% better than naive prompting.

### Pricing & versions

| Variant | Params | RAM (4-bit) | HumanEval Pass@1 |
|---|---|---|---|
| deepseek-coder:1.3b | 1.3B | ~1GB | ~38% |
| deepseek-coder:6.7b | 6.7B | ~4GB | ~58% |
| deepseek-coder:33b | 33B | ~20GB | ~76% |
| deepseek-coder-v2:236b (MoE) | 236B (21B active) | API only | ~86% |
| GPT-4o (compare) | — | API only | ~90% |

Hosted API: $0.14 / 1M input tokens — the cheapest production-quality coder model.

---

### FAQ

**Q: Coder vs full DeepSeek-V3 for coding?**
A: Coder is smaller, faster, cheaper, and FIM-aware — best for local autocomplete and quick code questions. V3 is bigger, broader, and better at long-context reasoning across files. For tab autocomplete: Coder. For "understand my whole repo and refactor it": V3.

**Q: Can I fine-tune DeepSeek Coder?**
A: Yes — open weights mean any standard LoRA / QLoRA tooling (axolotl, unsloth, trl) works. LoRAs of the 6.7B variant are practical on a single 24GB GPU.

**Q: Is the V2 MoE coder available locally?**
A: The V2 236B MoE has open weights, but its size makes single-machine local inference impractical. Use it via the DeepSeek API, or rent GPU time on Together / Fireworks. The 33B dense version is the local-friendly sweet spot.

---

## Source & Thanks

> Built by [DeepSeek](https://github.com/deepseek-ai). Code is MIT-licensed; the weights are under the DeepSeek model license.
>
> [deepseek-ai/DeepSeek-Coder](https://github.com/deepseek-ai/DeepSeek-Coder) — ⭐ 23,000+
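A back-of-the-envelope check on the RAM (4-bit) column in the pricing table: 4-bit weights cost roughly half a byte per parameter, plus runtime overhead for KV cache and buffers. A quick sketch — the 1.2× overhead factor is an assumption for illustration, not a measured number:

```python
def estimate_4bit_gb(params_billions: float, overhead: float = 1.2) -> float:
    """Rough memory footprint of a 4-bit-quantized model in GiB.

    4-bit weights ≈ 0.5 bytes per parameter; `overhead` is an assumed
    multiplier covering KV cache, activations, and runtime buffers.
    """
    weight_bytes = params_billions * 1e9 * 0.5
    return weight_bytes / 2**30 * overhead


for name, params in [("1.3b", 1.3), ("6.7b", 6.7), ("33b", 33.0)]:
    print(f"deepseek-coder:{name} ≈ {estimate_4bit_gb(params):.1f} GiB")
```

This lands near the ~1GB / ~4GB / ~20GB figures above, which is why 6.7B fits comfortably on most laptops while 33B wants a 4090-class GPU or a high-RAM Mac.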
---

Source: https://tokrepo.com/en/workflows/deepseek-coder-code-specialized-model-for-local-inference
Author: DeepSeek