Knowledge · May 8, 2026 · 5 min read

DeepSeek-R1 — Open-Weight Reasoning Model Rivaling OpenAI o1

DeepSeek-R1 is an open-weight reasoning model that performs on par with OpenAI o1 on math, code, and science benchmarks. Its chain-of-thought is returned alongside the answer. MIT-licensed.

Agent ready

Safe staging for this asset

This asset is staged first. The copied prompt tells the agent to inspect the staged files and ask before activating scripts, MCP config, or global config.

Stage only · 27/100 · Policy: stage
Agent surface: Any MCP/CLI agent
Kind: Knowledge
Install: Stage only
Trust: Community
Entrypoint: Asset
Safe staging command
npx -y tokrepo@latest install c8ffbe43-1354-4034-8c86-6b0ab3076998 --target codex

Stages files first; activation requires review of the staged README and plan.

Intro

DeepSeek-R1 is an open-weight reasoning model that reaches o1-level performance on AIME, MATH, GPQA, and Codeforces while returning its full chain-of-thought to the user. Distilled smaller versions (1.5B, 7B, 8B, 14B, 32B, 70B) make local reasoning practical on consumer hardware. It is released under the MIT license with full weights public.

Best for: hard reasoning tasks (math, science, complex code) where you need a reasoning model but want open weights.
Works with: DeepSeek API, Ollama (distilled), vLLM, llama.cpp.
Setup time: 2 minutes.


Hosted API

import os

from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint, so the OpenAI SDK works as-is.
client = OpenAI(
    base_url="https://api.deepseek.com/v1",
    api_key=os.environ["DEEPSEEK_API_KEY"],
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # R1
    messages=[{"role": "user", "content":
        "Prove that the square root of 2 is irrational"}],
)

# R1 returns its reasoning in a separate field alongside the final answer
for choice in response.choices:
    print("REASONING:", choice.message.reasoning_content)
    print("ANSWER:", choice.message.content)

Unlike o1, R1's reasoning is visible — useful for debugging, education, and trust.
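When the request is made with stream=True, the DeepSeek API documentation describes reasoning tokens arriving in delta.reasoning_content and answer tokens in delta.content. The following is a minimal sketch of collecting the two separately. The helper name collect_stream and the fake chunks are hypothetical and exist only so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate reasoning and answer text from streamed chat chunks.

    Assumes each chunk looks like an OpenAI-style streaming chunk whose
    delta may carry `reasoning_content` and/or `content` (either may be
    missing or None).
    """
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        reasoning = getattr(delta, "reasoning_content", None)
        content = getattr(delta, "content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)


def _fake_chunk(reasoning=None, content=None):
    # Offline stand-in for a streamed chunk (illustration only).
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [
    _fake_chunk(reasoning="Assume sqrt(2) = p/q in lowest terms. "),
    _fake_chunk(reasoning="Then p^2 = 2q^2, so p is even."),
    _fake_chunk(content="sqrt(2) is irrational."),
]
reasoning_text, answer_text = collect_stream(demo)
print("REASONING:", reasoning_text)
print("ANSWER:", answer_text)
```

With a real client, the iterable returned by client.chat.completions.create(..., stream=True) would be passed in place of demo.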

Local via Ollama (distilled)

ollama pull deepseek-r1:1.5b   # ~1GB, runs on a laptop
ollama pull deepseek-r1:7b     # ~5GB
ollama pull deepseek-r1:14b    # ~9GB
ollama pull deepseek-r1:32b    # ~20GB, M2 Max territory
ollama pull deepseek-r1:70b    # ~40GB, beefy server

In DeepSeek's published benchmarks, the 7B distill scores higher than GPT-4o on competition-math benchmarks such as AIME 2024 and MATH-500, while being free to run and fast on a single RTX 4090.
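The distilled models typically emit their reasoning inline, wrapped in <think>...</think> tags ahead of the final answer, rather than in a separate field. A minimal sketch of separating the two, assuming that tag format; split_think is a hypothetical helper, not part of Ollama.

```python
import re

# Assumption: the reasoning is wrapped in one <think>...</think> block.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text):
    """Return (reasoning, answer) extracted from raw model output."""
    match = _THINK_RE.search(text)
    if match is None:
        # No reasoning block found: treat everything as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


raw = "<think>\n2 + 2 is basic addition.\n</think>\n\nThe answer is 4."
reasoning, answer = split_think(raw)
print("REASONING:", reasoning)
print("ANSWER:", answer)
```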

When to use R1 vs V3

| Task | Pick |
|---|---|
| Math proofs, competition problems | R1 |
| Step-by-step debugging | R1 |
| Quick chitchat, summaries | V3 (cheaper, faster) |
| Tool-use heavy agent | V3 (R1's tool support is weaker) |
| Need visible CoT for audit | R1 |
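The table above can be turned into a simple routing rule. This is a sketch only: the task labels are hypothetical, and "deepseek-chat" is assumed to be the API model name for V3 (the source only names "deepseek-reasoner").

```python
R1_MODEL = "deepseek-reasoner"  # from the Hosted API example
V3_MODEL = "deepseek-chat"      # assumption: DeepSeek's API name for V3

# Hypothetical task labels mapped to the picks in the table.
_PICKS = {
    "math_proof": R1_MODEL,
    "debugging": R1_MODEL,
    "audit_cot": R1_MODEL,
    "chitchat": V3_MODEL,
    "summary": V3_MODEL,
    "tool_agent": V3_MODEL,
}


def pick_model(task, default=V3_MODEL):
    """Return the model for a task label, falling back to the cheaper V3."""
    return _PICKS.get(task, default)


for task in ("math_proof", "tool_agent", "unknown"):
    print(task, "->", pick_model(task))
```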

Pricing

| Source | Input ($ per 1M tokens) | Output ($ per 1M tokens) |
|---|---|---|
| DeepSeek API | $0.55 | $2.19 |
| OpenAI o1 (compare) | $15.00 | $60.00 |
| OpenAI o1-mini (compare) | $3.00 | $12.00 |
| Local distilled | $0 | $0 |
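To make the per-token prices concrete, here is a rough per-request cost estimate using the figures from the table. It assumes reasoning tokens are billed as output tokens; the workload numbers are hypothetical.

```python
# Prices in USD per 1M tokens, copied from the table: (input, output).
PRICES = {
    "deepseek-r1": (0.55, 2.19),
    "openai-o1": (15.00, 60.00),
    "openai-o1-mini": (3.00, 12.00),
}


def request_cost(source, input_tokens, output_tokens):
    """Cost in USD of one request; output_tokens includes reasoning tokens."""
    input_price, output_price = PRICES[source]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


# Hypothetical workload: 2,000 prompt tokens, 8,000 output tokens.
for source in PRICES:
    print(f"{source}: ${request_cost(source, 2_000, 8_000):.4f}")
```

For this workload the table's prices work out to roughly $0.019 on the DeepSeek API, $0.10 on o1-mini, and $0.51 on o1.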

FAQ

Q: Why does R1 show its reasoning when o1 hides it? A: DeepSeek published the full RL training methodology. Visible CoT is part of the value proposition — auditability, debugging, education. OpenAI considers o1's CoT a competitive moat.

Q: How much slower is R1 vs V3? A: R1 spends extra tokens on reasoning before the final answer — typically 3-10× more output tokens, so 3-10× slower wall-clock latency on equal infra. The cost difference reflects this.

Q: Are the distilled R1 versions trained from scratch? A: No. They are existing Qwen and Llama models fine-tuned on reasoning samples generated by full R1. The 1.5B and 7B distills are built on Qwen2.5-Math, the 14B and 32B on Qwen2.5, the 8B on Llama-3.1-8B, and the 70B on Llama-3.3-70B-Instruct. Performance depends on the size and quality of the base model.


Quick Use

  1. Hosted: same DeepSeek API key, set model="deepseek-reasoner"
  2. Local: ollama pull deepseek-r1:7b && ollama run deepseek-r1:7b
  3. Print response.choices[0].message.reasoning_content to see the full chain-of-thought

Source & Thanks

Built by DeepSeek. Weights MIT-licensed.

deepseek-ai/DeepSeek-R1 — ⭐ 90,000+

