Configs2026年7月1日·1 分钟阅读

DeepReasoning — Chain-of-Thought Inference API Merging DeepSeek R1 and Claude

A high-performance inference API built in Rust that combines DeepSeek R1 chain-of-thought reasoning traces with Claude model responses in a unified chat interface.

Agent 就绪

这个资产会安全暂存

这个资产会先安全暂存。复制的指令会要求 Agent 读取暂存文件,并在激活脚本、MCP 配置或全局配置前先确认。

Stage only · 29/100策略:需暂存
Agent 入口
任意 MCP/CLI Agent
类型
Skill
安装
Stage only
信任
信任等级:Established
入口
DeepReasoning
安全暂存命令
npx -y tokrepo@latest install 699e9b03-758b-11f1-9bc6-00163e2b0d79 --target codex

先暂存文件;激活前需要读取暂存 README 和安装计划。

Introduction

DeepReasoning is a Rust-based inference proxy that combines DeepSeek R1 chain-of-thought reasoning with Anthropic Claude model outputs. It exposes an OpenAI-compatible API and a chat UI, enabling applications to benefit from explicit reasoning traces while leveraging Claude for final responses.

What DeepReasoning Does

  • Proxies inference requests through DeepSeek R1 for chain-of-thought reasoning and Claude for final output
  • Exposes an OpenAI-compatible REST API for drop-in integration
  • Provides a web-based chat UI that displays reasoning traces alongside responses
  • Streams both reasoning steps and final answers in real-time
  • Supports configurable model selection for each stage of the pipeline

Architecture Overview

DeepReasoning runs a Rust HTTP server built on Axum. Incoming requests are first sent to DeepSeek R1 to generate a chain-of-thought reasoning trace. The trace is then included as context in a follow-up request to Claude, which produces the final response. Both stages stream tokens to the client as they arrive. The chat UI is a bundled frontend that renders reasoning traces in collapsible sections.

Self-Hosting & Configuration

  • Clone and build with Cargo (requires Rust 1.70+)
  • Set DEEPSEEK_API_KEY and ANTHROPIC_API_KEY as environment variables
  • Configure model versions, temperature, and max tokens via config.toml
  • Runs on localhost by default; configurable port and host binding
  • Docker image available for containerized deployment

Key Features

  • Combines chain-of-thought reasoning from DeepSeek R1 with Claude output quality
  • OpenAI-compatible API allows integration with existing tools and libraries
  • Rust implementation handles concurrent requests with minimal resource usage
  • Web UI displays reasoning traces for transparency and debugging
  • Supports streaming for both reasoning and response phases

Comparison with Similar Tools

  • OpenRouter — multi-model API gateway; DeepReasoning specifically chains reasoning and response across two models
  • LiteLLM — unified LLM proxy; DeepReasoning adds a two-stage reasoning pipeline
  • Portkey — LLM gateway with caching; DeepReasoning focuses on CoT integration rather than routing
  • Jan — offline AI desktop app; DeepReasoning is an API server for programmatic access
  • LobeChat — multi-model chat UI; DeepReasoning provides a specialized reasoning-chain interface

FAQ

Q: Do I need both DeepSeek and Anthropic API keys? A: Yes. The two-stage pipeline requires access to both providers.

Q: Can I use other models instead of DeepSeek R1 or Claude? A: The architecture supports any OpenAI-compatible reasoning model for stage one and any Anthropic model for stage two. Custom providers can be configured.

Q: What is the latency overhead of the two-stage approach? A: Total latency is roughly the sum of both API calls, but streaming mitigates perceived delay since reasoning tokens appear immediately.

Q: Is this suitable for production use? A: The Rust server is production-grade in terms of performance and reliability. Cost and latency depend on the underlying API providers.

Sources

讨论

登录后参与讨论。
还没有评论,来写第一条吧。

相关资产