# DeepReasoning: Chain-of-Thought Inference API Merging DeepSeek R1 and Claude

> A high-performance inference API built in Rust that combines DeepSeek R1 chain-of-thought reasoning traces with Claude model responses in a unified chat interface.

## Quick Use

```bash
git clone https://github.com/winfunc/deepreasoning.git
cd deepreasoning

# Set API keys
export DEEPSEEK_API_KEY=your_key
export ANTHROPIC_API_KEY=your_key

cargo run --release
# Open http://localhost:3000 for the chat UI
```

## Introduction

DeepReasoning is a Rust-based inference proxy that combines DeepSeek R1 chain-of-thought reasoning with Anthropic Claude model outputs. It exposes an OpenAI-compatible API and a chat UI, so applications can surface explicit reasoning traces while relying on Claude for the final response.

## What DeepReasoning Does

- Proxies inference requests through DeepSeek R1 for chain-of-thought reasoning and Claude for final output
- Exposes an OpenAI-compatible REST API for drop-in integration
- Provides a web-based chat UI that displays reasoning traces alongside responses
- Streams both reasoning steps and final answers in real time
- Supports configurable model selection for each stage of the pipeline

## Architecture Overview

DeepReasoning runs a Rust HTTP server built on Axum. Each incoming request is first sent to DeepSeek R1, which generates a chain-of-thought reasoning trace. The trace is then included as context in a follow-up request to Claude, which produces the final response. Both stages stream tokens to the client as they arrive. The chat UI is a bundled frontend that renders reasoning traces in collapsible sections.

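
The two-stage flow described in the Architecture Overview can be sketched in a few lines of Rust. This is a minimal illustration, not the project's actual code: the `Stage` trait, the stub backends, `build_stage_two_prompt`, and the `<reasoning>` wrapper format are all hypothetical names chosen for this example, and the stubs return canned strings where a real server would call the DeepSeek and Anthropic HTTP APIs.

```rust
/// Hypothetical abstraction over a model backend. A real stage would make
/// an HTTP call to DeepSeek or Anthropic; here it is just a function.
trait Stage {
    fn complete(&self, prompt: &str) -> String;
}

/// Stub for stage one (the reasoning model). Returns a canned trace.
struct StubReasoner;

/// Stub for stage two (the response model). Returns a canned answer.
struct StubResponder;

impl Stage for StubReasoner {
    fn complete(&self, prompt: &str) -> String {
        format!("Step 1: restate the question ({prompt}). Step 2: work it out.")
    }
}

impl Stage for StubResponder {
    fn complete(&self, prompt: &str) -> String {
        format!("[answer based on {} bytes of context]", prompt.len())
    }
}

/// Combines the stage-one trace with the user message to form the
/// stage-two prompt. The <reasoning> tag format is an assumption made
/// for this sketch.
fn build_stage_two_prompt(user_message: &str, reasoning: &str) -> String {
    format!("<reasoning>\n{reasoning}\n</reasoning>\n\n{user_message}")
}

/// Result of one pipeline run: the trace and the final answer are kept
/// separate so a UI could render them in different sections.
struct PipelineOutput {
    reasoning: String,
    answer: String,
}

/// Runs stage one, feeds its trace into stage two, and returns both parts.
fn run_pipeline(
    reasoner: &dyn Stage,
    responder: &dyn Stage,
    user_message: &str,
) -> PipelineOutput {
    let reasoning = reasoner.complete(user_message);
    let prompt = build_stage_two_prompt(user_message, &reasoning);
    let answer = responder.complete(&prompt);
    PipelineOutput { reasoning, answer }
}

fn main() {
    let out = run_pipeline(&StubReasoner, &StubResponder, "What is 2 + 2?");
    println!("reasoning: {}", out.reasoning);
    println!("answer: {}", out.answer);
}
```

Keeping the trace and the answer as separate fields mirrors the UI behaviour described above, where reasoning is shown in its own collapsible section. The sketch is synchronous and does not stream; a streaming version would forward tokens from each stage as they arrive.
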
## Self-Hosting & Configuration

- Clone and build with Cargo (requires Rust 1.70+)
- Set DEEPSEEK_API_KEY and ANTHROPIC_API_KEY as environment variables
- Configure model versions, temperature, and max tokens via config.toml
- Runs on localhost by default; configurable port and host binding
- Docker image available for containerized deployment

## Key Features

- Combines chain-of-thought reasoning from DeepSeek R1 with Claude output quality
- OpenAI-compatible API allows integration with existing tools and libraries
- Rust implementation handles concurrent requests with minimal resource usage
- Web UI displays reasoning traces for transparency and debugging
- Supports streaming for both reasoning and response phases

## Comparison with Similar Tools

- **OpenRouter**: multi-model API gateway; DeepReasoning specifically chains reasoning and response across two models
- **LiteLLM**: unified LLM proxy; DeepReasoning adds a two-stage reasoning pipeline
- **Portkey**: LLM gateway with caching; DeepReasoning focuses on CoT integration rather than routing
- **Jan**: offline AI desktop app; DeepReasoning is an API server for programmatic access
- **LobeChat**: multi-model chat UI; DeepReasoning provides a specialized reasoning-chain interface

## FAQ

**Q: Do I need both DeepSeek and Anthropic API keys?**
A: Yes. The two-stage pipeline requires access to both providers.

**Q: Can I use other models instead of DeepSeek R1 or Claude?**
A: The architecture supports any OpenAI-compatible reasoning model for stage one and any Anthropic model for stage two. Custom providers can be configured.

**Q: What is the latency overhead of the two-stage approach?**
A: Total latency is roughly the sum of both API calls, but streaming mitigates perceived delay since reasoning tokens appear immediately.

**Q: Is this suitable for production use?**
A: The Rust server is production-grade in terms of performance and reliability. Cost and latency depend on the underlying API providers.

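
The latency answer in the FAQ can be made concrete with a small calculation. The sketch below is illustrative only: `StageTiming` and the helper functions are hypothetical, and the durations in `main` are made-up inputs, not measurements of DeepReasoning or of either provider. It assumes, as the architecture description suggests, that the second stage starts only after the reasoning trace is complete.

```rust
use std::time::Duration;

/// Hypothetical timing for one stage: time until its first token arrives
/// and time until the stage has finished.
struct StageTiming {
    first_token: Duration,
    full: Duration,
}

/// End-to-end latency: the stages run one after the other, so the total
/// is the sum of both full durations.
fn total_latency(reasoning: &StageTiming, response: &StageTiming) -> Duration {
    reasoning.full + response.full
}

/// Time until the user sees anything at all. With streaming, this is the
/// first reasoning token; without it, the user waits for the whole pipeline.
fn perceived_delay(
    reasoning: &StageTiming,
    response: &StageTiming,
    streaming: bool,
) -> Duration {
    if streaming {
        reasoning.first_token
    } else {
        total_latency(reasoning, response)
    }
}

/// Time until the first token of the final answer, assuming stage two
/// starts only once the reasoning trace is complete.
fn first_answer_token(reasoning: &StageTiming, response: &StageTiming) -> Duration {
    reasoning.full + response.first_token
}

fn main() {
    // Illustrative numbers only.
    let reasoning = StageTiming {
        first_token: Duration::from_millis(400),
        full: Duration::from_secs(6),
    };
    let response = StageTiming {
        first_token: Duration::from_millis(300),
        full: Duration::from_secs(4),
    };

    println!("total: {:?}", total_latency(&reasoning, &response));
    println!(
        "perceived (streaming): {:?}",
        perceived_delay(&reasoning, &response, true)
    );
    println!(
        "first answer token: {:?}",
        first_answer_token(&reasoning, &response)
    );
}
```

Under these example inputs, streaming changes when the first output appears but not when the pipeline finishes, and the final answer still cannot begin before the reasoning stage ends.
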
## Sources

- https://github.com/winfunc/deepreasoning

---

Source: https://tokrepo.com/en/workflows/deepreasoning-chain-thought-inference-api-merging-deepseek-699e9b03

Author: AI Open Source