# DeepReasoning — Chain-of-Thought Inference API Merging DeepSeek R1 and Claude

> A high-performance inference API built in Rust that combines DeepSeek R1 chain-of-thought reasoning traces with Claude model responses in a unified chat interface.

## Install

Save in your project root:

# DeepReasoning — Chain-of-Thought Inference API Merging DeepSeek R1 and Claude

## Quick Use
```bash
git clone https://github.com/winfunc/deepreasoning.git
cd deepreasoning
# Set API keys
export DEEPSEEK_API_KEY=your_key
export ANTHROPIC_API_KEY=your_key
cargo run --release
# Open http://localhost:3000 for the chat UI
```

## Introduction
DeepReasoning is a Rust-based inference proxy that combines DeepSeek R1 chain-of-thought reasoning with Anthropic Claude model outputs. It exposes an OpenAI-compatible API and a chat UI, enabling applications to benefit from explicit reasoning traces while leveraging Claude for final responses.

## What DeepReasoning Does
- Proxies inference requests through DeepSeek R1 for chain-of-thought reasoning and Claude for final output
- Exposes an OpenAI-compatible REST API for drop-in integration
- Provides a web-based chat UI that displays reasoning traces alongside responses
- Streams both reasoning steps and final answers in real-time
- Supports configurable model selection for each stage of the pipeline

## Architecture Overview
DeepReasoning runs a Rust HTTP server built on Axum. Incoming requests are first sent to DeepSeek R1 to generate a chain-of-thought reasoning trace. The trace is then included as context in a follow-up request to Claude, which produces the final response. Both stages stream tokens to the client as they arrive. The chat UI is a bundled frontend that renders reasoning traces in collapsible sections.

## Self-Hosting & Configuration
- Clone and build with Cargo (requires Rust 1.70+)
- Set DEEPSEEK_API_KEY and ANTHROPIC_API_KEY as environment variables
- Configure model versions, temperature, and max tokens via config.toml
- Runs on localhost by default; configurable port and host binding
- Docker image available for containerized deployment

## Key Features
- Combines chain-of-thought reasoning from DeepSeek R1 with Claude output quality
- OpenAI-compatible API allows integration with existing tools and libraries
- Rust implementation handles concurrent requests with minimal resource usage
- Web UI displays reasoning traces for transparency and debugging
- Supports streaming for both reasoning and response phases

## Comparison with Similar Tools
- **OpenRouter** — multi-model API gateway; DeepReasoning specifically chains reasoning and response across two models
- **LiteLLM** — unified LLM proxy; DeepReasoning adds a two-stage reasoning pipeline
- **Portkey** — LLM gateway with caching; DeepReasoning focuses on CoT integration rather than routing
- **Jan** — offline AI desktop app; DeepReasoning is an API server for programmatic access
- **LobeChat** — multi-model chat UI; DeepReasoning provides a specialized reasoning-chain interface

## FAQ
**Q: Do I need both DeepSeek and Anthropic API keys?**
A: Yes. The two-stage pipeline requires access to both providers.

**Q: Can I use other models instead of DeepSeek R1 or Claude?**
A: The architecture supports any OpenAI-compatible reasoning model for stage one and any Anthropic model for stage two. Custom providers can be configured.

**Q: What is the latency overhead of the two-stage approach?**
A: Total latency is roughly the sum of both API calls, but streaming mitigates perceived delay since reasoning tokens appear immediately.

**Q: Is this suitable for production use?**
A: The Rust server is production-grade in terms of performance and reliability. Cost and latency depend on the underlying API providers.

## Sources
- https://github.com/winfunc/deepreasoning

---
Source: https://tokrepo.com/en/workflows/deepreasoning-chain-thought-inference-api-merging-deepseek-699e9b03
Author: AI Open Source