TOKREPO · Arsenal IA
Nouveau · cette semaine

Stack IA du Développeur en Chine — Coder Derrière le GFW

Dix outils pour développeurs en Chine continentale qui ne peuvent (ou ne veulent) pas dépendre de ChatGPT/Cursor : DeepSeek + Qwen + Kimi comme couche modèle, Cherry Studio comme client desktop, One API / LiteLLM comme passerelle unifiée, Tabby / Ollama pour équipes intranet isolées, plus Continue et CLIs natifs DeepSeek qui acceptent le RMB sans VPN.

10 ressources

What's in this pack

This is the rig for a working engineer in mainland China who hits the same three walls every week: Cursor and Codex sign-up requires a foreign phone, OpenAI / Anthropic APIs are intermittently blocked at the network layer, and corporate firewalls + WeChat-Pay-only billing make foreign SaaS a non-starter. The pack is not a workaround for those tools — it's the native-China alternative built on Chinese model labs (DeepSeek, Alibaba Qwen, Moonshot Kimi) plus self-hostable gateways and IDE plugins.

Every pick here meets three criteria: RMB-payable or self-hostable, reachable from a typical mainland network without a VPN, and good enough to actually use as a daily coding assistant — not just "works". If you have Cursor working and don't want to switch, this isn't your pack. If you're on a company-issued laptop with no foreign network access, this is the only pack that ships.

Install in this order

  1. DeepSeek TUI (id: 3142) — start here because DeepSeek-V3 / R1 is the highest-quality Chinese-trained model with an RMB-billable API, and the TUI is a one-binary terminal agent that proves the API key works before you wire anything else up. npm i -g deepseek-tui, paste your key from platform.deepseek.com, get answers in 30 seconds.
  2. Qwen Code (id: 3022) — Alibaba's official terminal coding agent for Qwen models. Install second because Qwen-Max / Qwen3-Coder are the strongest alternative when DeepSeek capacity is throttled, and you want a fallback model wired up before you write any production scripts. npm i -g @qwen-code/qwen-code, /auth with your DashScope (bailian.console.aliyun.com) key.
  3. oh-my-kimi (id: 3643) — Moonshot's Kimi K2 has the longest reliable context window (1M tokens) among China-hosted models, and oh-my-kimi gates the agent loop with evidence checks. Install third to round out the model layer: DeepSeek for daily code, Qwen as backup, Kimi for whole-codebase questions.
  4. Cherry Studio Custom Models (id: 2821) — open-source desktop chat client (Mac/Win/Linux) that adds any OpenAI-compatible endpoint as a BYOK provider. Install fourth because once you have three model providers, you need one chat UI that can flip between them mid-conversation. Cherry Studio is the de facto standard in Chinese dev communities — 30K+ stars, MIT license, no telemetry.
  5. One API (id: 3821) — self-hosted LLM API gateway (Docker, Go). Install fifth because by now you have multiple keys (DeepSeek + Qwen + Kimi) and your team needs a single OpenAI-compatible URL that round-robins across them. One API is the gateway most Chinese teams reach for first (33K+ stars, Chinese-language admin UI).
  6. LiteLLM Proxy (id: 2789) — the international alternative gateway, Python-based, 100+ provider support. Install as an alternative to One API only if you also need to proxy international providers (OpenRouter, Together, Anthropic via reverse-proxy) and want richer cost tracking. Don't run both.
  7. Continue (id: 613) — open-source IDE assistant (VSCode + JetBrains) that points at any OpenAI-compatible endpoint — which means it points at your One API gateway, which means your IDE autocomplete and chat are now powered by DeepSeek-Coder for ¥0.50 per million tokens.
  8. Tabby (id: 216) — self-hosted GitHub Copilot alternative. Install only if your company forbids external API calls (banks, gov, internal R&D). Tabby runs entirely on your hardware, supports DeepSeek-Coder / Qwen-Coder weights, and gives FIM autocomplete in VSCode + JetBrains.
  9. Ollama (id: 162) — local model runtime. Install if you have a Mac with 32GB+ unified memory or a workstation with a 24GB+ GPU. Run ollama pull qwen3:14b or deepseek-r1:7b and Cherry Studio / Continue can talk to localhost:11434 instead of any API.
  10. Reasonix (id: 3604) — DeepSeek-native coding agent CLI optimized for prompt caching (claimed 99%+ cache hit). Install last as the power-user upgrade once you've used DeepSeek TUI for a week and want lower latency + lower bills on a real repo.

How they fit together

  Domestic model APIs (all RMB-billable, no VPN needed)
  ┌─────────────────┬─────────────────┬─────────────────┐
  │ DeepSeek-V3/R1  │  Qwen-Max/Coder │  Kimi K2 (1M ctx)│
  │ (deepseek.com)  │  (bailian.aliyun)│ (platform.moonshot)│
  └────────┬────────┴────────┬────────┴────────┬─────────┘
           │                 │                 │
           ▼                 ▼                 ▼
           One API  /  LiteLLM Proxy  (Docker, your VPS)
           (unified OpenAI-compat URL + key rotation + cost log)
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
  DeepSeek TUI /         Cherry Studio        Continue (VSCode/JB)
  Qwen Code /           (desktop chat,         (in-IDE chat
  oh-my-kimi /          BYOK any provider)     + completions)
  Reasonix
  (terminal agents)

  ── Air-gapped fallback (no external network at all) ──
          Tabby (self-hosted Copilot, runs DeepSeek-Coder weights)
          Ollama (local Qwen / DeepSeek-Coder / Yi / GLM weights)

The inflection point is One API in front of three domestic providers. Before that, you have three API keys scattered across .env files and every tool has to know about each provider. After that, every tool points at one URL, you swap models by renaming the model field, and your billing team sees one consolidated cost report. Don't skip the gateway — without it, multi-model workflows fall apart by week two.

Tradeoffs you'll hit

  • DeepSeek vs Qwen vs Kimi — DeepSeek wins on code quality + price (¥0.50-2/M tokens). Qwen wins on Chinese-language documentation, multi-modal, and Alibaba Cloud enterprise compliance. Kimi wins on context length (1M, actually usable). Most teams run DeepSeek default, Qwen as Aliyun-compliance fallback, Kimi for codebase-wide Q&A.
  • One API vs LiteLLM — One API is Chinese-developer-first (UI in Chinese, supports WeChat Pay billing plugins, lighter footprint). LiteLLM is international-first (better cost tracking, more provider integrations, Python ecosystem). If your only providers are Chinese, One API is the obvious pick.
  • Continue vs DeepSeek TUI / Qwen Code — Continue lives in your IDE (autocomplete + chat panel). The CLI agents live in terminal and can execute multi-step refactors. They complement each other: Continue for typing, CLI agents for "refactor this whole module". Most engineers run both.
  • Cloud API vs Tabby vs Ollama — Cloud API is cheapest and highest quality but requires external network. Tabby is mid-quality (model is local) and needs a GPU. Ollama is your laptop; quality drops sharply below the 14B parameter mark. Only go local if compliance demands it or you genuinely have a 4090 / M-series Pro burning idle cycles.
  • Cherry Studio vs the model labs' web chat — DeepSeek, Qwen, and Kimi all have free web chat. Cherry Studio is worth installing the moment you want to switch models mid-conversation, save chat history locally, or pipe in your own RAG knowledge base. If you only use one model from a browser, you don't need it.

Common pitfalls

  • "DeepSeek API is down" — DeepSeek has had multi-hour capacity issues during peak periods. This is exactly why One API + a Qwen fallback exists. Configure the gateway with priority: 1 deepseek, priority: 2 qwen and most outages become invisible.
  • JetBrains Continue plugin marketplace can be slow from CN — if the plugin fails to install from inside an IDE, download the .zip from plugins.jetbrains.com (often reachable) and install from disk. VSCode marketplace is generally fine via Microsoft's CN mirrors.
  • Bailian (Aliyun) auth flow assumes a Chinese phone — Qwen API via bailian.console.aliyun.com requires Aliyun account + Alipay real-name verification. If your team is on a foreign Aliyun account, use Qwen via DashScope international (dashscope-intl.aliyuncs.com) — different endpoint, same model, different billing entity.
  • Cherry Studio + corporate proxy — Cherry Studio respects system proxy by default, which breaks DeepSeek calls if your proxy whitelist only includes foreign domains. Either add api.deepseek.com to the proxy bypass list, or configure Cherry Studio's per-provider proxy override.
  • Tabby + GPU memory — DeepSeek-Coder-V2-Lite-16B needs ~24GB VRAM at fp16, ~12GB at 4-bit quant. A single 3090 / 4090 is enough; a 4060Ti 16GB is not. Right-size before promising your team Copilot replacement.
  • One API on a public IP without auth = key theft — One API admin UI defaults to root/123456. Change it the first hour, put it behind nginx + basic auth or a VPN. Several Chinese dev forums have screenshots of stolen DeepSeek keys racking up ¥10K bills overnight.
  • WeChat Pay / Alipay billing topup minimums — DeepSeek and Moonshot have ¥10-50 minimum top-ups; not a problem for individuals, but for company finance teams that need fapiao (发票), request the developer/enterprise account from day one — switching mid-stream wastes a week.
INSTALLER · UNE COMMANDE
$ tokrepo install pack/china-programmer-ai-stack
passez-la à votre agent — ou collez-la dans votre terminal
Ce qu'il contient

10 ressources prêtes à installer

Script#01
DeepSeek TUI — Terminal Coding Agent for DeepSeek

DeepSeek TUI is a terminal coding agent for DeepSeek models. Install via npm/cargo/brew, run on your repo, and keep approval gates for edits.

by Script Depot·81 views
$ tokrepo install deepseek-tui-terminal-coding-agent-for-deepseek
Script#02
Qwen Code — Terminal Coding Agent for Qwen Models

Qwen Code is an open-source terminal coding agent for Qwen models. Node 22+, npm or Homebrew install, /auth flow, codebase Q&A, refactors, and tests.

by QwenLM·99 views
$ tokrepo install qwen-code-terminal-coding-agent-for-qwen-models
Script#03
oh-my-kimi — Evidence-gated Agent Runtime for Kimi

oh-my-kimi (OMK) is a CLI runtime that adds evidence gates and worktree isolation to Kimi Code; verified 69★ and ships `omk init/doctor/chat`.

by Skill Factory·104 views
$ tokrepo install oh-my-kimi-evidence-gated-agent-runtime-for-kimi
Skill#04
Cherry Studio Custom Models — BYOK Any LLM Provider

Cherry Studio Custom Models adds any OpenAI-compatible endpoint — proxy, local, or third-party. Mix Claude, GPT, Gemini, DeepSeek, Ollama side-by-side.

by Cherry Studio·107 views
$ tokrepo install cherry-studio-custom-models-byok-any-llm-provider
Skill#05
One API — Unified LLM API Gateway (Docker)

One API is a self-hosted LLM API gateway: unify OpenAI/Claude/Gemini/DeepSeek endpoints, manage keys, and deploy via Docker in minutes (33.7k★).

by AI Open Source·89 views
$ tokrepo install one-api-unified-llm-api-gateway-docker
Agent#06
LiteLLM Proxy — Unified Gateway for 100+ LLM APIs

LiteLLM Proxy maps 100+ LLM providers (Anthropic, OpenAI, Bedrock, Vertex) to one OpenAI-compatible endpoint. Auth, rate limit, cost track, fallbacks.

by LiteLLM (BerriAI)·92 views
$ tokrepo install litellm-proxy-unified-gateway-for-100-llm-apis
Skill#07
Continue — Open-Source AI Code Assistant for IDEs

Open-source AI code assistant for VS Code and JetBrains. Connect any LLM model, use autocomplete, chat, and inline edits. Fully customizable with your own models and context. 22,000+ stars.

by Continue·152 views
$ tokrepo install continue-open-source-ai-code-assistant-ides-faed3b71
Skill#08
Tabby — Self-Hosted AI Coding Assistant

Self-hosted AI code completion and chat assistant. Privacy-first alternative to GitHub Copilot. Supports 20+ models, repo-aware context, and IDE integrations. 33K+ stars.

by TokRepo精选·908 views
$ tokrepo install tabby-self-hosted-ai-coding-assistant-1a1d4061
Skill#09
Ollama — Run LLMs Locally

Run large language models locally on your machine. Supports Llama 3, Mistral, Gemma, Phi, and dozens more. One-command install, OpenAI-compatible API.

by Script Depot·197 views
$ tokrepo install ollama-run-llms-locally-0eefb7ad
Script#10
Reasonix — DeepSeek-Native Coding Agent CLI

Reasonix is a DeepSeek-native coding agent CLI for the terminal; README reports a 99.82% cache hit in a real-day case study.

by Script Depot·112 views
$ tokrepo install reasonix-deepseek-native-coding-agent-cli
Questions fréquentes

Questions fréquentes

I can use Cursor / Claude with a VPN — why bother switching to a domestic stack?

Three reasons: (1) your company laptop and corporate VPN will eventually conflict with personal VPN tools, and the day you can't log in to a customer call is the day this stack pays for itself; (2) DeepSeek and Qwen are now genuinely competitive on code quality for ~1/10 the price — even with stable foreign access, the cost difference matters at team scale; (3) fapiao (发票). If your company finance team can't reimburse a USD Stripe charge, RMB-billable domestic models are the only practical path. Most engineers end up running both — foreign tools when convenient, this stack as the always-works fallback.

My company forbids any external API call — what subset of this pack works?

Drop the cloud APIs entirely and run: Tabby (self-hosted Copilot, item 8) + Ollama (local model runtime, item 9) + Continue (IDE plugin, item 7) pointed at Ollama's localhost endpoint. Cherry Studio (item 4) also works pointed at a local model. You'll need a workstation with a 24GB+ GPU or an M-series Mac with 32GB+ unified memory. Model recommendations: DeepSeek-Coder-V2-Lite-16B for code, Qwen3-14B for general chat. Quality is below cloud DeepSeek but well above no-AI.

How much does this stack actually cost per engineer per month?

Individual use with DeepSeek as primary: ¥30-80/month for heavy daily use (1-3M tokens/day). With One API doing fallback to Qwen, add another ¥20-40. Compare to Cursor Pro at $20/month (¥150) plus Copilot at $10/month (~¥75) — and that's assuming the VPN works. For a team of 10, a self-hosted One API on a ¥80/month VPS plus pooled DeepSeek+Qwen keys typically lands at ¥500-1500/month total versus ¥6000+ for international SaaS equivalents.

Will my company finance team accept the bills (need fapiao 发票)?

DeepSeek, Alibaba (Qwen via bailian), and Moonshot (Kimi) all issue Chinese VAT fapiao (普票 / 专票) on request — request from day one and register the enterprise account, not the individual. One API and LiteLLM are open-source self-hosted, so there's no SaaS bill to invoice. The catch: switching from a personal Aliyun account to an enterprise account mid-month is painful (separate billing entities). Set up the enterprise account first, then create developer keys under it.

Which model should I use for what task?

Default for code: DeepSeek-V3 (chat) + DeepSeek-Coder (autocomplete) — best quality-per-yuan in 2026. Reasoning / hard math / refactor planning: DeepSeek-R1 or DeepSeek-V3 thinking mode. Whole-codebase Q&A or document analysis (>200K tokens): Kimi K2 — its 1M context actually works (most others degrade past 64K). Multi-modal / vision / Aliyun-cloud-native enterprise: Qwen-VL or Qwen-Max. Air-gapped local: DeepSeek-Coder-V2-Lite-16B (code) or Qwen3-14B (general). Don't pick one model and stick with it — flipping between providers via Cherry Studio or One API is the whole point.

PLUS DANS L'ARSENAL

12 packs · 80+ ressources sélectionnées

Découvrez tous les packs curatés sur la page d'accueil

Retour à tous les packs