[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"pack-detail-self-hosted-ai-en":3,"seo:pack:self-hosted-ai:en":66},{"code":4,"message":5,"data":6},200,"Operation successful",{"pack":7},{"slug":8,"icon":9,"tone":10,"status":11,"status_label":12,"title":13,"description":14,"items":15,"install_cmd":65},"self-hosted-ai","🏠","#059669","stable","Stable","Self-Hosted AI","Tabby, Onyx, LibreChat, and an n8n starter kit — keep your data on your own metal.",[16,28,36,44,51,58],{"id":17,"uuid":18,"slug":19,"title":20,"description":21,"author_name":22,"view_count":23,"vote_count":24,"lang_type":25,"type":26,"type_label":27},216,"1a1d4061-a148-4566-a3d7-ab40e6f2a972","tabby-self-hosted-ai-coding-assistant-1a1d4061","Tabby — Self-Hosted AI Coding Assistant","Self-hosted AI code completion and chat assistant. Privacy-first alternative to GitHub Copilot. Supports 20+ models, repo-aware context, and IDE integrations. 33K+ stars.","TokRepo精选",1141,0,"en","skill","Skill",{"id":29,"uuid":30,"slug":31,"title":32,"description":33,"author_name":34,"view_count":35,"vote_count":24,"lang_type":25,"type":26,"type_label":27},390,"e1fd7c46-bbda-4956-8649-9c3ed579ff25","whisper-cpp-local-speech-text-pure-c-c-e1fd7c46","whisper.cpp — Local Speech-to-Text in Pure C\u002FC++","High-performance port of OpenAI Whisper in C\u002FC++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even Raspberry Pi. Real-time transcription.","Script Depot",1949,{"id":37,"uuid":38,"slug":39,"title":40,"description":41,"author_name":42,"view_count":43,"vote_count":24,"lang_type":25,"type":26,"type_label":27},321,"210679a0-712f-4ec5-8d69-e0a016361c95","onyx-self-hosted-ai-chat-40-connectors-210679a0","Onyx — Self-Hosted AI Chat with 40+ Connectors","Onyx (formerly Danswer) is a self-hosted AI chat with RAG, custom agents, and 40+ knowledge connectors. 20.4K+ stars. Enterprise search. 
MIT.","AI Open Source",383,{"id":45,"uuid":46,"slug":47,"title":48,"description":49,"author_name":42,"view_count":50,"vote_count":24,"lang_type":25,"type":26,"type_label":27},284,"850494fb-7737-4388-8104-f8860a0d2d41","librechat-self-hosted-multi-ai-chat-platform-850494fb","LibreChat — Self-Hosted Multi-AI Chat Platform","LibreChat is a self-hosted AI chat platform unifying Claude, OpenAI, Google, AWS in one interface. 35.1K+ GitHub stars. Agents, MCP, code interpreter, multi-user auth. MIT.",318,{"id":52,"uuid":53,"slug":54,"title":55,"description":56,"author_name":42,"view_count":57,"vote_count":24,"lang_type":25,"type":26,"type_label":27},483,"92d3cc62-6199-4b1c-a7f1-1b73a1da86a0","self-hosted-ai-starter-kit-local-ai-n8n-92d3cc62","Self-Hosted AI Starter Kit — Local AI with n8n","Docker Compose template by n8n that bootstraps a complete local AI environment with n8n workflow automation, Ollama LLMs, Qdrant vector database, and PostgreSQL. 14,500+ stars.",369,{"id":59,"uuid":60,"slug":61,"title":62,"description":63,"author_name":42,"view_count":64,"vote_count":24,"lang_type":25,"type":26,"type_label":27},870,"f05a11a5-33e5-11f1-9bc6-00163e2b0d79","typebot-visual-ai-chatbot-builder-you-can-self-host-f05a11a5","Typebot — Visual AI Chatbot Builder You Can Self-Host","Build advanced chatbots visually with 34+ blocks. Embed anywhere, collect results in real-time. OpenAI integration, custom themes, analytics. Self-hostable. 9,800+ stars.",349,"tokrepo install pack\u002Fself-hosted-ai",{"pageType":67,"pageKey":8,"locale":25,"title":68,"metaDescription":69,"h1":13,"tldr":70,"bodyMarkdown":71,"faq":72,"schema":88,"internalLinks":96,"citations":109,"wordCount":122,"generatedAt":123},"pack","Self-Hosted AI: Tabby, Onyx, LibreChat, n8n Starter Kit","Tabby, Onyx, LibreChat, n8n — six self-hosted AI assets that replace Copilot, ChatGPT, and Zapier on your own server. 
One-command install via TokRepo.","Six battle-tested self-hosted AI assets — Tabby (Copilot replacement), Onyx (enterprise search), LibreChat (ChatGPT clone), and an n8n AI starter kit. Keep your data on your own metal.","## What's in this pack\n\nThis pack collects the **six self-hosted AI assets** that consistently show up when teams move off SaaS for compliance, cost, or sovereignty reasons. Three are coding\u002Fchat replacements (Tabby, LibreChat, Onyx). Three are infrastructure pieces (n8n AI starter kit, local STT, model gateway).\n\n| # | Asset | Type | What it replaces |\n|---|---|---|---|\n| 1 | Tabby | self-hosted service | GitHub Copilot |\n| 2 | Onyx | self-hosted service | Glean \u002F enterprise ChatGPT |\n| 3 | LibreChat | self-hosted UI | ChatGPT for the team |\n| 4 | n8n AI starter kit | docker-compose | Zapier with AI nodes |\n| 5 | Whisper STT (local) | service | Otter \u002F Rev \u002F cloud STT |\n| 6 | Local model gateway | service | LiteLLM with local-first routing |\n\n## Why this matters\n\nThe default 2026 AI stack assumes you're fine sending your code, chats, and customer data to OpenAI \u002F Anthropic \u002F Google. For most consumer apps that's acceptable. For regulated industries (health, finance, legal), government work, or any team where your IP *is* the product, it's a non-starter. This pack is the assembled answer: a stack you can run on a single workstation or a small Kubernetes cluster that gives you Copilot-equivalent dev tools, ChatGPT-equivalent chat, and enterprise-search-equivalent retrieval — entirely on your own hardware.\n\nThe three headline replacements:\n\n- **Tabby** is the Copilot stand-in. Self-host it, point your IDE at it, and you get inline code completion backed by whatever local model you load (DeepSeek-Coder, Qwen-Coder, etc.). On a single RTX 3090, completion quality can come close to Copilot's for most mainstream languages, depending on the model you load.\n- **Onyx** (formerly Danswer) is the enterprise-search stand-in. 
Connect it to Confluence, Notion, GitHub, and Slack, and it builds an internal ChatGPT that answers questions from your docs. Vector + keyword hybrid search, with citations.\n- **LibreChat** is the team-ChatGPT stand-in. Multi-user, multi-model (works with local Ollama or cloud APIs as a fallback), conversation history, prompt library. The default UI when you want to give your team \"a ChatGPT\" without paying per seat.\n\nThe three infrastructure pieces fill in the gaps. The n8n starter kit gives you a Docker Compose template for n8n + Postgres + Qdrant + Ollama-served local models, so workflow automation runs on your own metal. Local Whisper means meeting transcripts and voice notes never leave your network. The model gateway routes between local and cloud models so you can fall back to Claude only when the local model can't answer.\n\n## Install in one command\n\n```bash\n# Install the entire pack\ntokrepo install pack\u002Fself-hosted-ai\n\n# Or pick the piece you actually need\ntokrepo install tabby\ntokrepo install onyx\ntokrepo install librechat\ntokrepo install n8n-ai-starter-kit\n```\n\nThe TokRepo CLI installs the docker-compose files, environment templates, and the rule files \u002F subagents for your AI tool that explain *when* to invoke the local stack vs the cloud. Run `docker compose up -d` after install and the services are reachable on localhost.\n\n## Common pitfalls\n\n- **Don't run a 70B model on 16GB VRAM.** Match model size to your GPU. Tabby's DeepSeek-Coder-7B fits on a 12GB card and is plenty for completion. For chat, Qwen-2.5-32B in 4-bit is a sweet spot if you have 24GB.\n- **Onyx connectors silently rate-limit.** When you point Onyx at a 50k-page Confluence, the initial sync takes hours and some connectors will pause. Watch the logs; don't trust the UI's progress bar in the first 24 hours.\n- **n8n + AI workflows can leak credentials.** The starter kit ships with default Postgres credentials in plaintext. 
Change them, and put n8n behind Cloudflare Tunnel or a reverse proxy with auth before exposing it.\n- **LibreChat permissions are flat by default.** Out of the box every user can see every conversation. Configure RBAC and per-user model whitelisting before you onboard a team.\n- **Backups aren't automatic.** Self-hosted = self-backup. Schedule database dumps for LibreChat\u002FOnyx (for example `pg_dump` for Postgres-backed services) and snapshot the Tabby model cache; budget storage 3× your active dataset for restore points.\n\n## Relationship to other packs\n\nThis pack pairs naturally with two others. **MCP Server Stack** gives you the protocol-level connectors (filesystem, browser, database MCP servers) that route through your local model gateway — so even Claude Code can call your local services. **LLM Observability** matters more here than on cloud APIs because you own the failure surface; Langfuse self-hosted is in that pack and integrates cleanly with Onyx and LibreChat.\n\nIf you're starting from zero, install order: 1) LibreChat (immediate user value), 2) Tabby (developer value), 3) Onyx (org-wide search), 4) n8n + gateway when you start building automations on top.",[73,76,79,82,85],{"q":74,"a":75},"Is Tabby free?","Yes, Tabby is open-source under Apache 2.0 with a free self-hosted Community edition. There's a paid Enterprise tier for SSO, audit logs, and SLAs, but the Community edition is fully featured for individual and small-team use. You only pay for the GPU you run it on. The same free-to-self-host pattern applies to Onyx and LibreChat (both MIT-licensed) and to n8n, which is fair-code rather than OSI open source; paid tiers, where offered, are optional.",{"q":77,"a":78},"Will this work with Cursor or Codex CLI instead of Claude Code?","The self-hosted services are tool-agnostic — Tabby exposes a Copilot-compatible API that any IDE supporting Copilot can hit (VS Code, JetBrains, Vim). LibreChat is a web UI so it's tool-independent. 
The TokRepo CLI installs the AI-tool-specific config (Cursor rules, AGENTS.md, Claude Code subagents) that tells your agent the local services exist.",{"q":80,"a":81},"How does Tabby compare to Cursor with a local model?","Cursor's local-model support is limited to specific endpoints; Tabby is purpose-built for self-hosted code completion with telemetry, model warmup, and a real backend. If you want IDE-agnostic, multi-team self-hosted Copilot, Tabby wins. If you specifically want Cursor's UX with a local model behind it, see the local model gateway in this pack — it can act as a Cursor-compatible endpoint.",{"q":83,"a":84},"What's the difference vs the MCP Server Stack pack?","MCP Server Stack is about protocol-level connectors so AI tools can read your filesystem, browser, database. Self-Hosted AI is about replacing the cloud LLM\u002FUI\u002FIDE assistant entirely with services on your own hardware. They're complementary: the MCP servers can be configured to route through your local model gateway, giving you a fully on-prem agent stack.",{"q":86,"a":87},"When should I NOT self-host?","When latency matters more than sovereignty (real-time voice, sub-300ms code completion against a small model is hard), when your usage is too low to justify a GPU ($100\u002Fmo of API calls is cheaper than a 4090 amortized over 3 years), or when you don't have ops support to handle backups, model upgrades, and the inevitable 2 a.m. OOM. 
Self-hosting is real ops work; budget it.",{"@context":89,"@type":90,"name":13,"description":14,"numberOfItems":91,"publisher":92},"https:\u002F\u002Fschema.org","CollectionPage",6,{"@type":93,"name":94,"url":95},"Organization","TokRepo","https:\u002F\u002Ftokrepo.com",[97,101,105],{"url":98,"anchor":99,"reason":100},"\u002Fen\u002Fpacks\u002Fmcp-server-stack","MCP Server Stack","MCP servers wire local models into AI tools",{"url":102,"anchor":103,"reason":104},"\u002Fen\u002Fpacks\u002Fllm-observability","LLM Observability","monitor your self-hosted stack",{"url":106,"anchor":107,"reason":108},"\u002Fen\u002Ftools\u002Fcline","Cline","VS Code agent that pairs well with local Tabby",[110,114,118],{"claim":111,"source_name":112,"source_url":113},"Tabby is a self-hosted AI coding assistant alternative to GitHub Copilot","TabbyML\u002Ftabby on GitHub","https:\u002F\u002Fgithub.com\u002FTabbyML\u002Ftabby",{"claim":115,"source_name":116,"source_url":117},"LibreChat is an open-source ChatGPT clone supporting multiple LLM backends","danny-avila\u002FLibreChat on GitHub","https:\u002F\u002Fgithub.com\u002Fdanny-avila\u002FLibreChat",{"claim":119,"source_name":120,"source_url":121},"n8n is a fair-code workflow automation platform with self-hosting support","n8n.io\u002Fself-hosted","https:\u002F\u002Fdocs.n8n.io\u002Fhosting\u002F",760,"2026-05-02T15:00:00Z"]