Best Self-Hosted AI Tools for 2026
Run AI locally with complete privacy. Self-hostable open-source LLMs, chat interfaces, knowledge bases, and developer tools.
Open WebUI — Self-Hosted ChatGPT Alternative
Feature-rich, offline-capable open-source web UI for running local and remote LLMs. Supports Ollama, OpenAI, Anthropic, and any OpenAI-compatible API, with built-in RAG, web search, voice, image generation, a model builder, and multi-user support in a mobile-friendly interface. 130K+ GitHub stars.
Self-Hosted AI Starter Kit — Local AI with n8n
Docker Compose template by n8n that bootstraps a complete local AI environment with n8n workflow automation, Ollama LLMs, Qdrant vector database, and PostgreSQL. 14,500+ stars.
Ollama — Run LLMs Locally with One Command
Run Llama 3, Mistral, Gemma, Phi, and 100+ open-source LLMs locally with a single command. OpenAI-compatible API for seamless integration with AI tools. 120,000+ GitHub stars.
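A minimal sketch of the workflow this entry describes, assuming a local Ollama install (the model tag and prompt are illustrative):

```shell
# Pull and run a model interactively (downloads on first use)
ollama run llama3.1

# Ollama serves an OpenAI-compatible API on port 11434
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the endpoint speaks the OpenAI wire format, most OpenAI SDK clients work against it by just changing the base URL.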
Continue — Open-Source AI Code Assistant
Open-source AI code assistant for VS Code and JetBrains. Tab autocomplete, chat, inline editing with any model — OpenAI, Anthropic, Ollama, or self-hosted.
AFFiNE — Open-Source Notion Alternative
Docs, whiteboards, and databases in one privacy-first workspace. Local-first with real-time collaboration. 66K+ GitHub stars.
OpenCode — Open-Source AI Coding Agent for Terminal
Open-source AI coding agent with 140K+ stars. TUI-first design, LSP integration, works with Claude, OpenAI, Google, or local models. Two built-in agents: build and plan. MIT license.
Devika — Open Source AI Software Engineer
Open-source AI software engineer that plans, researches, and writes code autonomously. Supports Claude, GPT, and local models with browser and terminal access.
Jan — Offline AI Desktop App with Full Privacy
Jan is an open-source ChatGPT alternative that runs LLMs locally with full privacy. 41.4K+ GitHub stars. Desktop app for Windows/macOS/Linux, OpenAI-compatible API, MCP support. Apache 2.0.
Langfuse — Open Source LLM Observability
Langfuse is an open-source LLM engineering platform for tracing, prompt management, evaluation, and debugging AI apps. 24.1K+ GitHub stars. Self-hosted or cloud. MIT.
Chroma — Open-Source Embedding Database for AI
Lightweight open-source vector database that runs anywhere. Chroma provides in-memory, local file, and client-server modes for embeddings with zero-config LangChain integration.
LocalAI — Run Any AI Model Locally, No GPU
LocalAI is an open-source AI engine running LLMs, vision, voice, and image models locally. 44.6K+ GitHub stars. OpenAI/Anthropic-compatible API, 35+ backends, MCP, agents. MIT licensed.
Documenso — Open Source Document Signing Platform
Documenso is an open-source DocuSign alternative for self-hosted document signing with PDF e-signatures, audit trails, and Next.js stack.
Void — Open-Source Cursor Alternative
Void is an open-source AI code editor alternative to Cursor. 28.5K+ stars. Checkpoints, custom models, local hosting, built on VS Code. Apache 2.0.
Ollama Model Library — Best AI Models for Local Use
Curated guide to the best models available on Ollama for coding, chat, and reasoning. Compare Llama, Mistral, Gemma, Phi, and Qwen models for local AI development.
LobeChat — Open-Source Multi-Model Chat UI
Beautiful open-source chat UI supporting Claude, GPT-4, Gemini, Ollama, and 50+ providers. Plugin system, knowledge base, TTS, image generation, and self-hostable. 55,000+ GitHub stars.
Tabby — Self-Hosted AI Coding Assistant
Self-hosted AI code completion and chat assistant. Privacy-first alternative to GitHub Copilot. Supports 20+ models, repo-aware context, and IDE integrations. 33K+ stars.
CC Status Board — Smart Status Bar for Claude Code
Add a context meter, AI asset discovery, and session info to your Claude Code status bar. Scans 300+ installed assets (skills, agents, MCP, plugins) and surfaces the most relevant ones as you type. Zero token cost, 100% local.
whisper.cpp — Local Speech-to-Text in Pure C/C++
High-performance port of OpenAI Whisper in C/C++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even Raspberry Pi. Real-time transcription.
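A hedged sketch of a first run from source (the model tag and sample file are the repo's stock examples; newer releases rename the `main` binary to `whisper-cli`):

```shell
# Clone and build whisper.cpp (CPU build; no Python or GPU required)
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
make

# Fetch a small English model with the bundled download script
bash ./models/download-ggml-model.sh base.en

# Transcribe the bundled 16 kHz sample WAV
./main -m models/ggml-base.en.bin -f samples/jfk.wav
```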
Pal MCP Server — Multi-Model AI Gateway for Claude Code
MCP server that lets Claude Code use Gemini, OpenAI, Grok, and Ollama as a unified AI dev team. Features model routing, CLI-to-CLI bridge, and conversation continuity across 7+ providers.
Onyx — Self-Hosted AI Chat with 40+ Connectors
Onyx (formerly Danswer) is a self-hosted AI chat with RAG, custom agents, and 40+ knowledge connectors. 20.4K+ stars. Enterprise search. MIT.
Claude SEO — Complete SEO Skill for Claude Code
Universal SEO analysis skill with 15 sub-skills and 12 parallel subagents. Covers technical SEO, E-E-A-T, schema markup, GEO/AEO, local SEO, Google APIs, and PDF reporting. MIT license, 4,000+ stars.
OpenDeepWiki — Turn Any Repo into AI Documentation
Self-hosted tool that converts GitHub, GitLab, and Gitea repositories into AI-powered knowledge bases with Mermaid diagrams and conversational AI. MIT license, 3,000+ stars.
Coolify — Self-Hosted Vercel & Netlify Alternative
Deploy apps, databases, and services on your own server with one click. No vendor lock-in. 52K+ GitHub stars.
Uptime Kuma — Self-Hosted Uptime Monitoring
Monitor HTTP, TCP, DNS, Docker services with notifications to 90+ channels. Beautiful dashboard. 84K+ GitHub stars.
LibreChat — Self-Hosted Multi-AI Chat Platform
LibreChat is a self-hosted AI chat platform unifying Claude, OpenAI, Google, AWS in one interface. 35.1K+ GitHub stars. Agents, MCP, code interpreter, multi-user auth. MIT.
Remotion Rule: Fonts
Remotion skill rule: Loading Google Fonts and local fonts in Remotion. Part of the official Remotion Agent Skill for programmatic video in React.
The Self-Hosted AI Stack
Self-hosted AI has matured from a hobbyist pursuit to an enterprise requirement. Privacy regulations, data sovereignty laws, and the desire for predictable costs drive organizations to run AI on their own infrastructure.
Local LLM Inference — Ollama, Jan, and GPT4All make running models like Llama, Mistral, and Qwen as simple as installing an app, with support for GPU acceleration, quantization, and model management.
Chat Interfaces — Open WebUI, LibreChat, LobeChat, and AnythingLLM provide ChatGPT-like interfaces for your self-hosted models. Features include conversation history, file upload, RAG integration, and multi-model switching.
Knowledge Bases — Onyx, Quivr, and PrivateGPT let you build private RAG systems over your documents — no data leaves your servers.
Development Tools — Tabby (self-hosted Copilot), SearXNG (private search), and Puter (cloud desktop) provide developer infrastructure without external dependencies. TokRepo hosts deployment configs and Docker Compose files for the entire self-hosted AI stack.
Self-hosting AI isn't about avoiding costs — it's about owning your intelligence infrastructure.
Frequently Asked Questions
What hardware do I need to self-host AI?
It depends on the model size. For 7B parameter models (good for most tasks): 16GB RAM + any modern GPU with 8GB VRAM. For 70B models (GPT-4 class): 64GB RAM + GPU with 48GB VRAM (A6000 or dual 3090). For CPU-only inference: Ollama with quantized models runs on any modern laptop, just slower. Apple Silicon Macs with 32GB+ unified memory are excellent for local AI.
Is self-hosted AI as good as cloud APIs?
For many tasks, yes. Open-source models like Llama 3.1 70B and Qwen 2.5 72B approach GPT-4-level performance on coding, analysis, and general reasoning. They fall short on the most complex multi-step reasoning and creative tasks, where Claude Opus and GPT-4o still lead, but the gap narrows every quarter. For most business applications, self-hosted models are "good enough," with far better privacy and far lower cost.
What is the easiest way to start with self-hosted AI?
Install Ollama (one command on Mac/Linux/Windows), pull a model ("ollama pull llama3.1"), then install Open WebUI for a ChatGPT-like interface. Total setup time: under 10 minutes. TokRepo hosts Docker Compose configs that bundle Ollama + Open WebUI + RAG pipeline into a single deployment.
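The steps above can be sketched as shell commands (the model tag and the published port are illustrative defaults):

```shell
# 1. Install Ollama and pull a model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1

# 2. Run Open WebUI in Docker, pointed at the local Ollama API
docker run -d --name open-webui -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --add-host=host.docker.internal:host-gateway \
  ghcr.io/open-webui/open-webui:main

# 3. Open http://localhost:3000 in a browser and start chatting
```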