RAG & Search

Best RAG (Retrieval-Augmented Generation) Tools for 2026

RAG frameworks, vector databases, embedding tools, and knowledge base builders. Ground your AI's answers in real data.

30 tools

RAG Best Practices — Production Pipeline Guide 2026

Comprehensive guide to building production RAG pipelines. Covers chunking strategies, embedding models, vector databases, retrieval techniques, evaluation, and common pitfalls with code examples.

Prompt Lab · 6 · Prompts

Claude Code Agent: Search Specialist — Build Search Systems

Claude Code agent for building search systems. Vector search, semantic retrieval, embedding strategies, and ranking optimization.

Skill Factory · 20 · Skills

Spring AI — AI Engineering for Java/Spring

Spring AI provides Spring-friendly APIs for AI apps. 8.4K+ stars. Chat, embeddings, RAG, vector DBs, function calling. Major providers. Apache 2.0.

Skill Factory · 15 · Skills

txtai — All-in-One Embeddings Database

txtai is an all-in-one embeddings database for semantic search, LLM orchestration, and language model workflows. 10.4K+ GitHub stars. Vector search + SQL + RAG pipelines. Apache 2.0.

Script Depot · 13 · Scripts

AnythingLLM — All-in-One AI Desktop with MCP

Full-stack AI desktop app with RAG, agents, MCP support, and multi-model chat. AnythingLLM manages documents, embeddings, and vector stores in one private interface.

MCP Hub · 7 · MCP Configs

Qdrant MCP — Vector Search Engine for AI Agents

MCP server for Qdrant vector database. Gives AI agents the power to store and search embeddings for RAG, semantic search, and recommendation systems. 22,000+ stars on Qdrant.

MCP Hub · 5 · MCP Configs

Together AI Embeddings & Reranking Skill for Agents

Skill that teaches Claude Code Together AI's embeddings and reranking API. Covers dense vector generation, semantic search, RAG pipelines, and result reranking patterns.

Prompt Lab · 4 · Skills

Quivr — Opinionated RAG Framework for Any LLM

Quivr is an opinionated RAG framework supporting any LLM, multiple file types, and customizable retrieval. 39.1K+ stars. Apache 2.0.

Script Depot · 24 · Scripts

Chroma — Open-Source Vector Database for AI

Chroma is the open-source vector database and data infrastructure for AI applications. 27.1K+ GitHub stars. Simple 4-function API for embedding, storing, and querying documents. Supports Python, JavaS

AI Open Source · 24 · Configs

Weaviate — Open-Source Vector Database at Scale

Weaviate is an open-source vector database for semantic search at scale. 15.9K+ GitHub stars. Hybrid search (vector + BM25), built-in RAG, reranking, multi-tenancy, and horizontal scaling. BSD 3-Claus

AI Open Source · 23 · Configs

Haystack MCP — Connect AI Pipelines to MCP Clients

Expose Haystack RAG pipelines as MCP servers. Let Claude Code and other AI tools query your document search, QA, and retrieval pipelines through the MCP protocol.

Skill Factory · 18 · MCP Configs

LlamaIndex — Data Framework for LLM Applications

Leading data framework for connecting LLMs to external data. LlamaIndex handles ingestion, indexing, retrieval, and query engines for building production RAG applications.

Prompt Lab · 16 · Workflows

AnythingLLM — All-in-One AI Knowledge Base

All-in-one AI app: chat with documents, RAG, agents, multi-user, and 30+ LLM/embedding providers. Desktop + Docker. Privacy-first, no setup needed. 57K+ stars.

Script Depot · 15 · Scripts

Qdrant — Vector Search Engine for AI Applications

High-performance open-source vector database for AI search and RAG. Qdrant offers advanced filtering, quantization, distributed deployment, and a rich Python/JS SDK.

AI Open Source · 13 · Workflows

Chroma — Open-Source Embedding Database for AI

Lightweight open-source vector database that runs anywhere. Chroma provides in-memory, local file, and client-server modes for embeddings with zero-config LangChain integration.

AI Open Source · 13 · Workflows

LangChain4j — LLM Integration for Java

LangChain4j integrates 20+ LLM providers and 30+ vector stores into Java apps. 11.4K+ stars. Unified API, RAG, MCP, Spring Boot. Apache 2.0.

Skill Factory · 11 · Skills

Langflow — Visual AI Workflow Builder

Low-code visual builder for AI workflows and RAG pipelines. Drag-and-drop components for LLMs, vector stores, tools, and agents with Python extensibility.

Agent Toolkit · 7 · Workflows

Turbopuffer MCP — Serverless Vector DB for AI Agents

MCP server for Turbopuffer serverless vector database. Sub-10ms search, zero ops, auto-scaling. Perfect for AI agent memory and RAG without managing infrastructure. 1,200+ stars.

MCP Hub · 4 · MCP Configs

LLM Wiki Memory Upgrade Prompt

One-click prompt to upgrade your AI agent memory system to Karpathy LLM Wiki pattern. Send to Claude Code / Cursor / Windsurf — auto audits, compiles fragments, resolves contradictions, builds structured wiki.

henuwangkai · 430 · Prompts · Knowledge

Jina Reader — AI-Friendly Web Content Extraction

Convert any URL to clean markdown for AI consumption. Free API at r.jina.ai strips ads, navigation, and clutter. Used by AI agents for web research and RAG.

MCP Hub · 75 · MCP Configs

CAMEL — Multi-Agent Framework at Scale

CAMEL is a multi-agent framework for studying scaling laws of AI agents. 16.6K+ GitHub stars. Up to 1M agents, RAG, memory systems, data generation. Apache 2.0.

Script Depot · 49 · Scripts

Awesome Prompt Engineering — Papers, Tools & Courses

Hand-curated collection of 60+ papers, 50+ tools, benchmarks, and courses for prompt engineering and context engineering. Covers CoT, RAG, agents, security, and multimodal. Apache 2.0.

Prompt Lab · 48 · Prompts

Claude Memory Compiler — Evolving Knowledge Base

Auto-capture Claude Code sessions into a structured knowledge base. Hooks extract decisions and lessons, compiler organizes into cross-referenced articles. No vector DB needed. 365+ stars.

Skill Factory · 48 · Skills

Onyx — Self-Hosted AI Chat with 40+ Connectors

Onyx (formerly Danswer) is a self-hosted AI chat with RAG, custom agents, and 40+ knowledge connectors. 20.4K+ stars. Enterprise search. MIT.

AI Open Source · 47 · Configs

Dify — Open-Source LLM App Development Platform

Visual platform for building AI applications with workflow orchestration, RAG pipelines, agent capabilities, and model management. Supports 100+ models. 85,000+ GitHub stars.

AI Open Source · 45 · Scripts

VoltAgent — TypeScript AI Agent Framework

Open-source TypeScript framework for building AI agents with built-in Memory, RAG, Guardrails, MCP, Voice, and Workflow support. Includes LLM observability console for debugging.

Script Depot · 44 · Scripts

LLMLingua — Compress Prompts 20x with Minimal Loss

Microsoft research tool for prompt compression. Reduce token usage up to 20x while maintaining LLM performance. Solves lost-in-the-middle for RAG. MIT, 6,000+ stars.

Script Depot · 35 · Scripts

Reactive Resume — AI-Powered Open-Source Resume Builder

Free open-source resume builder with AI integration. Supports Claude, GPT, Gemini for content generation. Drag-and-drop, PDF export, self-hostable, privacy-first. MIT, 36,000+ stars.

AI Open Source · 30 · Scripts

Awesome LLM Apps — 50+ AI App Recipes with Source Code

Curated collection of 50+ production-ready AI application examples with full source code. RAG chatbots, AI agents, multi-model apps, and more. Each recipe is a complete, runnable project. 6,000+ stars.

Prompt Lab · 28 · Prompts

Dify — Open-Source LLMOps Platform

Dify is an open-source LLMOps platform for building AI apps with visual workflows, RAG, agents, and model management. 130K+ stars. Apache 2.0.

AI Open Source · 26 · Configs

Production-Grade RAG Systems

RAG in Production

Retrieval-Augmented Generation (RAG) has moved from research prototype to production standard. Most enterprise AI applications that answer questions about internal data now use some form of RAG.

RAG Frameworks: RAGFlow, Haystack, and Kotaemon provide end-to-end pipelines for document ingestion, chunking, embedding, retrieval, and answer generation with source citations.

Vector Databases: Chroma, Milvus, Weaviate, LanceDB, and Pinecone store and retrieve document embeddings. The choice depends on scale (Milvus for billions of vectors), simplicity (Chroma for prototyping), or cost (LanceDB for serverless).

GraphRAG: Microsoft's GraphRAG and related tools build knowledge graphs from documents, enabling more accurate retrieval for complex queries that span multiple documents.

Advanced RAG Patterns: Hybrid search (combining vector similarity with keyword matching), re-ranking (using cross-encoders to improve retrieval precision), and agentic RAG (letting AI agents decide when and how to retrieve information) represent the cutting edge of production RAG systems.
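
One common way to combine vector and keyword results in a hybrid search is reciprocal rank fusion (RRF). The text above does not name a specific fusion method, so treat this as one illustrative option rather than what any listed tool does. The function name, the document ids, and the constant k=60 (a conventional default) are assumptions made for this sketch.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a vector search and a keyword (BM25-style) search.
vector_hits = ["d1", "d2", "d3"]
keyword_hits = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because RRF uses only ranks, it avoids having to normalize cosine similarities and keyword scores onto the same scale. A re-ranking step, if used, would then re-score the top of the fused list.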

RAG is the bridge between what the model knows and what your organization knows.

Frequently Asked Questions

What is RAG (Retrieval-Augmented Generation)?

RAG is a technique that gives AI models access to external knowledge by retrieving relevant documents before generating answers. Instead of relying solely on training data, the model searches your documents, finds relevant passages, and uses them to produce accurate, grounded answers with source citations. It's how companies build AI assistants that "know" their internal data.
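
The retrieve-then-generate flow can be sketched in a few lines. This is a toy illustration under stated assumptions: retrieval here is plain word overlap instead of embeddings, no model is actually called, and the document ids, texts, and function names are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda d: len(q_tokens & tokenize(d["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble a grounded prompt with numbered sources for citation."""
    sources = "\n".join(
        f"[{i}] ({p['id']}) {p['text']}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical internal documents.
docs = [
    {"id": "hr-12", "text": "Employees accrue 20 vacation days per year."},
    {"id": "it-03", "text": "VPN access requires a hardware token."},
    {"id": "hr-40", "text": "Unused vacation days expire in March."},
]

question = "How many vacation days do employees get?"
passages = retrieve(question, docs, k=2)
prompt = build_prompt(question, passages)
print(prompt)  # This string is what would be sent to the LLM.
```

In a real system the `retrieve` step would query a vector database, and the prompt would be passed to a model of your choice.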

Which vector database should I use?

For prototyping: Chroma (in-memory, zero config). For production at scale: Milvus (billions of vectors) or Weaviate (hybrid search). For serverless/embedded: LanceDB or Turso with vector extensions. For managed cloud: Pinecone. Most TokRepo RAG assets include pre-configured vector database setups you can install with one command.
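
Whichever database you pick, the core operation is the same: store vectors alongside their documents and return the nearest ones to a query vector. The sketch below is a brute-force, in-memory stand-in written for illustration only. It is not the API of any database named above, and the class name, method names, and three-dimensional vectors are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Brute-force nearest-neighbour search over an in-memory list."""

    def __init__(self):
        self._items = []  # (doc_id, vector, text) tuples

    def add(self, doc_id, vector, text):
        self._items.append((doc_id, list(vector), text))

    def query(self, vector, k=3):
        """Return the k most similar items as (score, doc_id, text)."""
        scored = [(cosine(vector, v), doc_id, text) for doc_id, v, text in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
# Hand-made vectors; real embeddings come from an embedding model.
store.add("a", [1.0, 0.0, 0.0], "refund policy")
store.add("b", [0.0, 1.0, 0.0], "shipping times")
store.add("c", [0.9, 0.1, 0.0], "return window")

top = store.query([1.0, 0.0, 0.0], k=2)
for score, doc_id, text in top:
    print(doc_id, round(score, 3), text)
```

Production databases replace the linear scan with approximate nearest-neighbour indexes and add filtering, persistence, and scaling, which is where the trade-offs in the answer above come from.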

How do I improve RAG accuracy?

Three key techniques: 1) Better chunking — split documents at semantic boundaries, not fixed character counts. 2) Hybrid retrieval — combine vector search with BM25 keyword matching. 3) Re-ranking — use a cross-encoder model to re-score retrieved chunks before sending them to the LLM. GraphRAG (building knowledge graphs) helps most for complex queries spanning multiple documents.
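
As a rough illustration of the first technique, the sketch below packs whole paragraphs into chunks instead of cutting at a fixed character offset. Paragraph breaks are only a simple proxy for semantic boundaries, and the function name, the max_chars budget, and the sample text are assumptions made for this example.

```python
def chunk_by_paragraph(text, max_chars=200):
    """Group consecutive paragraphs into chunks of at most max_chars.

    Paragraphs are never split, so a single paragraph longer than
    max_chars becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if not current or len(candidate) <= max_chars:
            current = candidate  # Paragraph still fits in the open chunk.
        else:
            chunks.append(current)  # Close the chunk at a paragraph boundary.
            current = para
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Refunds are issued within 14 days.\n\n"
    "Shipping takes 3 to 5 business days.\n\n"
    "Contact support by email for anything else."
)

chunks = chunk_by_paragraph(sample, max_chars=80)
for i, chunk in enumerate(chunks, start=1):
    print(f"--- chunk {i} ---\n{chunk}")
```

Sentence-aware or embedding-based splitters follow the same idea with finer-grained boundaries.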

Explore More Categories