RAG & Search

Best RAG (Retrieval-Augmented Generation) Tools for 2026

RAG frameworks, vector databases, embedding tools, and knowledge base builders. Ground your AI's answers in real data.

30 tools

RAG Best Practices — Production Pipeline Guide 2026

Comprehensive guide to building production RAG pipelines. Covers chunking strategies, embedding models, vector databases, retrieval techniques, evaluation, and common pitfalls with code examples.

Prompt Lab 397Prompts

Marqo — Tensor Search Engine for AI-Powered Retrieval

An end-to-end vector search engine that handles embedding generation, storage, and retrieval in a single service for text and image search.

AI Open Source 117Configs

LLMWare — Unified Framework for Enterprise RAG Pipelines

Build retrieval-augmented generation workflows with small specialized models, parsing, embeddings, and vector search in one framework.

AI Open Source 40Configs

TurboVec — High-Performance Rust Vector Index with Python Bindings

A vector similarity search index built on TurboQuant quantization, written in Rust with first-class Python bindings for embedding-based retrieval and RAG workloads.

AI Open Source 17Configs

AnythingLLM — All-in-One AI Desktop with MCP

Full-stack AI desktop app with RAG, agents, MCP support, and multi-model chat. AnythingLLM manages documents, embeddings, and vector stores in one private interface.

MCP Hub 458MCP Configs

Together AI Embeddings & Reranking Skill for Agents

Skill that teaches Claude Code Together AI's embeddings and reranking API. Covers dense vector generation, semantic search, RAG pipelines, and result reranking patterns.

Together AI 388Skills

Qdrant MCP — Vector Search Engine for AI Agents

MCP server for Qdrant vector database. Gives AI agents the power to store and search embeddings for RAG, semantic search, and recommendation systems. 22,000+ stars on Qdrant.

MCP Hub 377MCP Configs

Claude Code Agent: Search Specialist — Build Search Systems

Claude Code agent for building search systems. Vector search, semantic retrieval, embedding strategies, and ranking optimization.

Skill Factory 329Skills

Supabase — The Open Source Firebase Alternative

Supabase is an open-source backend platform built on Postgres. It provides a complete backend — database, authentication, real-time subscriptions, storage, edge functions, and vector embeddings — with instant APIs and a generous free tier.

Supabase 292Skills

Embedding Drift Monitoring — Retrieval Regression Runbook

Embedding drift monitoring runbook for RAG and agent search. Uses golden queries, recall@K, rank delta, and rollback gates.

henuwangkai 246Knowledge

R2R — Production-Ready Agentic RAG System

A state-of-the-art production-ready retrieval-augmented generation system with agentic capabilities, a RESTful API, and built-in document processing, vector search, and knowledge graph support.

AI Open Source 224Configs

PageIndex — Document Index for Reasoning-Based RAG

A document indexing system that enables vectorless retrieval-augmented generation by building structured page-level indexes for LLM reasoning.

AI Open Source 207Skills

AutoRAG — Automated RAG Pipeline Optimization

An open-source AutoML-style framework for evaluating and optimizing retrieval-augmented generation pipelines by automatically testing combinations of chunking, embedding, retrieval, and generation strategies.

AI Open Source 169Configs

Verba — The Golden RAGtriever by Weaviate

Verba is an open-source RAG (Retrieval-Augmented Generation) chatbot from the Weaviate team. Drop in PDFs, web pages, or notes; pick a model (OpenAI, Ollama, Anthropic); and get a polished chat UI with semantic search built in.

AI Open Source 439Skills

Cherry Studio Knowledge Base — Local RAG with 50+ Formats

Cherry Studio Knowledge Base ingests PDFs, Office docs, Markdown into a local vector index. Query offline, BYOK any LLM. Data stays on your machine.

Cherry Studio 428Knowledge

Chroma — Open-Source Vector Database for AI

Chroma is the open-source vector database and data infrastructure for AI applications. 27.1K+ GitHub stars. Simple 4-function API for embedding, storing, and querying documents. Supports Python, JavaS

AI Open Source 406Skills

Weaviate — Open-Source Vector Database at Scale

Weaviate is an open-source vector database for semantic search at scale. 15.9K+ GitHub stars. Hybrid search (vector + BM25), built-in RAG, reranking, multi-tenancy, and horizontal scaling. BSD 3-Claus

AI Open Source 400Skills

Quivr — Opinionated RAG Framework for Any LLM

Quivr is an opinionated RAG framework supporting any LLM, multiple file types, and customizable retrieval. 39.1K+ stars. Apache 2.0.

Script Depot 391Scripts

Langflow — Visual AI Workflow Builder

Low-code visual builder for AI workflows and RAG pipelines. Drag-and-drop components for LLMs, vector stores, tools, and agents with Python extensibility.

Agent Toolkit 379Skills

Memvid — Serverless Memory Layer for AI Agents

An open-source memory system that replaces complex RAG pipelines with a single-file, serverless memory layer providing instant retrieval and long-term storage for AI agents.

Script Depot 363Skills

MaxKB — Self-Hosted AI Knowledge Base with RAG

MaxKB is an open-source knowledge base platform that combines document management with retrieval-augmented generation, letting teams build AI-powered Q&A systems over their own documents without sending data to third parties.

AI Open Source 351Configs

Turbopuffer MCP — Serverless Vector DB for AI Agents

MCP server for Turbopuffer serverless vector database. Sub-10ms search, zero ops, auto-scaling. Perfect for AI agent memory and RAG without managing infrastructure. 1,200+ stars.

MCP Hub 349MCP Configs

PostgreSQL — The Most Advanced Open Source Relational Database

PostgreSQL is the most powerful open-source relational database system. It combines SQL compliance, extensibility, and reliability with advanced features like JSONB, full-text search, vector embeddings (pgvector), and PostGIS — making it the database of choice for modern applications.

AI Open Source 345Skills

Haystack MCP — Connect AI Pipelines to MCP Clients

Expose Haystack RAG pipelines as MCP servers. Let Claude Code and other AI tools query your document search, QA, and retrieval pipelines through the MCP protocol.

Skill Factory 341MCP Configs

pgvector — Vector Similarity Search Inside PostgreSQL

A PostgreSQL extension that adds a native `vector` type, HNSW and IVFFlat indexes, and distance operators so semantic search, RAG and recommendation workloads can reuse the same database as the rest of the app.

Script Depot 335Skills

LlamaIndex — Data Framework for LLM Applications

Leading data framework for connecting LLMs to external data. LlamaIndex handles ingestion, indexing, retrieval, and query engines for building production RAG applications.

Prompt Lab 325Skills

Cohere Rerank — Boost RAG Accuracy with Rerank-3

Cohere Rerank scores candidates against a query using a cross-encoder. Drop into any RAG to boost top-1 hit rate by 30-50% over vector search alone.

Cohere 299Skills

LightRAG — Graph-Enhanced Retrieval-Augmented Generation

LightRAG integrates knowledge graphs into the RAG pipeline, enabling both low-level entity retrieval and high-level thematic search for more accurate and context-rich LLM responses.

Script Depot 244Skills

CocoIndex — Incremental Data Indexing Engine for AI Agents

CocoIndex is an open-source framework for building incremental data indexing pipelines. It keeps embeddings and knowledge graphs in sync with source data using change-data-capture, enabling always-fresh context for AI agents and RAG applications.

Script Depot 192Skills

nano-graphrag — Lightweight GraphRAG Implementation

A simple, hackable implementation of Microsoft GraphRAG that builds knowledge graphs from documents and uses graph-based retrieval for more accurate LLM question answering.

AI Open Source 184Configs

Production-Grade RAG Systems

RAG in Production

Retrieval-Augmented Generation (RAG) has moved from research prototype to production standard. Most enterprise AI applications that answer questions about internal data use some form of RAG.

RAG Frameworks: RAGFlow, Haystack, and Kotaemon provide end-to-end pipelines for document ingestion, chunking, embedding, retrieval, and answer generation with source citations.

Vector Databases: Chroma, Milvus, Weaviate, LanceDB, and Pinecone store and retrieve document embeddings. The choice depends on scale (Milvus for billions of vectors), simplicity (Chroma for prototyping), or cost (LanceDB for serverless).

GraphRAG: Microsoft's GraphRAG and related tools build knowledge graphs from documents, enabling more accurate retrieval for complex queries that span multiple documents.

Advanced RAG Patterns: Hybrid search (combining vector similarity with keyword matching), re-ranking (using cross-encoders to improve retrieval precision), and agentic RAG (letting AI agents decide when and how to retrieve information) represent the cutting edge of production RAG systems.
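One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: the document IDs and the two input lists are made up, and RRF is one possible fusion method, not necessarily what any tool listed here uses.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (an assumption
    here, tune it for your data).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a vector search and a keyword (BM25-style) search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever are still kept as candidates for a later re-ranking step.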

RAG is the bridge between what the model knows and what your organization knows.

Frequently Asked Questions

What is RAG (Retrieval-Augmented Generation)?

RAG is a technique that gives AI models access to external knowledge by retrieving relevant documents before generating answers. Instead of relying solely on training data, the model searches your documents, finds relevant passages, and uses them to produce accurate, grounded answers with source citations. It's how companies build AI assistants that "know" their internal data.
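The retrieve-then-generate flow can be sketched in a few lines. This is a toy illustration under stated assumptions: `embed` is a word-count stand-in for a real embedding model, the passages are invented, and the final call to an LLM is left out, so the sketch stops at the grounded prompt.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, passages, top_k=2):
    """Return the top_k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, passages):
    """Number the retrieved passages so the model can cite its sources."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered sources below and cite them.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

# Hypothetical internal documents.
passages = [
    "Refunds are processed within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Return requests must include the original order number.",
]
query = "How long do refunds take?"
top = retrieve(query, passages, top_k=2)
prompt = build_prompt(query, top)
print(prompt)
```

In a real system the word-count vectors would be replaced by a learned embedding model and a vector database, and `prompt` would be sent to the LLM of your choice.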

Which vector database should I use?

For prototyping: Chroma (in-memory, zero config). For production at scale: Milvus (billions of vectors) or Weaviate (hybrid search). For serverless/embedded: LanceDB or Turso with vector extensions. For managed cloud: Pinecone. Most TokRepo RAG assets include pre-configured vector database setups you can install with one command.

How do I improve RAG accuracy?

Three key techniques: 1) Better chunking — split documents at semantic boundaries, not fixed character counts. 2) Hybrid retrieval — combine vector search with BM25 keyword matching. 3) Re-ranking — use a cross-encoder model to re-score retrieved chunks before sending them to the LLM. GraphRAG (building knowledge graphs) helps most for complex queries spanning multiple documents.
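As a rough illustration of the first technique, the sketch below packs whole paragraphs into chunks instead of cutting at a fixed character count. It assumes blank lines mark paragraph boundaries, which is only one simple notion of a semantic boundary; the function name and the `max_chars` limit are hypothetical.

```python
def chunk_by_paragraph(text, max_chars=500):
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A paragraph longer than max_chars is kept whole as its own chunk
    rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk here.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks

doc = "Intro paragraph.\n\nSecond paragraph with details.\n\nThird paragraph."
chunks = chunk_by_paragraph(doc, max_chars=50)
for chunk in chunks:
    print(repr(chunk))
```

Production pipelines often go further, for example by splitting on headings or sentences and adding overlap between chunks, but the principle of never cutting inside a unit of meaning is the same.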