RAG & Search

Best AI Tools for RAG & Retrieval (2026)

Retrieval-Augmented Generation frameworks, vector databases, embedding tools, and knowledge base builders. Ground your AI in real data.

30 tools

RAG Best Practices — Production Pipeline Guide 2026

Comprehensive guide to building production RAG pipelines. Covers chunking strategies, embedding models, vector databases, retrieval techniques, evaluation, and common pitfalls with code examples.

Prompt Lab 47Prompts

txtai — All-in-One Embeddings Database

txtai is an all-in-one embeddings database for semantic search, LLM orchestration, and language model workflows. 10.4K+ GitHub stars. Vector search + SQL + RAG pipelines. Apache 2.0.

Script Depot 63Scripts

Together AI Embeddings & Reranking Skill for Agents

Skill that teaches Claude Code Together AI's embeddings and reranking API. Covers dense vector generation, semantic search, RAG pipelines, and result reranking patterns.

Prompt Lab 61Skills

Spring AI — AI Engineering for Java/Spring

Spring AI provides Spring-friendly APIs for AI apps. 8.4K+ stars. Chat, embeddings, RAG, vector DBs, function calling. Major providers. Apache 2.0.

Skill Factory 60Skills

Claude Code Agent: Search Specialist — Build Search Systems

Claude Code agent for building search systems. Vector search, semantic retrieval, embedding strategies, and ranking optimization.

Skill Factory 50Skills

AnythingLLM — All-in-One AI Desktop with MCP

Full-stack AI desktop app with RAG, agents, MCP support, and multi-model chat. AnythingLLM manages documents, embeddings, and vector stores in one private interface.

MCP Hub 47MCP Configs

Qdrant MCP — Vector Search Engine for AI Agents

MCP server for Qdrant vector database. Gives AI agents the power to store and search embeddings for RAG, semantic search, and recommendation systems. 22,000+ stars on Qdrant.

MCP Hub 40MCP Configs

Supabase — The Open Source Firebase Alternative

Supabase is an open-source backend platform built on Postgres. It provides a complete backend — database, authentication, real-time subscriptions, storage, edge functions, and vector embeddings — with instant APIs and a generous free tier.

AI Open Source 31Configs

Quivr — Opinionated RAG Framework for Any LLM

Quivr is an opinionated RAG framework supporting any LLM, multiple file types, and customizable retrieval. 39.1K+ stars. Apache 2.0.

Script Depot 81Scripts

Haystack MCP — Connect AI Pipelines to MCP Clients

Expose Haystack RAG pipelines as MCP servers. Let Claude Code and other AI tools query your document search, QA, and retrieval pipelines through the MCP protocol.

Skill Factory 62MCP Configs

Weaviate — Open-Source Vector Database at Scale

Weaviate is an open-source vector database for semantic search at scale. 15.9K+ GitHub stars. Hybrid search (vector + BM25), built-in RAG, reranking, multi-tenancy, and horizontal scaling. BSD 3-Claus

AI Open Source 62Configs

Chroma — Open-Source Vector Database for AI

Chroma is the open-source vector database and data infrastructure for AI applications. 27.1K+ GitHub stars. Simple 4-function API for embedding, storing, and querying documents. Supports Python, JavaS

AI Open Source 56Configs

Cohere Embed — Multilingual AI Embeddings API

Generate high-quality multilingual embeddings for search and RAG. Cohere Embed v3 supports 100+ languages with specialized modes for documents, queries, and classification.

Prompt Lab 54Prompts

LlamaIndex — Data Framework for LLM Applications

Leading data framework for connecting LLMs to external data. LlamaIndex handles ingestion, indexing, retrieval, and query engines for building production RAG applications.

Prompt Lab 49Workflows

LangChain4j — LLM Integration for Java

LangChain4j integrates 20+ LLM providers and 30+ vector stores into Java apps. 11.4K+ stars. Unified API, RAG, MCP, Spring Boot. Apache 2.0.

Skill Factory 49Skills

Langflow — Visual AI Workflow Builder

Low-code visual builder for AI workflows and RAG pipelines. Drag-and-drop components for LLMs, vector stores, tools, and agents with Python extensibility.

Agent Toolkit 48Workflows

Verba — The Golden RAGtriever by Weaviate

Verba is an open-source RAG (Retrieval-Augmented Generation) chatbot from the Weaviate team. Drop in PDFs, web pages, or notes; pick a model (OpenAI, Ollama, Anthropic); and get a polished chat UI with semantic search built in.

AI Open Source 46Configs

PostgreSQL — The Most Advanced Open Source Relational Database

PostgreSQL is the most powerful open-source relational database system. It combines SQL compliance, extensibility, and reliability with advanced features like JSONB, full-text search, vector embeddings (pgvector), and PostGIS — making it the database of choice for modern applications.

AI Open Source 44Configs

Turbopuffer MCP — Serverless Vector DB for AI Agents

MCP server for Turbopuffer serverless vector database. Sub-10ms search, zero ops, auto-scaling. Perfect for AI agent memory and RAG without managing infrastructure. 1,200+ stars.

MCP Hub 35MCP Configs

pgvector — Vector Similarity Search Inside PostgreSQL

A PostgreSQL extension that adds a native `vector` type, HNSW and IVFFlat indexes, and distance operators so semantic search, RAG and recommendation workloads can reuse the same database as the rest of the app.

Script Depot 26Scripts

LLM Wiki Memory Upgrade Prompt

One-click prompt to upgrade your AI agent memory system to Karpathy LLM Wiki pattern. Send to Claude Code / Cursor / Windsurf — auto audits, compiles fragments, resolves contradictions, builds structured wiki.

henuwangkai 560PromptsKnowledge

Jina Reader — AI-Friendly Web Content Extraction

Convert any URL to clean markdown for AI consumption. Free API at r.jina.ai strips ads, navigation, and clutter. Used by AI agents for web research and RAG.

MCP Hub 275MCP Configs

CAMEL — Multi-Agent Framework at Scale

CAMEL is a multi-agent framework for studying scaling laws of AI agents. 16.6K+ GitHub stars. Up to 1M agents, RAG, memory systems, data generation. Apache 2.0.

Script Depot 167Scripts

Awesome Prompt Engineering — Papers, Tools & Courses

Hand-curated collection of 60+ papers, 50+ tools, benchmarks, and courses for prompt engineering and context engineering. Covers CoT, RAG, agents, security, and multimodal. Apache 2.0.

Prompt Lab 154Prompts

Claude Memory Compiler — Evolving Knowledge Base

Auto-capture Claude Code sessions into a structured knowledge base. Hooks extract decisions and lessons, compiler organizes into cross-referenced articles. No vector DB needed. 365+ stars.

Skill Factory 121Skills

Onyx — Self-Hosted AI Chat with 40+ Connectors

Onyx (formerly Danswer) is a self-hosted AI chat with RAG, custom agents, and 40+ knowledge connectors. 20.4K+ stars. Enterprise search. MIT.

AI Open Source 97Configs

VoltAgent — TypeScript AI Agent Framework

Open-source TypeScript framework for building AI agents with built-in Memory, RAG, Guardrails, MCP, Voice, and Workflow support. Includes LLM observability console for debugging.

Script Depot 90Scripts

Appwrite — Open-Source Backend for AI Apps

Complete cloud backend with auth, database, storage, functions, and messaging in one platform. Self-hostable. 55K+ GitHub stars.

TokRepo Curated 82MCP Configs

LLMLingua — Compress Prompts 20x with Minimal Loss

Microsoft research tool for prompt compression. Reduce token usage up to 20x while maintaining LLM performance. Solves lost-in-the-middle for RAG. MIT, 6,000+ stars.

Script Depot 81Scripts

Reactive Resume — AI-Powered Open-Source Resume Builder

Free open-source resume builder with AI integration. Supports Claude, GPT, Gemini for content generation. Drag-and-drop, PDF export, self-hostable, privacy-first. MIT, 36,000+ stars.

AI Open Source 77Scripts

RAG in Production


Retrieval-Augmented Generation (RAG) has moved from research prototype to production standard. Most enterprise AI applications that answer questions about internal data rely on some form of RAG.

RAG Frameworks: RAGFlow, Haystack, and Kotaemon provide end-to-end pipelines for document ingestion, chunking, embedding, retrieval, and answer generation with source citations.

Vector Databases: Chroma, Milvus, Weaviate, LanceDB, and Pinecone store and retrieve document embeddings. The choice depends on scale (Milvus for billions of vectors), simplicity (Chroma for prototyping), or cost (LanceDB for serverless).

GraphRAG: Microsoft's GraphRAG and related tools build knowledge graphs from documents, enabling more accurate retrieval for complex queries that span multiple documents.
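To make the role of a vector database concrete, here is a minimal sketch of storing embeddings and retrieving the nearest ones by cosine similarity. `TinyVectorStore` and the three-dimensional vectors are hypothetical and used only for illustration; the databases named above rely on approximate nearest-neighbor indexes rather than the full scan shown here, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store that scans every row on each query."""

    def __init__(self):
        self._rows = []  # list of (doc_id, vector) pairs

    def add(self, doc_id, vector):
        self._rows.append((doc_id, list(vector)))

    def query(self, vector, k=3):
        # Score every stored vector, then keep the k most similar.
        scored = [(cosine(vector, v), doc_id) for doc_id, v in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:k]]


# Made-up embeddings standing in for the output of an embedding model.
store = TinyVectorStore()
store.add("pricing", [0.9, 0.1, 0.0])
store.add("security", [0.0, 0.2, 0.9])
store.add("billing", [0.8, 0.3, 0.1])

top = store.query([1.0, 0.0, 0.0], k=2)
```

Here `top` holds the two documents whose vectors point in the direction closest to the query. The full scan is fine for a few thousand vectors; beyond that, an indexed store is the usual choice.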

Advanced RAG Patterns: Hybrid search (combining vector similarity with keyword matching), re-ranking (using cross-encoders to improve retrieval precision), and agentic RAG (letting AI agents decide when and how to retrieve information) represent the cutting edge of production RAG systems.
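One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The page does not say which fusion method any listed tool uses, so treat this as an illustrative sketch: the document IDs are made up, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by multiple retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a vector search and a BM25 keyword search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

RRF needs only ranks, not raw scores, which sidesteps the problem that cosine similarities and BM25 scores live on different scales. A re-ranking step, if used, would then re-score the top of `fused` with a cross-encoder.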

RAG is the bridge between what the model knows and what your organization knows.

Frequently asked questions

What is RAG (Retrieval-Augmented Generation)?

RAG is a technique that gives AI models access to external knowledge by retrieving relevant documents before generating answers. Instead of relying solely on training data, the model searches your documents, finds relevant passages, and uses them to produce accurate, grounded answers with source citations. It's how companies build AI assistants that "know" their internal data.
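The retrieve-then-generate flow can be sketched in a few lines. This is a toy under stated assumptions: word overlap stands in for embedding similarity, the file names and texts are invented, and the final call to a language model is left out, so the sketch stops at the prompt that would be sent.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k (source, text) pairs sharing the most words with the question."""
    question_words = tokenize(question)
    scored = []
    for source, text in documents.items():
        overlap = len(question_words & tokenize(text))
        if overlap:
            scored.append((overlap, source, text))
    scored.sort(key=lambda row: row[0], reverse=True)
    return [(source, text) for _, source, text in scored[:k]]


def build_prompt(question, passages):
    """Number each passage so the model can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered passages and cite them like [1].\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical internal documents.
docs = {
    "handbook.md": "Employees accrue 20 vacation days per year.",
    "security.md": "Laptops must use full disk encryption.",
    "travel.md": "Book flights through the travel portal.",
}

question = "How many vacation days do employees get per year?"
passages = retrieve(question, docs, k=1)
prompt = build_prompt(question, passages)
```

Numbering the passages in the prompt is what makes source citations possible: the model can refer to `[1]`, and the application maps that back to `handbook.md`.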

Which vector database should I use?

For prototyping: Chroma (in-memory, zero config). For production at scale: Milvus (billions of vectors) or Weaviate (hybrid search). For serverless/embedded: LanceDB or Turso with vector extensions. For managed cloud: Pinecone. Most TokRepo RAG assets include pre-configured vector database setups you can install with one command.

How do I improve RAG accuracy?

Three key techniques: 1) Better chunking — split documents at semantic boundaries, not fixed character counts. 2) Hybrid retrieval — combine vector search with BM25 keyword matching. 3) Re-ranking — use a cross-encoder model to re-score retrieved chunks before sending them to the LLM. GraphRAG (building knowledge graphs) helps most for complex queries spanning multiple documents.
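As a sketch of the first technique, the function below splits text at paragraph breaks and packs whole paragraphs into chunks under a size budget, instead of cutting at a fixed character offset. Treating blank lines as semantic boundaries is an assumption that suits prose documents; `chunk_by_paragraph` and `max_chars` are hypothetical names, and the sample text is invented.

```python
def chunk_by_paragraph(text, max_chars=200):
    """Group whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept intact as its own
    chunk rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


text = (
    "Refunds are issued within 14 days.\n\n"
    "Contact support to start a refund.\n\n"
    "Shipping takes 3 to 5 business days."
)
chunks = chunk_by_paragraph(text, max_chars=80)
```

With this input the two refund paragraphs stay together in one chunk and the shipping paragraph lands in its own, so a question about refunds retrieves both related sentences at once.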

Explore related categories