Configs2026年5月21日·1 分钟阅读

R2R — Production-Ready Agentic RAG System

A state-of-the-art production-ready retrieval-augmented generation system with agentic capabilities, a RESTful API, and built-in document processing, vector search, and knowledge graph support.

Agent 就绪

这个资产可以被 Agent 直接读取和安装

TokRepo 同时提供通用 CLI 命令、安装契约、metadata JSON、按适配器生成的安装计划和原始内容链接,方便 Agent 判断适配度、风险和下一步动作。

Native · 98/100策略:允许
Agent 入口
任意 MCP/CLI Agent
类型
Skill
安装
Single
信任
信任等级:Established
入口
R2R
通用 CLI 安装命令
npx tokrepo install 4d12517d-5530-11f1-9bc6-00163e2b0d79

Introduction

R2R (Reason to Retrieve) is a production-grade retrieval-augmented generation framework built by SciPhi AI. It provides everything needed to go from raw documents to an agentic RAG pipeline with a single deployable service, removing the need to stitch together separate vector databases, embedding services, and LLM orchestration layers.

What R2R Does

  • Ingests documents in 20+ formats (PDF, DOCX, HTML, Markdown, images with OCR) and chunks them automatically
  • Runs hybrid search combining vector similarity and keyword matching with reciprocal rank fusion
  • Builds and queries a knowledge graph alongside the vector index for multi-hop reasoning
  • Exposes a full RESTful API for document management, search, RAG, and agent interactions
  • Supports agentic RAG where the system plans retrieval strategies and iterates on answers

Architecture Overview

R2R runs as a containerized service with three main subsystems: an ingestion pipeline that parses, chunks, and embeds documents into PostgreSQL with pgvector; a retrieval engine that performs hybrid search and optional graph traversal; and an agentic orchestrator that chains retrieval, reasoning, and tool use. The system uses Hatchet for async task orchestration and exposes all functionality through a FastAPI-based REST interface. A Python SDK and CLI wrap the API for developer convenience.

Self-Hosting & Configuration

  • Deploy with Docker Compose for a single-command setup including PostgreSQL, pgvector, and the R2R server
  • Configure LLM and embedding providers via environment variables (supports OpenAI, Anthropic, local models)
  • Customize chunking strategy, overlap, and embedding dimensions in the TOML config
  • Enable the knowledge graph module by setting the graph provider configuration
  • Scale horizontally by adding worker instances behind the task queue

Key Features

  • End-to-end RAG in a single service: ingestion, embedding, search, generation, and agent orchestration
  • Hybrid retrieval with vector search, full-text search, and knowledge graph traversal
  • Multi-tenant architecture with user-level document permissions and access control
  • Agentic RAG mode where the system autonomously decides when and how to retrieve
  • Built-in evaluation endpoints for measuring retrieval and generation quality

Comparison with Similar Tools

  • LangChain — general-purpose LLM framework requiring assembly; R2R is an integrated, deployable RAG service
  • LlamaIndex — strong indexing library but needs external infrastructure; R2R bundles everything in one container
  • Haystack — modular pipeline framework; R2R trades modularity for faster time-to-production
  • RAGFlow — document-focused RAG engine; R2R adds agentic capabilities and knowledge graph support
  • Verba — Weaviate-based RAG UI; R2R is backend-focused with a full API and more retrieval strategies

FAQ

Q: What database does R2R use? A: PostgreSQL with the pgvector extension for vector storage and optional graph storage. Everything runs in the provided Docker Compose stack.

Q: Can I use local models instead of OpenAI? A: Yes. R2R supports any OpenAI-compatible endpoint including Ollama, vLLM, and other local inference servers.

Q: How does the knowledge graph work? A: R2R extracts entities and relationships from ingested documents and stores them in a graph structure. During retrieval, the agent can traverse the graph for multi-hop reasoning alongside vector search.

Q: Is R2R suitable for production workloads? A: Yes. It includes authentication, multi-tenancy, async task processing, and horizontal scaling support.

Sources

讨论

登录后参与讨论。
还没有评论,来写第一条吧。

相关资产