Skills · March 31, 2026 · 1 min read

RAGFlow — Deep Document Understanding RAG Engine

Open-source RAG engine with deep document understanding. Parses complex PDFs, tables, images. Agent-powered Q&A with citations. Multi-model. 77K+ stars.

Script Depot · Community

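Agent ready

Directly installable by agents

This asset is installable. The agent first selects the current runtime, checks the install plan, and then runs the matching command.

Native · 98/100 · Policy: allow

Agent entry point: any MCP/CLI agent

Type: Skill

Install: Single

Trust level: Established

Entry: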

RAGFlow — Deep Document Understanding RAG Engine

Direct install command

npx -y tokrepo@latest install 7785d7a8-fc57-42ab-ba6b-4a970404fadc --target codex

Do a dry run first to confirm the install plan, then run this command.

TL;DR

RAGFlow parses complex documents (PDFs, tables, images) and powers RAG Q&A with citations.

§01

What it is

RAGFlow is an open-source retrieval-augmented generation engine that specializes in deep document understanding. It parses complex PDFs, tables, images, and structured documents with higher fidelity than generic text extractors. The engine powers agent-based Q&A with inline citations.

The project targets teams building knowledge bases, document search systems, or AI assistants that need accurate answers grounded in technical documentation, research papers, or enterprise documents.

§02

How it saves time or tokens

Generic RAG pipelines struggle with tables, multi-column layouts, and embedded images. RAGFlow's document understanding layer extracts structured content correctly, reducing hallucinations caused by garbled input. Better parsing means fewer tokens wasted on noise and more accurate retrieval.

§03

How to use

Deploy RAGFlow using Docker: clone the repository and run docker compose up -d from its docker/ directory.
Upload your documents through the web UI or API.
Query your knowledge base via the built-in chat interface or REST API.

§04

Example

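# Deploy RAGFlow with Docker (the compose files live in the docker/ directory)
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/docker
docker compose -f docker-compose.yml up -d

# Open the web UI at http://localhost (served on port 80 with the default settings)
# Upload PDFs, Word docs, or Excel files
# Ask questions and get answers with citations

# API usage (the API server listens on port 9380 by default).
# Create an API key and a chat assistant in the UI first, then substitute both below.
curl -X POST http://localhost:9380/api/v1/chats/your-chat-id/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{"question": "What are the key findings?", "stream": false}'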
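
The curl call above inlines the question into a JSON string, which breaks as soon as the question contains a double quote or a backslash. A minimal sketch of one way to guard against that follows. The helper name json_escape is hypothetical and not part of RAGFlow, and it only handles quotes and backslashes, not newlines or other control characters.

```shell
# Hypothetical helper (not part of RAGFlow): escape a string so it can be
# placed between double quotes inside a JSON body.
# Handles backslashes and double quotes only; control characters are not covered.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}
```

The escaped value can then be spliced into the request body, for example by passing "$(json_escape "$question")" where the literal question text appears in the -d argument.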

§05

Related on TokRepo

AI Tools for RAG -- compare RAG frameworks and engines
AI Tools for Documents -- document processing and understanding tools

§06

Common pitfalls

RAGFlow requires significant resources for document parsing. Allocate at least 4 CPU cores and 16 GB of RAM for the Docker deployment; the project README lists these, along with 50 GB of disk, as minimums.
Document parsing quality depends on the parser configuration. Test with representative samples before bulk uploading.
The web UI is functional but basic. For production use, integrate via the REST API and build your own frontend.
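
A preflight check before starting the containers can catch the resource pitfall early. The sketch below is an illustration, not something shipped with RAGFlow: the function names are hypothetical, the thresholds are passed as arguments so they can match whatever minimum you settle on, and the host probes assume Linux (nproc and /proc/meminfo).

```shell
# Hypothetical preflight helper; not part of RAGFlow.
# Usage: meets_minimum HAVE_CORES HAVE_MEM_GB MIN_CORES MIN_MEM_GB
# Prints "ok" and returns 0 when both minimums are met, otherwise
# prints the first shortfall and returns 1.
meets_minimum() {
  if [ "$1" -lt "$3" ]; then
    echo "insufficient CPU: $1 cores, need $3"
    return 1
  fi
  if [ "$2" -lt "$4" ]; then
    echo "insufficient RAM: $2 GB, need $4 GB"
    return 1
  fi
  echo "ok"
}

# Linux-only probes for the current host (assumption: /proc/meminfo exists).
host_cores() { nproc; }
# MemTotal is reported in kB; round to the nearest whole GiB.
host_mem_gb() { awk '/^MemTotal:/ { printf "%d\n", ($2 / 1048576) + 0.5 }' /proc/meminfo; }
```

A typical call would pass "$(host_cores)" and "$(host_mem_gb)" as the first two arguments, followed by your chosen minimums, and only run docker compose when the function succeeds.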

FAQ

What document formats does RAGFlow support?

RAGFlow supports PDF, Word (docx), Excel (xlsx), PowerPoint (pptx), plain text, HTML, and markdown. Its strength is in complex PDFs with tables, multi-column layouts, and embedded images.
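
When preparing a folder for upload, it can help to skip files whose type is not in the list above. The following is a rough sketch: is_supported_format is a hypothetical helper, and its extension list is an assumption derived from the formats named in this answer rather than read from RAGFlow itself.

```shell
# Hypothetical helper (not part of RAGFlow): succeeds when the file name ends
# in an extension matching one of the formats listed in the answer above.
is_supported_format() {
  # Lowercase the name so "Report.PDF" matches as well.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.pdf|*.docx|*.xlsx|*.pptx|*.txt|*.html|*.htm|*.md) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a loop over a documents directory, calling is_supported_format "$f" for each file and skipping the ones that fail keeps unsupported files out of a bulk upload.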

How does RAGFlow handle tables in PDFs?

RAGFlow uses specialized table detection and extraction models to identify table boundaries, row/column structure, and cell content. Extracted tables are stored as structured data rather than flattened text, improving retrieval accuracy.

Does RAGFlow support multiple LLM providers?

Yes. RAGFlow supports OpenAI, Anthropic, local models via Ollama, and other providers. You configure the LLM backend in the system settings.

Can I use RAGFlow without Docker?

Docker is the recommended and easiest deployment method. Manual installation is possible but requires setting up Elasticsearch, Redis, MinIO, and the RAGFlow application separately.

How does RAGFlow compare to LlamaIndex or LangChain RAG?

LlamaIndex and LangChain provide RAG pipeline libraries where you assemble components. RAGFlow is a complete RAG application with built-in document parsing, vector storage, and a web UI. It is closer to a turnkey solution than a toolkit.

Source and acknowledgments

Created by InfiniFlow. Licensed under Apache 2.0. Repository: infiniflow/ragflow (77,000+ GitHub stars).
