# haiku.rag — Agentic RAG CLI + MCP Server

> haiku.rag is an agentic RAG toolkit with a CLI, a Python API, and an MCP server; 524★ at time of verification, with README-backed `add-src`, `ask --cite`, and `serve --mcp` commands.

## Install

Merge the `mcpServers` JSON snippet from the project's README into your `.mcp.json`.

## Quick Use

```bash
pip install haiku.rag
haiku-rag add-src paper.pdf
haiku-rag search "attention mechanism"
haiku-rag ask "What datasets were used for evaluation?" --cite
haiku-rag serve --mcp --stdio
```

## Intro

haiku.rag packages document ingestion, hybrid search, and cited question answering behind three entrypoints: a CLI, a Python API, and an MCP server.

**Best for:** Teams building citation-heavy RAG with local-first LanceDB storage and agent workflows

**Works with:** Python 3.12+ plus an embedding provider (Ollama/OpenAI/etc.), as required by the README

**Setup time:** 6-15 minutes

### Key facts (verified)

- GitHub: 524 stars · 35 forks · last pushed 2026-05-13.
- License: MIT · owner avatar and repo URL verified via the GitHub API.
- README-backed entrypoint: `haiku-rag serve --mcp --stdio`.

## Main

- Start with a single PDF and verify citations (`--cite`) before scaling to directory monitoring or research agents.
- Use MCP server mode when you want assistants such as Claude Desktop to manage documents, search, and QA as tools rather than as pasted context.
- Keep provider swaps explicit: embedding and QA models are pluggable, so record which provider you used for each dataset to keep runs reproducible.

### Source-backed notes

- The README states the project is built on LanceDB, Pydantic AI, and Docling, and that it includes both CLI and Python API entrypoints.
- The README documents MCP server usage (`haiku-rag serve --mcp --stdio`) and provides a sample `mcpServers` JSON config.
- The README lists features including hybrid search, citations with page numbers/section headings, and local-first embedded LanceDB storage.

### FAQ

- **Do I need an embedding provider?** Yes — the README says you must configure one (Ollama/OpenAI/etc.) before indexing or searching.
- **Can I use it from an MCP client?** Yes — run `serve --mcp --stdio` and add it to your client config.
- **Is there a slim install?** Yes — the README mentions a `haiku.rag-slim` package plus extras; use it when you want fewer dependencies.

## Source & Thanks

> Source: https://github.com/ggozad/haiku.rag
> License: MIT
> GitHub stars: 524 · forks: 35

---

Source: https://tokrepo.com/en/workflows/haiku-rag-agentic-rag-cli-mcp-server

Author: Agent Toolkit
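The Install step points at an `mcpServers` JSON config that is not reproduced on this page. As a starting point, here is a minimal sketch built only from the `haiku-rag serve --mcp --stdio` entrypoint documented above — the server name and exact key layout are assumptions, so check the README's sample config for the authoritative version:

```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}
```

With an entry like this in `.mcp.json`, an MCP client launches haiku.rag as a stdio subprocess and exposes its document, search, and QA operations as tools.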