Main
- Start with one PDF and verify citations (`--cite`) before scaling to directory monitoring or research agents.
- Use the MCP server mode when you want assistants like Claude Desktop to manage documents, search, and QA as tools rather than pasted context.
- Keep provider swaps explicit: embeddings and QA models are pluggable; document which provider you used for each dataset to make runs reproducible.
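The first step above could look like the following dry-run sketch. The `add` and `ask` subcommand names are assumptions for illustration (only the `--cite` flag is confirmed by the source), so the commands are printed rather than executed:

```shell
# Dry-run helper: print each command instead of running it,
# since haiku-rag may not be installed in this environment.
run() { echo "+ $*"; }

# Hypothetical subcommand names; only --cite appears in the README notes.
run haiku-rag add paper.pdf                                  # index a single PDF
run haiku-rag ask "What method does the paper use?" --cite   # answer with verifiable citations
```

Once the cited answers check out against the PDF, the same flow can be pointed at a directory or wired into an agent.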
Source-backed notes
- README states it is built on LanceDB, Pydantic AI, and Docling, and includes both CLI and Python API entrypoints.
- README documents MCP server usage: `haiku-rag serve --mcp --stdio` and a sample `mcpServers` JSON config.
- README lists multiple features, including hybrid search, citations with page numbers/section headings, and local-first embedded LanceDB storage.
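A sample `mcpServers` entry might look like this sketch; the server key is an assumption, and only the `haiku-rag serve --mcp --stdio` invocation comes from the source:

```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}
```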
FAQ
- Do I need an embedding provider?: Yes — README says you must configure one (Ollama/OpenAI/etc.) before indexing/searching.
- Can I use it from an MCP client?: Yes — run `haiku-rag serve --mcp --stdio` and add it to your client config.
- Is there a slim install?: Yes — README mentions `haiku.rag-slim` plus extras; use it when you want fewer dependencies.