Quivr — Opinionated RAG Framework for Any LLM
Quivr is an opinionated RAG framework supporting any LLM, multiple file types, and customizable retrieval. 39.1K+ stars. Apache 2.0.
What it is
Quivr is a Python RAG (retrieval-augmented generation) framework that takes an opinionated approach to building knowledge bases from documents. You feed it files (PDFs, text, markdown, and more), and Quivr handles ingestion, chunking, embedding, and retrieval. Then you query your document collection in natural language using any LLM backend. The framework is designed to get a working RAG pipeline running in minutes, not days.
Quivr suits developers and teams building document Q&A systems, internal knowledge bases, or AI assistants grounded in specific documents. Its opinionated defaults mean less configuration than more modular frameworks require.
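The pipeline stages described above (ingest, chunk, embed, retrieve) can be sketched conceptually. This is a generic illustration of how any RAG pipeline works, not Quivr's actual internals; the bag-of-words "embedding" is a toy stand-in for a real embedding model.

```python
# Conceptual RAG pipeline: chunk a document, "embed" each chunk,
# then retrieve the chunks most similar to a query.
from collections import Counter
import math

def chunk(text, size=40):
    """Split text into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text):
    """Toy bag-of-words 'embedding': word -> count (stand-in for a model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("The quarterly report shows revenue grew 12 percent. "
       "Hiring slowed in Q3. The key findings cover revenue growth and hiring.")
print(retrieve("revenue growth", chunk(doc)))
```

A framework like Quivr replaces each toy stage here with a production component (document parsers, a real embedding model, a vector store) while keeping the same overall flow.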
How it saves time or tokens
Quivr's opinionated design eliminates the decision fatigue of choosing chunking strategies, embedding models, and retrieval methods; the defaults work well for most document types. Because the entire RAG pipeline runs from a few lines of code, a working setup takes minutes rather than the days of wiring that more flexible frameworks can require.
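For budgeting how much context a query will consume, a rough back-of-envelope estimate is often enough. The sketch below uses the common ~4-characters-per-token heuristic for English text; this is a generic rule of thumb, not a Quivr feature, and real counts depend on the model's tokenizer.

```python
# Rough token estimate using the ~4 chars/token heuristic for English.
# For accurate counts, use the target model's own tokenizer.
def rough_tokens(text):
    return max(1, len(text) // 4)

prompt = "What were the key findings?"
context = "Retrieved chunk: revenue grew 12 percent in Q3. " * 3
print(rough_tokens(prompt) + rough_tokens(context))
```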
How to use
- Install Quivr via pip
- Create a Brain from your document files
- Ask questions in natural language and get grounded answers
Example
from quivr_core import Brain
# Create a brain from your documents
brain = Brain.from_files(
name='my-knowledge-base',
file_paths=['./report.pdf', './notes.md']
)
# Ask questions
answer = brain.ask('What were the key findings?')
print(answer.answer)
print(answer.sources) # Shows which documents were used
Related on TokRepo
- RAG tools — Compare RAG frameworks and retrieval solutions
- AI tools for documents — Browse document processing and parsing tools
Common pitfalls
- Large PDF files with complex layouts may chunk poorly; preprocess scanned documents with OCR before ingestion
- The default embedding model requires an API key; configure a local embedding model for fully offline operation
- Quivr's opinionated defaults work well for general documents but may need tuning for highly specialized technical content
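The first pitfall above is easy to see with a generic fixed-size splitter (again an illustration, not Quivr's chunker): a chunk boundary can cut a table row in half, separating a label from its value.

```python
# Why layout-heavy documents chunk poorly: a fixed-size split can
# separate a label from its value across two chunks.
def fixed_chunks(text, size):
    return [text[i:i + size] for i in range(0, len(text), size)]

row = "Metric: annual revenue | Value: 4.2M USD"
pieces = fixed_chunks(row, 25)
print(pieces)  # ['Metric: annual revenue | ', 'Value: 4.2M USD']
```

A retriever matching "annual revenue" now only sees the first chunk, which no longer contains "4.2M USD". Chunk overlap or layout-aware preprocessing (OCR plus table extraction) mitigates this.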
Frequently Asked Questions
Which LLMs does Quivr support?
Quivr supports any LLM that provides a chat API, including OpenAI, Anthropic Claude, Mistral, and local models via Ollama. You configure the LLM backend when creating or querying a Brain.
Which file types can Quivr ingest?
Quivr handles PDFs, markdown, plain text, Word documents, and several other formats. The framework includes parsers for each type and chunks them appropriately for retrieval.
How does Quivr compare to LangChain?
LangChain is modular and requires you to wire components together manually. Quivr is opinionated and provides sensible defaults for the entire pipeline. Quivr gets you running faster; LangChain gives you more control.
Can Quivr run fully offline?
You need an API key for the LLM and embedding model by default. To run fully offline, configure Quivr to use a local LLM via Ollama and a local embedding model; this removes all external API dependencies.
What is the difference between Quivr Core and the Quivr platform?
Quivr Core is a library for single-user programmatic use. The Quivr platform (a separate project) adds multi-user support with a web UI, authentication, and shared brains.
Citations (3)
- Quivr GitHub — Opinionated RAG framework with Brain abstraction
- Quivr Documentation — Multi-file type support and customizable retrieval
- Quivr Core — Apache 2.0 open-source license
Source & Thanks
QuivrHQ/quivr — 39,100+ GitHub stars
Related Assets
NAPI-RS — Build Node.js Native Addons in Rust
Write high-performance Node.js native modules in Rust with automatic TypeScript type generation and cross-platform prebuilt binaries.
Mamba — Fast Cross-Platform Package Manager
A drop-in conda replacement written in C++ that resolves environments in seconds instead of minutes.
Plasmo — The Browser Extension Framework
Build, test, and publish browser extensions for Chrome, Firefox, and Edge using React or Vue with hot-reload and automatic manifest generation.