Quivr — Opinionated RAG Framework for Any LLM
Quivr is an opinionated RAG framework supporting any LLM, multiple file types, and customizable retrieval. 39.1K+ stars. Apache 2.0.
Safe staging for this asset
This asset is staged first. The copied prompt tells the agent to inspect the staged files and ask before activating scripts, MCP config, or global config.
npx -y tokrepo@latest install 96223597-08c2-4e60-b84e-0c4779641933 --target codex
Stages files first; activation requires review of the staged README and plan.
What it is
Quivr is a Python RAG (retrieval-augmented generation) framework that takes an opinionated approach to building knowledge bases from documents. You feed it files (PDFs, text, markdown, and more), and Quivr handles ingestion, chunking, embedding, and retrieval. Then you query your document collection in natural language using any LLM backend. The framework is designed to get a working RAG pipeline running in minutes, not days.
Developers and teams who need to build document Q&A systems, internal knowledge bases, or AI assistants grounded in specific documents benefit from Quivr. Its opinionated defaults mean less configuration compared to more modular frameworks.
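To make the ingestion, chunking, embedding, and retrieval stages concrete, here is a minimal standard-library sketch of the general technique. It is not Quivr's implementation: the function names (chunk_text, embed, cosine, retrieve), the chunk sizes, and the bag-of-words "embedding" are all assumptions chosen for illustration, and a real pipeline would use a learned embedding model and a vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


# Made-up documents standing in for parsed files.
documents = {
    "report.txt": "Quarterly revenue grew twelve percent. "
                  "The key findings point to strong demand in Europe.",
    "notes.txt": "Meeting notes: the team agreed to ship the parser "
                 "rewrite next sprint.",
}

# Ingest: chunk every document and remember which file each chunk came from.
index = [(name, chunk) for name, text in documents.items()
         for chunk in chunk_text(text)]

# Retrieve: find the chunk that best matches the question.
top = retrieve("What were the key findings?", [chunk for _, chunk in index], top_k=1)
print(top[0])
```

In a full RAG pipeline the retrieved chunks would then be passed to the LLM as context for the answer; that final generation step is omitted here.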
How it saves time or tokens
Quivr's opinionated design removes the need to choose chunking strategies, embedding models, and retrieval methods yourself. The defaults work well for most document types. Because the entire RAG pipeline fits in a few lines of code, Quivr can save the days of setup that more flexible frameworks require. The estimated token cost of this workflow is approximately 337 tokens for a basic query.
How to use
- Install Quivr via pip (the package is published as quivr-core)
- Create a Brain from your document files
- Ask questions in natural language and get grounded answers
Example
from quivr_core import Brain

# Create a brain from your documents
brain = Brain.from_files(
    name='my-knowledge-base',
    file_paths=['./report.pdf', './notes.md'],
)

# Ask questions
answer = brain.ask('What were the key findings?')
print(answer.answer)
print(answer.sources)  # Shows which documents were used
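The shape of the response object may differ between library versions. As an assumption not confirmed by this page, the sources might sit directly on the response or under a metadata attribute. The sketch below shows one defensive way to read both layouts. The helper name describe_response and the stand-in objects are hypothetical; in real use you would pass it the result of brain.ask(...).

```python
from types import SimpleNamespace


def describe_response(response):
    """Return (answer_text, sources) from a response-like object.

    Assumption: sources may be on the response itself or under
    response.metadata, depending on the installed version.
    """
    answer_text = getattr(response, "answer", "")
    sources = getattr(response, "sources", None)
    if sources is None:
        metadata = getattr(response, "metadata", None)
        sources = getattr(metadata, "sources", None)
    return answer_text, list(sources or [])


# Stand-in objects for illustration only; they are not real Quivr responses.
flat = SimpleNamespace(answer="Revenue grew.", sources=["report.pdf"])
nested = SimpleNamespace(
    answer="Revenue grew.",
    metadata=SimpleNamespace(sources=["report.pdf"]),
)

for response in (flat, nested):
    text, sources = describe_response(response)
    print(text, sources)
```

Reading attributes with getattr keeps the calling code from raising AttributeError if the field you expect is missing.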
Related on TokRepo
- RAG tools — Compare RAG frameworks and retrieval solutions
- AI tools for documents — Browse document processing and parsing tools
Common pitfalls
- Large PDF files with complex layouts may chunk poorly; preprocess scanned documents with OCR before ingestion
- The default embedding model requires an API key; configure a local embedding model for fully offline operation
- Quivr's opinionated defaults work well for general documents but may need tuning for highly specialized technical content
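One way to act on the first pitfall is to triage files before ingestion. The following is a minimal sketch, not part of Quivr: the triage function, the suffix set, and the 20 MiB threshold are assumptions chosen for illustration. It only flags large PDFs for manual review (for example, to check whether they need OCR); it does not inspect PDF contents.

```python
from pathlib import Path

# Hypothetical values for illustration; adjust to what your setup accepts.
SUPPORTED_SUFFIXES = {".pdf", ".md", ".txt", ".docx"}
LARGE_PDF_BYTES = 20 * 1024 * 1024  # arbitrary 20 MiB review threshold


def triage(paths, large_pdf_bytes=LARGE_PDF_BYTES):
    """Sort paths into 'ready', 'review' (large PDFs), and 'skipped' buckets."""
    buckets = {"ready": [], "review": [], "skipped": []}
    for raw in paths:
        path = Path(raw)
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in SUPPORTED_SUFFIXES:
            # Missing files and unsupported types are left out of ingestion.
            buckets["skipped"].append(str(path))
        elif suffix == ".pdf" and path.stat().st_size > large_pdf_bytes:
            # Large PDFs are more likely to be scans or complex layouts.
            buckets["review"].append(str(path))
        else:
            buckets["ready"].append(str(path))
    return buckets


print(triage(["./report.pdf", "./notes.md"]))
```

Only the paths in the "ready" bucket would then be passed on as file_paths; the "review" bucket is a prompt to look at those files by hand first.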
Frequently Asked Questions
Which LLMs does Quivr support?
Quivr supports any LLM that provides a chat API, including OpenAI, Anthropic Claude, Mistral, and local models via Ollama. You configure the LLM backend when creating or querying a Brain.
Which file types does Quivr handle?
Quivr handles PDFs, markdown, plain text, Word documents, and several other formats. The framework includes parsers for each type and chunks them appropriately for retrieval.
How does Quivr compare to LangChain?
LangChain is modular and requires you to wire components together manually. Quivr is opinionated and provides sensible defaults for the entire pipeline. Quivr gets you running faster; LangChain gives you more control.
Can Quivr run fully offline?
By default you need an API key for the LLM and the embedding model. To run fully offline, configure Quivr to use a local LLM via Ollama and a local embedding model. This removes all external API dependencies.
What is the difference between Quivr Core and the Quivr platform?
Quivr Core is a library for single-user programmatic use. The Quivr platform (a separate project) adds multi-user support with a web UI, authentication, and shared brains.
Source & Thanks
QuivrHQ/quivr — 39,100+ GitHub stars