Introduction
Local Deep Research is an open-source AI research agent that can run entirely on your own hardware, keeping your data fully private. It orchestrates LLMs — local or cloud-hosted — together with multiple search backends to perform multi-step research tasks, producing detailed reports with citations and source verification.
What Local Deep Research Does
- Runs iterative deep research loops using any local LLM via Ollama or llama.cpp
- Searches across 10+ engines including Google, Brave, arXiv, PubMed, and your private documents
- Generates structured research reports with inline citations and source links
- Supports both local and cloud LLM providers (OpenAI, Anthropic, Google)
- Keeps all data encrypted and on-premise with zero cloud dependency when using local models
Architecture Overview
The system uses a multi-agent pipeline: a planning agent decomposes the research question into sub-queries, a search agent fetches results from configured backends, and a synthesis agent combines findings into a coherent report. Each iteration refines the search based on gaps identified in prior rounds. The web UI is a Flask app that streams progress in real time.
Self-Hosting & Configuration
- Install via pip or run with Docker using the provided docker-compose.yml
- Configure LLM backends in the settings file (Ollama endpoint, API keys for cloud providers)
- Add custom search engines by implementing a simple plugin interface
- Set max research iterations and token budgets to control cost and runtime
- Store results locally in SQLite or export to Markdown and PDF
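A custom search engine plugin, as mentioned above, might look like the following. The base class and method names are assumptions for illustration; consult the project's documentation for the real plugin interface.

```python
# Hedged sketch of a custom search-engine plugin. The SearchEngine base
# class and its result schema are hypothetical, not the project's API.

from abc import ABC, abstractmethod

class SearchEngine(ABC):
    """Hypothetical base class a custom backend would implement."""

    @abstractmethod
    def search(self, query: str, max_results: int = 10) -> list[dict]:
        """Return results as dicts with 'title', 'url', and 'snippet' keys."""

class MyWikiSearch(SearchEngine):
    """Example backend over an in-memory corpus (illustrative only)."""

    def __init__(self, pages: dict[str, str]):
        self.pages = pages

    def search(self, query, max_results=10):
        hits = [
            {"title": title, "url": f"wiki://{title}", "snippet": text[:80]}
            for title, text in self.pages.items()
            if query.lower() in text.lower()
        ]
        return hits[:max_results]

engine = MyWikiSearch({"Batteries": "Solid-state batteries replace liquid electrolytes."})
results = engine.search("solid-state")
```

The key design point is that a backend only needs to map a query string to a uniform result schema; the rest of the pipeline treats web engines and private indices identically.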
Key Features
- Reports ~95% accuracy on the SimpleQA benchmark when paired with mid-size local models
- Full offline operation when paired with Ollama and local search indices
- Built-in document ingestion for searching your own PDFs, notes, and knowledge bases
- Web UI with real-time streaming of research progress and intermediate findings
- Extensible architecture supporting custom search backends and output formats
Comparison with Similar Tools
- Perplexica — similar concept but requires cloud LLMs; Local Deep Research runs fully offline
- GPT Researcher — cloud-only approach using OpenAI; this tool supports any LLM backend
- Tavily — commercial search API; Local Deep Research integrates free and self-hosted search engines
- Khoj — broader personal AI assistant; Local Deep Research focuses specifically on deep research workflows
FAQ
Q: Can I use this without any cloud API keys? A: Yes. Pair it with Ollama for the LLM and SearXNG for web search to run fully offline.
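For the fully offline setup described in this answer, the configuration might look roughly like the fragment below. The key names and default ports are assumptions for illustration; match them against the project's actual settings file.

```python
# Hypothetical settings fragment for a fully offline deployment:
# a local Ollama model for the LLM and a self-hosted SearXNG instance
# for web search. Key names and ports are illustrative assumptions.

settings = {
    "llm": {
        "provider": "ollama",
        "base_url": "http://localhost:11434",  # Ollama's default port
        "model": "llama3.1:8b",
    },
    "search": {
        "engines": ["searxng", "local_documents"],
        "searxng_url": "http://localhost:8080",  # assumed SearXNG port
    },
    "research": {
        "max_iterations": 3,     # caps runtime
        "token_budget": 50_000,  # caps per-run LLM usage
    },
}
```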
Q: What hardware do I need? A: A machine that can run a 7B+ parameter model via Ollama. A GPU with 8 GB VRAM is recommended for reasonable speed.
Q: Does it support RAG over my own documents? A: Yes. Point it at a folder of documents and it indexes them as a searchable backend alongside web sources.
Q: How does it compare to commercial deep research tools? A: It trades some polish for full privacy and zero ongoing cost. Research quality depends on the LLM you choose.