Scripts · May 13, 2026 · 3 min read

Local Deep Research — Privacy-First AI Research Agent

A self-hosted deep research agent that reaches roughly 95% accuracy on the SimpleQA benchmark using local or cloud LLMs, with support for 10+ search engines and encrypted, on-premise data handling.

Introduction

Local Deep Research is an open-source AI research agent that can run entirely on your own hardware, keeping your data fully private. It orchestrates local or cloud LLMs with multiple search backends to perform multi-step research tasks, producing detailed reports with citations and source verification.

What Local Deep Research Does

  • Runs iterative deep research loops using any local LLM via Ollama or llama.cpp
  • Searches across 10+ engines including Google, Brave, arXiv, PubMed, and your private documents
  • Generates structured research reports with inline citations and source links
  • Supports both local and cloud LLM providers (OpenAI, Anthropic, Google)
  • Keeps all data encrypted and on-premise with zero cloud dependency when using local models

Architecture Overview

The system uses a multi-agent pipeline: a planning agent decomposes the research question into sub-queries, a search agent fetches results from configured backends, and a synthesis agent combines findings into a coherent report. Each iteration refines the search based on gaps identified in prior rounds. The web UI is a Flask app that streams progress in real time.
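The loop is easiest to see as code. The sketch below is a minimal illustration of that plan / search / synthesize cycle, written against duck-typed agent objects; the function and method names are hypothetical, not the project's actual API.

    # Minimal sketch of the plan -> search -> synthesize loop described above.
    # The agent objects and their method names are hypothetical illustrations,
    # not Local Deep Research's real interfaces.
    def deep_research(question, planner, searcher, synthesizer, max_iterations=3):
        """Iteratively plan sub-queries, search, and synthesize until no gaps remain."""
        findings, report = [], ""
        sub_queries = planner.decompose(question)      # break the question into sub-queries
        for _ in range(max_iterations):
            for query in sub_queries:
                findings.extend(searcher.run(query))   # fetch from configured backends
            report = synthesizer.combine(question, findings)  # draft report with citations
            gaps = planner.find_gaps(question, report)        # what is still unanswered?
            if not gaps:
                break
            sub_queries = gaps                         # each gap seeds the next round
        return report

The iteration cap here corresponds to the max-iterations setting described under configuration below.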

Self-Hosting & Configuration

  • Install via pip or run with Docker using the provided docker-compose.yml
  • Configure LLM backends in the settings file (Ollama endpoint, API keys for cloud providers)
  • Add custom search engines by implementing a simple plugin interface (see the sketch after this list)
  • Set max research iterations and token budgets to control cost and runtime
  • Store results locally in SQLite or export to Markdown and PDF
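To make the plugin point concrete, here is a hedged sketch of what a custom search backend could look like. The class shape, method name, and result fields are assumptions for illustration; check the project's source for the real interface.

    # Hypothetical custom search backend; names and signatures are illustrative,
    # not the project's actual plugin interface.
    import requests

    class MySearchEngine:
        """Adapter exposing an internal HTTP search API as a research backend."""
        name = "my-internal-search"

        def __init__(self, base_url):
            self.base_url = base_url  # e.g. an internal Elasticsearch or SearXNG proxy

        def search(self, query, max_results=10):
            resp = requests.get(f"{self.base_url}/search",
                                params={"q": query, "n": max_results}, timeout=30)
            resp.raise_for_status()
            # Normalize to the title/url/snippet shape the synthesis step can cite.
            return [{"title": r["title"], "url": r["url"], "snippet": r.get("snippet", "")}
                    for r in resp.json()["results"]]

The idea is that once such a class is registered in the settings, the search agent queries it alongside the built-in engines.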

Key Features

  • Achieves ~95% accuracy on the SimpleQA benchmark with mid-size local models
  • Full offline operation when paired with Ollama and local search indices
  • Built-in document ingestion for searching your own PDFs, notes, and knowledge bases
  • Web UI with real-time streaming of research progress and intermediate findings
  • Extensible architecture supporting custom search backends and output formats

Comparison with Similar Tools

  • Perplexica — a self-hosted answer engine geared toward single-shot search answers; Local Deep Research adds iterative multi-step research and full report generation
  • GPT Researcher — built primarily around OpenAI's cloud models; this tool treats any LLM backend, local or cloud, as first-class
  • Tavily — commercial search API; Local Deep Research integrates free and self-hosted search engines
  • Khoj — broader personal AI assistant; Local Deep Research focuses specifically on deep research workflows

FAQ

Q: Can I use this without any cloud API keys? A: Yes. Pair it with Ollama for the LLM and SearXNG for web search to run fully offline.
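As a rough illustration of that setup, the snippet below shows the general shape of an offline configuration. The setting names are hypothetical (consult the project's settings reference for the real keys); the ports are the usual defaults, 11434 for Ollama's API and 8080 for a local SearXNG instance.

    # Hypothetical offline configuration; keys are illustrative, not the
    # project's actual setting names. No cloud API keys are involved.
    OFFLINE_SETTINGS = {
        "llm": {
            "provider": "ollama",
            "base_url": "http://localhost:11434",  # Ollama's default API port
            "model": "llama3:8b",                  # any model Ollama serves locally
        },
        "search": {
            "engines": ["searxng"],
            "searxng_url": "http://localhost:8080",  # common local SearXNG port
        },
    }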

Q: What hardware do I need? A: A machine that can run a 7B+ parameter model via Ollama. A GPU with 8 GB VRAM is recommended for reasonable speed.

Q: Does it support RAG over my own documents? A: Yes. Point it at a folder of documents and it indexes them as a searchable backend alongside web sources.
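For a feel of how a folder becomes a searchable backend, here is a deliberately naive sketch. A real implementation would also parse PDFs and retrieve by embeddings rather than keyword match; every name below is hypothetical.

    # Hypothetical local-document backend: scan a folder, keyword-match, and
    # return results in the same shape as web hits. Purely illustrative.
    from pathlib import Path

    class FolderSearchEngine:
        name = "local-docs"

        def __init__(self, folder):
            # Read text-like files once; a real version would also parse PDFs
            # and build a vector index instead of holding raw strings.
            self.docs = {p: p.read_text(errors="ignore")
                         for p in Path(folder).rglob("*")
                         if p.is_file() and p.suffix in {".md", ".txt"}}

        def search(self, query, max_results=10):
            hits = [{"title": p.name, "url": str(p), "snippet": text[:200]}
                    for p, text in self.docs.items()
                    if query.lower() in text.lower()]
            return hits[:max_results]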

Q: How does it compare to commercial deep research tools? A: It trades some polish for full privacy and zero ongoing cost. Research quality depends on the LLM you choose.
