Khoj — Your AI Second Brain
Khoj is a personal AI app for chat, search, and knowledge management. 33.8K+ stars. Multi-LLM, docs, Obsidian, WhatsApp, custom agents. AGPL-3.0.
What it is
Khoj is an open-source personal AI application for chat, search, and knowledge management. It connects to your documents, notes, and files to provide AI-powered search and conversation grounded in your personal knowledge base.
Khoj supports multiple LLM providers (OpenAI, Anthropic, local models) and integrates with Obsidian and other note-taking tools. You can reach it through the web UI or WhatsApp, and you can define custom agents on top of it. It aims to be your AI second brain.
How it saves time or tokens
Khoj indexes your personal documents and notes, so you can ask questions about your own knowledge base without manually searching through files. The AI retrieves relevant context from your documents before generating answers, reducing hallucination.
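The retrieve-then-answer flow described above can be illustrated with a small, self-contained sketch. This is not Khoj's implementation: it swaps in a toy bag-of-words cosine similarity where a real system would typically use embeddings, and the function names (`retrieve`, `build_prompt`) and the sample notes are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, notes: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the top_k notes most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(body))), name)
        for name, body in notes.items()
    ]
    # Highest score first; drop notes that share no terms with the query.
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


def build_prompt(query: str, notes: dict[str, str], top_k: int = 2) -> str:
    """Prepend the retrieved notes to the question as grounding context."""
    names = retrieve(query, notes, top_k)
    context = "\n\n".join(f"[{name}]\n{notes[name]}" for name in names)
    return f"Answer using only these notes:\n\n{context}\n\nQuestion: {query}"


# Hypothetical notes, for illustration only.
NOTES = {
    "planning/q1-roadmap.md": "Q1 product roadmap: ship search filters and improve onboarding.",
    "journal/2024-01-05.md": "Went hiking and tried a new pasta recipe.",
    "meetings/standup.md": "Standup notes: roadmap review moved to Thursday.",
}

if __name__ == "__main__":
    print(build_prompt("What is on the Q1 product roadmap?", NOTES))
```

Because the prompt contains only the notes that matched the question, the model has less room to invent details, which is the sense in which retrieval reduces hallucination.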
By running locally or self-hosted, Khoj keeps your personal data private and avoids per-query costs for document retrieval operations.
Khoj's documentation and active community also cut the time developers spend troubleshooting integrations. AI coding assistants that generate code against Khoj can draw on the patterns in that documentation, which tends to yield working implementations in fewer iterations and with fewer tokens.
How to use
- Install Khoj:
pip install khoj
khoj --anonymous-mode
- Open the web UI at http://localhost:42110 and configure your data sources (Obsidian vault, Markdown files, PDFs, org files).
- Chat with your documents:
You: What were my notes about the Q1 product roadmap?
Khoj: Based on your notes in planning/q1-roadmap.md, the key priorities were...
- Use the Obsidian plugin for in-editor AI search and chat.
Example
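# Khoj API for programmatic access
import requests

# Endpoint, method, and response shape follow the original example and may
# differ between Khoj versions; replace the placeholder with your own API key.
response = requests.get(
    'http://localhost:42110/api/chat',
    params={'q': 'What meetings do I have this week?'},
    headers={'Authorization': 'Bearer your-api-key'},
    timeout=60,
)
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
print(response.json()['response'])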
Related on TokRepo
- AI Tools for Knowledge Graph — Knowledge management and graph tools
- AI Tools for Research — AI-powered research and search tools
Common pitfalls
- Not re-indexing after adding new documents. Khoj needs to index your files to search them. Set up automatic re-indexing or manually trigger it after adding content.
- Using Khoj with too many large files without chunking configuration. Adjust chunk size and overlap settings for optimal retrieval quality on large document collections.
- Expecting Khoj to replace a full RAG pipeline. Khoj is designed for personal knowledge management. For enterprise document search, consider dedicated RAG solutions.
- Failing to review community discussions and changelogs before upgrading. Breaking changes in major versions can disrupt existing workflows. Pin versions in production and test upgrades in staging first.
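Two of the pitfalls above, stale indexes and missing chunking configuration, can be made concrete with a short sketch. It is a generic illustration under stated assumptions and does not use Khoj's own settings or API: `chunk_text`, `fingerprint`, and `files_needing_reindex` are hypothetical helpers, chunks are measured in words rather than tokens, and file contents are passed in as strings instead of being read from disk.

```python
import hashlib


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks; neighbouring chunks share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def fingerprint(content: str) -> str:
    """Stable SHA-256 hash of a file's content, used to detect changes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def files_needing_reindex(current: dict[str, str], indexed: dict[str, str]) -> list[str]:
    """Return paths that are new or changed since the last indexing run.

    current maps path -> file content now.
    indexed maps path -> fingerprint recorded when the file was last indexed.
    """
    return sorted(
        path for path, content in current.items()
        if indexed.get(path) != fingerprint(content)
    )


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)

    last_index = {"a.md": fingerprint("old text")}
    now = {"a.md": "old text", "b.md": "a brand new note"}
    print(files_needing_reindex(now, last_index))
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of indexing some words twice. Comparing fingerprints gives a cheap way to decide when a re-index is actually needed.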
Frequently Asked Questions
What file formats does Khoj support?
Khoj supports Markdown, org-mode, PDF, plaintext, and Obsidian vault formats. It indexes the content and makes it searchable through AI chat. Additional format support can be added through plugins.
Can Khoj run locally?
Yes. Khoj runs on your own machine after `pip install khoj`. It can use local LLMs via Ollama for fully offline, private AI search. In that setup, no data leaves your machine.
Does Khoj work with Obsidian?
Yes. Khoj provides an Obsidian plugin that adds AI search and chat directly in the Obsidian editor. It indexes your vault and lets you ask questions about your notes without leaving Obsidian.
Which LLM providers does Khoj support?
Khoj supports OpenAI (GPT-4), Anthropic (Claude), Google (Gemini), and local models via Ollama. You can switch providers in the settings. Local models keep all processing on your machine.
Can I use Khoj from WhatsApp?
Yes. Khoj supports WhatsApp integration, so you can chat with your knowledge base from your phone. Send a message to your Khoj WhatsApp number and get AI-powered answers grounded in your documents.
Source & Thanks
khoj-ai/khoj — 33,800+ GitHub stars