Knowledge · Apr 2, 2026 · 3 min read

GraphRAG — Knowledge Graph RAG by Microsoft

Build knowledge graphs from documents for smarter RAG. Local and global search over entity relationships. By Microsoft Research. 31K+ stars.

TokRepo Picks · Community
Quick Use


```bash
pip install graphrag
```

```bash
# Initialize a project
graphrag init --root ./my-project

# Index your documents (builds the knowledge graph)
graphrag index --root ./my-project

# Query with local search (entity-focused)
graphrag query --root ./my-project --method local \
  "What are the main findings about transformer architectures?"

# Query with global search (holistic summary)
graphrag query --root ./my-project --method global \
  "What are the key themes across all documents?"
```

Configure your LLM provider in `settings.yaml`:

```yaml
llm:
  type: openai_chat
  model: gpt-4o
  api_key: ${GRAPHRAG_API_KEY}
```

---
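
The `${GRAPHRAG_API_KEY}` value in `settings.yaml` reads as an environment-variable placeholder. Assuming that is how it is resolved, a small preflight check can catch a missing key before an indexing run starts spending LLM calls. The sketch below is illustrative only: `missing_variables` and `render_settings` are hypothetical helpers, not part of GraphRAG, and they use Python's standard `string.Template` to mimic the substitution.

```python
import re
from string import Template

# The settings.yaml fragment from the Quick Use block.
SETTINGS_SNIPPET = """\
llm:
  type: openai_chat
  model: gpt-4o
  api_key: ${GRAPHRAG_API_KEY}
"""

# Matches ${NAME} placeholders and captures NAME.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def missing_variables(text, env):
    """Return placeholder names in `text` that have no non-empty value in `env`."""
    return [name for name in PLACEHOLDER.findall(text) if not env.get(name)]


def render_settings(text, env):
    """Expand ${VAR} placeholders; raises KeyError if a variable is undefined."""
    return Template(text).substitute(env)


if __name__ == "__main__":
    import os

    missing = missing_variables(SETTINGS_SNIPPET, os.environ)
    if missing:
        print("Set these before indexing:", ", ".join(missing))
    else:
        print("All placeholders resolve.")
```

The check reports only the variable names and never prints the key itself, so it is safe to run in shared logs.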
Intro
GraphRAG is a modular, graph-based Retrieval-Augmented Generation (RAG) system from Microsoft Research with 31,900+ GitHub stars. Traditional RAG retrieves text chunks by vector similarity alone. GraphRAG first extracts a structured knowledge graph from your documents (entities, relationships, and community structures), then uses that graph to answer questions that require connecting information across chunks.

It offers two search modes: **local search** for questions about specific entities, and **global search** for holistic questions about the entire corpus. In Microsoft Research's evaluation, GraphRAG outperformed naive RAG on comprehensiveness and diversity for holistic queries.

Works with: OpenAI GPT-4, Anthropic Claude, Azure OpenAI, any OpenAI-compatible API.

Best for: teams building RAG over large document collections that need multi-hop reasoning.

Setup time: under 10 minutes.

---
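
To make "multi-hop reasoning" concrete, here is a toy sketch of why a graph helps. It is not GraphRAG's implementation: the triples are hand-written stand-ins for what an LLM extraction step might emit, and `build_graph` and `find_path` are hypothetical helpers using plain breadth-first search. The point is only that a question linking two entities can be answered by following edges, even when no single text chunk mentions both.

```python
from collections import defaultdict, deque

# Hand-written (subject, relation, object) triples, for illustration only.
TRIPLES = [
    ("Albert Einstein", "worked_at", "Princeton"),
    ("Albert Einstein", "developed", "General Relativity"),
    ("General Relativity", "predicts", "Gravitational Waves"),
    ("LIGO", "detected", "Gravitational Waves"),
]


def build_graph(triples):
    """Index triples as an adjacency list, adding inverse edges for traversal."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj].append((f"inverse:{relation}", subject))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for hop in find_path(graph, "Albert Einstein", "LIGO"):
        print(" -> ".join(hop))
```

Connecting "Albert Einstein" to "LIGO" takes three hops through "General Relativity" and "Gravitational Waves". A top-K chunk retriever would need all of those facts to land in the retrieved chunks by similarity alone.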
## How GraphRAG Works

### Traditional RAG vs GraphRAG

| Aspect | Traditional RAG | GraphRAG |
|--------|----------------|----------|
| **Indexing** | Chunk text → embed → vector store | Chunk text → extract entities/relations → build graph → detect communities → summarize |
| **Retrieval** | Top-K similar chunks | Graph traversal + community reports |
| **Strengths** | Fast, simple | Multi-hop reasoning, holistic understanding |
| **Weaknesses** | Misses cross-document connections | Higher indexing cost (LLM calls) |

### Indexing Pipeline

```
Documents (PDF, TXT, CSV)
    │
    ├─ 1. Text Chunking
    │     Split into overlapping chunks
    │
    ├─ 2. Entity & Relationship Extraction (LLM)
    │     "Albert Einstein" ──[worked_at]──> "Princeton"
    │     "Einstein" ──[developed]──> "General Relativity"
    │
    ├─ 3. Knowledge Graph Construction
    │     Nodes: entities with descriptions
    │     Edges: relationships with weights
    │
    ├─ 4. Community Detection (Leiden algorithm)
    │     Group related entities into clusters
    │
    └─ 5. Community Summarization (LLM)
          Generate report for each community
```

### Local Search

Answers questions about specific entities by combining:

- Entity descriptions from the knowledge graph
- Relationship context (neighboring entities)
- Relevant text chunks from source documents
- Community reports for broader context

```python
# Example: "What did Einstein contribute to quantum mechanics?"
# GraphRAG traverses the graph from "Einstein" node,
# follows edges to "quantum mechanics", "photoelectric effect",
# and retrieves relevant source chunks + community summaries
```

### Global Search

Answers holistic questions using community reports in a map-reduce pattern:

1. **Map**: Each community report answers the question independently
2. **Reduce**: Responses are aggregated and synthesized into a final answer

```python
# Example: "What are the major themes in this research corpus?"
# GraphRAG uses ALL community summaries to provide a comprehensive overview
# Traditional RAG would fail: no single chunk contains this information
```

### Configuration Options

```yaml
# settings.yaml
llm:
  type: openai_chat
  model: gpt-4o
  api_key: ${GRAPHRAG_API_KEY}

chunks:
  size: 1200
  overlap: 100

entity_extraction:
  max_gleanings: 1  # Re-extraction passes for quality

community_reports:
  max_length: 2000  # Summary length per community

snapshots:
  graphml: true  # Export graph for visualization
```

### Performance Benchmarks

From Microsoft Research evaluation:

- **Comprehensiveness**: GraphRAG wins 72-83% vs naive RAG on holistic queries
- **Diversity of answers**: GraphRAG wins 73-82% on breadth of response
- **Specific entity queries**: Local search comparable to traditional RAG
- **Indexing cost**: ~$5-15 per 1M tokens of input (depends on model)

---

## FAQ

**Q: What is GraphRAG?**
A: GraphRAG is Microsoft Research's open-source graph-based RAG system with 31,900+ GitHub stars. It extracts knowledge graphs from documents and uses graph traversal + community summaries for retrieval, enabling multi-hop reasoning that traditional vector RAG cannot achieve.

**Q: When should I use GraphRAG instead of regular RAG?**
A: Use GraphRAG when your questions require reasoning across multiple documents, understanding relationships between entities, or summarizing themes across a corpus. For simple factual lookup from a single document, traditional RAG is faster and cheaper.

**Q: Is GraphRAG free?**
A: Yes, fully open-source under MIT license. You pay for LLM API calls during indexing and querying. Indexing costs scale with corpus size.

---
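
As a supplement to the Global Search section, the map-reduce pattern can be sketched in a few lines. This is a simplified illustration, not GraphRAG's code: the community reports are made up, and `map_step` uses naive word overlap as a stand-in for the LLM call that would normally answer and rate each report. `PartialAnswer`, `map_step`, `reduce_step`, and `global_search` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PartialAnswer:
    community: str
    score: int
    text: str


def map_step(question, community, report):
    """Map: evaluate one community report against the question.

    Stand-in for an LLM call; the score is the number of shared words.
    """
    question_words = set(question.lower().split())
    report_words = set(report.lower().split())
    return PartialAnswer(community, len(question_words & report_words), report)


def reduce_step(partials, top_k=2):
    """Reduce: drop irrelevant partial answers, keep the best, and combine them."""
    useful = [p for p in partials if p.score > 0]
    ranked = sorted(useful, key=lambda p: p.score, reverse=True)[:top_k]
    return "\n".join(f"[{p.community}] {p.text}" for p in ranked)


def global_search(question, reports, top_k=2):
    """Run the map step over every report, then reduce to a single answer."""
    partials = [map_step(question, name, text) for name, text in reports.items()]
    return reduce_step(partials, top_k)


if __name__ == "__main__":
    # Made-up community reports, for illustration only.
    reports = {
        "physics": "relativity and quantum themes dominate",
        "biology": "protein folding results",
        "ml": "transformer themes and scaling",
    }
    print(global_search("major themes", reports))
```

Because every report is visited in the map step, the answer can draw on the whole corpus, which is also why global search costs more per query than a top-K chunk lookup.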

Source & Thanks

> Created by [Microsoft Research](https://github.com/microsoft). Licensed under MIT.
>
> [graphrag](https://github.com/microsoft/graphrag) · ⭐ 31,900+

Thanks to Microsoft Research for advancing RAG with knowledge graph techniques.
