Knowledge · Apr 2, 2026 · 3 min read

GraphRAG — Knowledge Graph RAG by Microsoft

Build knowledge graphs from documents for smarter RAG. Local and global search over entity relationships. By Microsoft Research. 31K+ stars.

Introduction

GraphRAG is a modular, graph-based Retrieval-Augmented Generation system from Microsoft Research with 31,900+ GitHub stars. Traditional RAG retrieves text chunks by vector similarity alone. GraphRAG instead first extracts a structured knowledge graph from your documents (entities, relationships, and community structures), then uses that graph to answer questions that span multiple passages. It offers two search modes: local search for questions about specific entities, and global search for holistic questions about the entire corpus. In Microsoft Research's evaluation, GraphRAG beat naive RAG on comprehensiveness and diversity for holistic queries, while local search was comparable to traditional RAG on specific-entity queries.

Works with: OpenAI GPT-4, Anthropic Claude, Azure OpenAI, any OpenAI-compatible API. Best for teams building RAG over large document collections that need multi-hop reasoning. Setup time: under 10 minutes.


How GraphRAG Works

Traditional RAG vs GraphRAG

| Aspect | Traditional RAG | GraphRAG |
|---|---|---|
| Indexing | Chunk text → embed → vector store | Chunk text → extract entities/relations → build graph → detect communities → summarize |
| Retrieval | Top-K similar chunks | Graph traversal + community reports |
| Strengths | Fast, simple | Multi-hop reasoning, holistic understanding |
| Weaknesses | Misses cross-document connections | Higher indexing cost (LLM calls) |

Indexing Pipeline

Documents (PDF, TXT, CSV)
    │
    ├─ 1. Text Chunking
    │     Split into overlapping chunks
    │
    ├─ 2. Entity & Relationship Extraction (LLM)
    │     "Albert Einstein" ──[worked_at]──> "Princeton"
    │     "Einstein" ──[developed]──> "General Relativity"
    │
    ├─ 3. Knowledge Graph Construction
    │     Nodes: entities with descriptions
    │     Edges: relationships with weights
    │
    ├─ 4. Community Detection (Leiden algorithm)
    │     Group related entities into clusters
    │
    └─ 5. Community Summarization (LLM)
          Generate report for each community
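
Steps 3 and 4 of the pipeline can be illustrated with a small, dependency-free sketch. It is not GraphRAG's implementation: the triples are hypothetical stand-ins for LLM extraction output, `build_graph` and `connected_components` are names invented for this example, and plain connected components are swapped in for the Leiden algorithm to keep the code short.

```python
from collections import defaultdict

# Hypothetical (source, relation, target) triples, standing in for LLM output.
TRIPLES = [
    ("Albert Einstein", "worked_at", "Princeton"),
    ("Albert Einstein", "developed", "General Relativity"),
    ("Niels Bohr", "worked_at", "Copenhagen"),
]


def build_graph(triples):
    """Build an undirected adjacency map; weight counts how often a pair co-occurs."""
    graph = defaultdict(dict)
    for source, relation, target in triples:
        edge = graph[source].setdefault(target, {"relations": set(), "weight": 0})
        edge["relations"].add(relation)
        edge["weight"] += 1
        graph[target][source] = edge  # both directions share one edge record
    return graph


def connected_components(graph):
    """Group nodes into components (a simple stand-in for Leiden communities)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node])  # visit every neighbor of this node
        components.append(component)
    return components


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for community_id, members in enumerate(connected_components(graph)):
        print(community_id, sorted(members))
```

Leiden optimizes modularity and can split a densely connected graph into several communities, so on real corpora it produces finer clusters than this stand-in would.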

Local Search

Answers questions about specific entities by combining:

  • Entity descriptions from the knowledge graph
  • Relationship context (neighboring entities)
  • Relevant text chunks from source documents
  • Community reports for broader context

Example: for "What did Einstein contribute to quantum mechanics?", GraphRAG starts at the "Einstein" node, follows edges to related nodes such as "quantum mechanics" and "photoelectric effect", and retrieves the relevant source chunks and community summaries.
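
As a rough illustration of how the four ingredients listed above might be combined into one prompt context, here is a minimal sketch. The `Entity` dataclass and `build_local_context` are hypothetical names, not part of the GraphRAG API, and the sketch assumes the starting entity has already been matched to the question.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical record for one graph node (not GraphRAG's own schema)."""
    name: str
    description: str
    neighbors: dict = field(default_factory=dict)  # neighbor name -> relation label
    chunk_ids: list = field(default_factory=list)  # ids of source chunks mentioning it
    community_id: int = 0


def build_local_context(entity_name, entities, chunks, community_reports, max_neighbors=5):
    """Assemble entity, relationship, chunk, and community text into one context string."""
    entity = entities[entity_name]
    lines = [f"Entity: {entity.name} ({entity.description})"]
    for neighbor, relation in list(entity.neighbors.items())[:max_neighbors]:
        lines.append(f"Relationship: {entity.name} -[{relation}]-> {neighbor}")
    for chunk_id in entity.chunk_ids:
        lines.append(f"Source: {chunks[chunk_id]}")
    lines.append(f"Community report: {community_reports[entity.community_id]}")
    return "\n".join(lines)


if __name__ == "__main__":
    entities = {
        "Einstein": Entity(
            name="Einstein",
            description="Theoretical physicist",
            neighbors={"photoelectric effect": "explained"},
            chunk_ids=[0],
            community_id=1,
        )
    }
    chunks = {0: "Einstein's 1905 paper explained the photoelectric effect."}
    reports = {1: "This community covers early quantum theory."}
    print(build_local_context("Einstein", entities, chunks, reports))
```

The resulting string would be passed to the LLM together with the user's question. A production system would also need to rank and truncate each section to fit a token budget.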

Global Search

Answers holistic questions using community reports in a map-reduce pattern:

  1. Map: Each community report answers the question independently
  2. Reduce: Responses are aggregated and synthesized into a final answer

Example: for "What are the major themes in this research corpus?", GraphRAG draws on all community summaries to build a comprehensive overview. Top-K chunk retrieval struggles with questions like this because no single chunk contains the answer.
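
The map-reduce pattern can be sketched in a few lines. This is an illustration under stated assumptions, not GraphRAG's code: `global_search`, `stub_map`, and `stub_reduce` are hypothetical names, and the two stubs use word overlap and string joining where a real system would make LLM calls.

```python
def global_search(question, community_reports, map_fn, reduce_fn, min_score=1):
    """Map each report to a scored partial answer, then reduce the useful ones."""
    partial_answers = []
    for report in community_reports:
        answer, score = map_fn(question, report)  # map: one call per community
        if score >= min_score:  # drop reports that did not help
            partial_answers.append((score, answer))
    # Highest-scoring partial answers go first into the reduce step.
    partial_answers.sort(key=lambda pair: pair[0], reverse=True)
    return reduce_fn(question, [answer for _, answer in partial_answers])


def stub_map(question, report):
    """Stand-in for an LLM call: score a report by words shared with the question."""
    shared = set(question.lower().split()) & set(report.lower().split())
    return f"From report: {report}", len(shared)


def stub_reduce(question, ranked_answers):
    """Stand-in for the final LLM synthesis: join the partial answers."""
    return "\n".join(ranked_answers)


if __name__ == "__main__":
    reports = [
        "graph methods in retrieval",
        "cooking recipes",
        "retrieval evaluation themes",
    ]
    print(global_search("major themes in retrieval", reports, stub_map, stub_reduce))
```

Because every community report is visited in the map step, query cost grows with the number of communities, which is one reason global search is more expensive than local search.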

Configuration Options

# settings.yaml
llm:
  type: openai_chat
  model: gpt-4o
  api_key: ${GRAPHRAG_API_KEY}

chunks:
  size: 1200
  overlap: 100

entity_extraction:
  max_gleanings: 1          # Re-extraction passes for quality

community_reports:
  max_length: 2000           # Summary length per community

snapshots:
  graphml: true              # Export graph for visualization
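
To make the `chunks` settings concrete, the sketch below shows how `size` and `overlap` interact, assuming both are measured in tokens. `chunk_tokens` is a hypothetical helper written for this example, and a list of integers stands in for real tokenizer output.

```python
def chunk_tokens(tokens, size=1200, overlap=100):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    stride = size - overlap  # each new chunk starts this many tokens later
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


if __name__ == "__main__":
    document = list(range(10_000))  # stand-in for 10,000 tokens
    chunks = chunk_tokens(document)
    print(len(chunks), "chunks; second chunk starts at token", chunks[1][0])
```

With these settings the stride is 1,100 tokens, so a 10,000-token document yields 9 chunks. Smaller chunks mean more LLM extraction calls, which raises indexing cost.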

Performance Benchmarks

From Microsoft Research evaluation:

  • Comprehensiveness: GraphRAG wins 72-83% vs naive RAG on holistic queries
  • Diversity of answers: GraphRAG wins 73-82% on breadth of response
  • Specific entity queries: Local search comparable to traditional RAG
  • Indexing cost: ~$5-15 per 1M tokens of input (depends on model)
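
For budgeting, the indexing-cost range quoted above can be turned into a quick estimate. This is plain arithmetic on the article's ~$5-15 per 1M tokens figure. `estimate_indexing_cost` is a hypothetical helper, and actual costs depend on the model and settings used.

```python
def estimate_indexing_cost(corpus_tokens, low_per_million=5.0, high_per_million=15.0):
    """Return a (low, high) USD estimate from a per-million-token price range."""
    millions = corpus_tokens / 1_000_000
    return millions * low_per_million, millions * high_per_million


if __name__ == "__main__":
    low, high = estimate_indexing_cost(2_500_000)
    print(f"Estimated indexing cost: ${low:.2f} to ${high:.2f}")
```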

FAQ

Q: What is GraphRAG? A: GraphRAG is Microsoft Research's open-source, graph-based RAG system with 31,900+ GitHub stars. It extracts knowledge graphs from documents and retrieves with graph traversal plus community summaries, which supports multi-hop reasoning that chunk-only vector RAG handles poorly.

Q: When should I use GraphRAG instead of regular RAG? A: Use GraphRAG when your questions require reasoning across multiple documents, understanding relationships between entities, or summarizing themes across a corpus. For simple factual lookup from a single document, traditional RAG is faster and cheaper.

Q: Is GraphRAG free? A: Yes, the code is fully open source under the MIT license. You pay only for the LLM API calls made during indexing and querying, and indexing cost scales with corpus size.



Source and Acknowledgments

Created by Microsoft Research. Licensed under MIT.

graphrag — ⭐ 31,900+

Thanks to Microsoft Research for advancing RAG with knowledge graph techniques.
