# How GraphRAG Works

## Traditional RAG vs GraphRAG
| Aspect | Traditional RAG | GraphRAG |
|---|---|---|
| Indexing | Chunk text → embed → vector store | Chunk text → extract entities/relations → build graph → detect communities → summarize |
| Retrieval | Top-K similar chunks | Graph traversal + community reports |
| Strengths | Fast, simple | Multi-hop reasoning, holistic understanding |
| Weakness | Misses cross-document connections | Higher indexing cost (LLM calls) |
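The retrieval difference in the table can be sketched with toy data: traditional RAG ranks chunks by vector similarity alone, while GraphRAG can walk relationships outward from an entity. This is an illustrative sketch, not the GraphRAG API; the helper names and data are invented.

```python
from collections import deque

# Traditional RAG: rank chunks by similarity to the query embedding.
def top_k(query_vec, chunk_vecs, k=2):
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    ranked = sorted(chunk_vecs.items(), key=lambda kv: dot(query_vec, kv[1]), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]

# GraphRAG-style local retrieval: breadth-first traversal from a seed entity,
# collecting multi-hop neighbors that similarity search alone would miss.
def traverse(graph, seed, max_hops=2):
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return seen

chunks = {"c1": [1.0, 0.0], "c2": [0.7, 0.7], "c3": [0.0, 1.0]}
graph = {"Einstein": ["Princeton", "General Relativity"],
         "General Relativity": ["Spacetime"]}

print(top_k([1.0, 0.2], chunks))    # similarity only: no notion of relationships
print(traverse(graph, "Einstein"))  # multi-hop context: reaches "Spacetime" in 2 hops
```

The traversal reaches "Spacetime" even though no chunk embedding ties it to the query, which is the cross-document connection vector-only retrieval misses.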
## Indexing Pipeline
```
Documents (PDF, TXT, CSV)
    │
    ├─ 1. Text Chunking
    │      Split into overlapping chunks
    │
    ├─ 2. Entity & Relationship Extraction (LLM)
    │      "Albert Einstein" ──[worked_at]──> "Princeton"
    │      "Einstein" ──[developed]──> "General Relativity"
    │
    ├─ 3. Knowledge Graph Construction
    │      Nodes: entities with descriptions
    │      Edges: relationships with weights
    │
    ├─ 4. Community Detection (Leiden algorithm)
    │      Group related entities into clusters
    │
    └─ 5. Community Summarization (LLM)
           Generate report for each community
```

## Local Search
Answers questions about specific entities by combining:
- Entity descriptions from the knowledge graph
- Relationship context (neighboring entities)
- Relevant text chunks from source documents
- Community reports for broader context
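The four context sources above can be combined into a single prompt context. The sketch below uses toy data structures; the real GraphRAG local-search implementation differs, and all names here are illustrative.

```python
# Sketch of local-search context assembly (hypothetical helper, toy data).
def build_local_context(entity, graph, text_chunks, community_reports, max_chunks=2):
    """Combine entity description, relationships, source chunks, and a
    community report into one context string for a seed entity."""
    node = graph["nodes"][entity]
    parts = [f"Entity: {entity} ({node['description']})"]
    # Relationship context: edges touching the seed entity.
    for src, rel, dst in graph["edges"]:
        if entity in (src, dst):
            parts.append(f"Relation: {src} -[{rel}]-> {dst}")
    # Relevant source text chunks that mention the entity.
    mentions = [c for c in text_chunks if entity in c]
    parts.extend(f"Chunk: {c}" for c in mentions[:max_chunks])
    # Community report for broader context.
    parts.append(f"Community report: {community_reports[node['community']]}")
    return "\n".join(parts)

graph = {
    "nodes": {"Einstein": {"description": "Physicist", "community": 0}},
    "edges": [("Einstein", "developed", "General Relativity")],
}
chunks = ["Einstein published the photoelectric-effect paper in 1905."]
reports = {0: "Cluster of 20th-century physics entities."}
print(build_local_context("Einstein", graph, chunks, reports))
```

The assembled context is what the LLM sees when answering an entity-specific question.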
```
# Example: "What did Einstein contribute to quantum mechanics?"
# GraphRAG traverses the graph from the "Einstein" node,
# follows edges to "quantum mechanics" and "photoelectric effect",
# and retrieves relevant source chunks + community summaries
```

## Global Search
Answers holistic questions using community reports in a map-reduce pattern:
- Map: Each community report answers the question independently
- Reduce: Responses are aggregated and synthesized into a final answer
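The map-reduce pattern above reduces to a few lines. In this sketch, `ask_llm` is a stand-in for a real chat-model call, not part of the GraphRAG API; the structure, not the stub, is the point.

```python
# Map-reduce sketch of global search over community reports.
def ask_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a chat model here.
    return f"[answer derived from: {prompt[:40]}...]"

def global_search(question, community_reports):
    # Map: every community report answers the question independently.
    partial_answers = [
        ask_llm(f"Using only this report, answer '{question}':\n{report}")
        for report in community_reports
    ]
    # Reduce: synthesize the partial answers into one final response.
    combined = "\n".join(partial_answers)
    return ask_llm(f"Synthesize these partial answers to '{question}':\n{combined}")

reports = ["Report on physics entities.", "Report on biology entities."]
print(global_search("What are the major themes?", reports))
```

Because every community report participates in the map step, the answer reflects the whole corpus rather than whichever chunks happened to rank highest.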
```
# Example: "What are the major themes in this research corpus?"
# GraphRAG uses ALL community summaries to provide a comprehensive overview
# Traditional RAG would fail: no single chunk contains this information
```

## Configuration Options
```yaml
# settings.yaml
llm:
  type: openai_chat
  model: gpt-4o
  api_key: ${GRAPHRAG_API_KEY}

chunks:
  size: 1200
  overlap: 100

entity_extraction:
  max_gleanings: 1      # re-extraction passes for quality

community_reports:
  max_length: 2000      # summary length per community

snapshots:
  graphml: true         # export graph for visualization
```

## Performance Benchmarks
From Microsoft Research's evaluation:

- Comprehensiveness: GraphRAG wins 72-83% of head-to-head comparisons against naive RAG on holistic queries
- Diversity of answers: GraphRAG wins 73-82% on breadth of response
- Specific entity queries: local search is comparable to traditional RAG
- Indexing cost: roughly $5-15 per 1M tokens of input, depending on the model
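The cost range above translates directly into a back-of-the-envelope budget for a given corpus. The rates below are the figures quoted in this document, not official pricing, and the helper is illustrative.

```python
# Estimate indexing cost from corpus size using the $5-15 per 1M-token range.
def estimate_indexing_cost(corpus_tokens: int,
                           cost_per_million_low: float = 5.0,
                           cost_per_million_high: float = 15.0) -> tuple[float, float]:
    millions = corpus_tokens / 1_000_000
    return (millions * cost_per_million_low, millions * cost_per_million_high)

low, high = estimate_indexing_cost(3_500_000)  # e.g. a 3.5M-token corpus
print(f"Estimated indexing cost: ${low:.2f}-${high:.2f}")
```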
## FAQ
**Q: What is GraphRAG?**
A: GraphRAG is Microsoft Research's open-source, graph-based RAG system (31,900+ GitHub stars). It extracts knowledge graphs from documents and uses graph traversal plus community summaries for retrieval, enabling multi-hop reasoning that traditional vector RAG cannot achieve.

**Q: When should I use GraphRAG instead of regular RAG?**
A: Use GraphRAG when your questions require reasoning across multiple documents, understanding relationships between entities, or summarizing themes across a corpus. For simple factual lookup from a single document, traditional RAG is faster and cheaper.

**Q: Is GraphRAG free?**
A: Yes, it is fully open source under the MIT license. You pay only for LLM API calls during indexing and querying; indexing costs scale with corpus size.