NebulaGraph — Distributed Open-Source Graph Database
Horizontally scalable graph database for storing and querying billions of vertices and trillions of edges with millisecond-level latency.
What it is
NebulaGraph is an open-source, distributed graph database that scales horizontally. It is designed to store and query graphs with billions of vertices and trillions of edges at millisecond-level latency.
NebulaGraph targets teams building knowledge graphs, social networks, fraud detection systems, recommendation engines, and any application where relationship-heavy data structures outperform relational tables.
The project is open source and actively maintained, and it suits both individual developers and teams adding a graph store to an existing toolchain. Documentation and community support are available for onboarding.
How it saves time or tokens
NebulaGraph's nGQL query language expresses multi-hop graph traversals in a few lines that would require complex recursive SQL joins in relational databases. Its shared-nothing architecture scales horizontally by adding nodes, so you avoid the vertical scaling ceiling of single-server graph databases.
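As a rough illustration of the multi-hop point, the statements below assume a `follows` edge type and string vertex IDs such as 'alice' and 'bob' (the same names used in the Example section), and NebulaGraph 3.x syntax. They are a sketch, not a benchmark of the SQL comparison.

```ngql
# Everyone reachable from 'alice' in one to three hops over follows edges
GO 1 TO 3 STEPS FROM 'alice' OVER follows YIELD DISTINCT dst(edge) AS reachable;
# Traverse the same edges backwards to list who follows 'bob'
GO FROM 'bob' OVER follows REVERSELY YIELD src(edge) AS follower;
```

Changing the hop range or the direction is a one-word edit, which is where the brevity compared with recursive joins comes from.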
For teams evaluating several graph databases, the documentation and active community shorten the time spent on research and troubleshooting. A Docker Compose deployment on one machine is enough to start experimenting.
How to use
- Deploy NebulaGraph using Docker Compose for development or Kubernetes Helm charts for production.
- Create a graph space, then define tag (vertex type) and edge type schemas.
- Insert vertices and edges using nGQL statements.
- Query with graph traversal patterns (GO, MATCH, LOOKUP) via the NebulaGraph Console or SDK.
Example
# Create a graph space
CREATE SPACE social_network(partition_num=10, replica_factor=1, vid_type=FIXED_STRING(32));
# Schema changes apply asynchronously: wait about two heartbeat cycles
# (roughly 20 seconds with default settings) before using the new space
USE social_network;
# Define schemas
CREATE TAG person(name string, age int);
CREATE EDGE follows(since datetime);
# Wait again here so the new tag and edge type are visible before inserting
# Insert data
INSERT VERTEX person(name, age) VALUES 'alice':('Alice', 30);
INSERT VERTEX person(name, age) VALUES 'bob':('Bob', 25);
INSERT EDGE follows(since) VALUES 'alice'->'bob':(datetime('2026-01-01T00:00:00'));
# Find who Alice follows
GO FROM 'alice' OVER follows YIELD dst(edge) AS friend;
# Vertices exactly two hops from Alice (friends of friends)
GO 2 STEPS FROM 'alice' OVER follows YIELD dst(edge) AS fof;
Related on TokRepo
- AI Tools for Knowledge Graph — Graph databases and knowledge graph tools.
- AI Tools for Database — Compare NebulaGraph with other database solutions.
Common pitfalls
- Using NebulaGraph for simple key-value lookups where a relational or document database would suffice. Graph databases excel at relationship traversals, not flat record retrieval.
- Not planning your partition strategy. NebulaGraph distributes data across partitions, and the partition count (partition_num) is fixed when the space is created, so choose it up front. Poor partitioning leads to hot spots and uneven query performance.
- Skipping index creation for property lookups. LOOKUP statements depend on an index and return an error without one. Create indexes on frequently queried properties, and rebuild an index if data was inserted before the index was created.
- Not reading the changelog before upgrading. Breaking changes between versions can cause unexpected failures in production. Pin your version and review release notes.
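The index pitfall above can be sketched as follows, assuming the `person` tag from the Example section and NebulaGraph 3.x syntax. The index name `person_name_index` and the indexed length of 20 are hypothetical choices for illustration.

```ngql
# Index the name property; string properties need an explicit index length
CREATE TAG INDEX IF NOT EXISTS person_name_index ON person(name(20));
# Index creation is asynchronous: wait for it to take effect, then rebuild
# so that vertices inserted before the index existed are covered
REBUILD TAG INDEX person_name_index;
# Property lookup served by the index
LOOKUP ON person WHERE person.name == 'Alice' YIELD id(vertex) AS vid;
```

Each index adds work to every write, so it is usually worth indexing only the properties that queries actually filter on.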
Frequently Asked Questions
Compared with Neo4j, NebulaGraph is distributed and horizontally scalable by design and targets trillion-edge graphs. Neo4j Community Edition runs as a single instance; clustering requires Enterprise Edition. NebulaGraph uses nGQL; Neo4j uses Cypher. Choose NebulaGraph for massive scale, and Neo4j for smaller graphs with a more mature ecosystem.
NebulaGraph uses nGQL (Nebula Graph Query Language) and also supports a subset of openCypher. nGQL provides GO-based traversal syntax optimized for distributed execution.
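To make the two styles concrete, here is the same one-hop question written both ways. This is a sketch that assumes the `person` tag and `follows` edge type from the Example section and NebulaGraph 3.x syntax.

```ngql
# Native nGQL: start from a vertex ID and walk follows edges
GO FROM 'alice' OVER follows YIELD dst(edge) AS friend;
# openCypher-style pattern for the same question
MATCH (a:person)-[:follows]->(b:person) WHERE id(a) == 'alice' RETURN id(b) AS friend;
```

Note that nGQL uses `==` for equality, including inside MATCH statements.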
NebulaGraph can run on a single machine for development and testing. Docker Compose deploys all components (graphd, storaged, metad) on one machine. For production, distribute the components across multiple machines for fault tolerance and performance.
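Once the services are up, a quick sanity check from the NebulaGraph Console might look like the sketch below. It assumes you are already connected to a graphd instance.

```ngql
# Storage hosts, with leader and partition distribution per host
SHOW HOSTS;
# Graph (query) service instances
SHOW HOSTS GRAPH;
# Meta service instances
SHOW HOSTS META;
```

On a single-machine deployment each statement should list services on that one machine. In production the same statements show whether partitions and leaders are spread evenly across storage hosts.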
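NebulaGraph does not provide general-purpose, multi-statement transactions; nGQL has no BEGIN/COMMIT syntax. Each partition replicates its writes through Raft, so data within a partition is strongly consistent across its replicas. A write that touches several partitions is not atomic as a whole, so do not rely on multi-partition writes being applied all at once.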
Common use cases include social network analysis, fraud detection rings, recommendation engines, knowledge graphs, network topology mapping, and identity resolution. Any domain where multi-hop relationship queries are core to the application benefits from a graph database.