# NebulaGraph — Distributed Open-Source Graph Database

> Horizontally scalable graph database for storing and querying billions of vertices and trillions of edges with millisecond-level latency.

## Quick Use
```bash
# Docker Compose all-in-one for local dev
git clone https://github.com/vesoft-inc/nebula-docker-compose
cd nebula-docker-compose
docker-compose up -d

# Connect with the nebula console
docker run --rm -ti --network nebula-docker-compose_nebula-net \
  vesoft/nebula-console:v3 -addr graphd -port 9669 -u root -p nebula

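# nGQL: run the statements below inside the console session, not in the shell.
# Schema changes are applied asynchronously; if USE or INSERT fails, wait a few seconds and retry.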
CREATE SPACE demo (vid_type=FIXED_STRING(32));
USE demo;
CREATE TAG player(name string, age int);
INSERT VERTEX player(name, age) VALUES "p1":("Alice", 30);
```
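
When inserts are generated from application code rather than typed into the console, the statement text has to be assembled somewhere. The sketch below builds the same `INSERT VERTEX` statement as above from a Python dict. The helper names `ngql_literal` and `insert_vertex_stmt` are hypothetical (they are not part of any NebulaGraph SDK), the backslash-style string escaping is an assumption, and tag and property names are not validated.

```python
def ngql_literal(value):
    """Render a Python value as an nGQL literal (str, bool, int, float only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Assumed escaping: backslash-escape backslashes and double quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def insert_vertex_stmt(tag, vid, props):
    """Build an INSERT VERTEX statement for one vertex with a string VID."""
    names = ", ".join(props)
    values = ", ".join(ngql_literal(v) for v in props.values())
    return f"INSERT VERTEX {tag}({names}) VALUES {ngql_literal(vid)}:({values});"


stmt = insert_vertex_stmt("player", "p1", {"name": "Alice", "age": 30})
print(stmt)
```

The resulting string can be passed to whichever client executes nGQL for you; the sketch only covers statement construction.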

## Introduction
NebulaGraph is a distributed, horizontally scalable graph database designed to store and query graphs with hundreds of billions of vertices and trillions of edges with millisecond-level latency. Written in C++ with a shared-nothing architecture, it targets fraud detection, knowledge graphs, recommendation systems, and cybersecurity workloads.

## What NebulaGraph Does
- Stores property graphs (vertices with tags, typed edges, and their properties), replicated through Raft for strong consistency.
- Executes nGQL, a declarative graph language with Cypher-like syntax.
- Scales reads and writes horizontally by sharding each graph space into partitions spread across storage servers.
- Supports multi-graph isolation via "spaces" with independent schemas.
- Integrates with Spark, Flink, Kafka, and graph analytics libraries for large-scale processing.
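
The sharding idea in the list above can be illustrated with a small sketch: each vertex ID is hashed to a partition, and partitions are what get distributed across storage servers. This is a toy model under stated assumptions. `blake2b` is a stand-in for whatever hash the storage layer really uses, the 1-based partition numbering is an assumption, and `partition_for` and `group_by_partition` are hypothetical names.

```python
import hashlib


def partition_for(vid: str, num_parts: int) -> int:
    """Map a string vertex ID to a partition ID in 1..num_parts."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    # Stand-in hash: stable across runs, unlike Python's built-in hash().
    digest = hashlib.blake2b(vid.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_parts + 1


def group_by_partition(vids, num_parts):
    """Group vertex IDs by the partition that would own them."""
    groups = {}
    for vid in vids:
        groups.setdefault(partition_for(vid, num_parts), []).append(vid)
    return groups


vids = [f"p{i}" for i in range(1, 11)]
for part_id, members in sorted(group_by_partition(vids, num_parts=4).items()):
    print(part_id, members)
```

Because the mapping depends only on the vertex ID and the partition count, any node can compute which partition owns a vertex without consulting the others.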

## Architecture Overview
Three services run independently: Meta (cluster metadata and schema), Graph (query parsing and planning), and Storage (sharded KV on RocksDB with Raft replication). Partitions are load-balanced across storage hosts. Queries are parsed, optimized, and dispatched to the relevant storage replicas, with filters and index lookups pushed down to storage. Large graph algorithms run outside the cluster in nebula-algorithm, a separate Spark application.
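
To make "sharded KV" more concrete, here is a toy sketch of how edges can live in an ordered key-value store so that the outgoing edges of one vertex form a contiguous key range. It is deliberately simplified: keys are Python tuples rather than a packed binary layout, a sorted list stands in for RocksDB, and `edge_key` and `ToyEdgeStore` are hypothetical names.

```python
import bisect


def edge_key(part_id, src, edge_type, rank, dst):
    """Simplified edge key; ordering by (partition, source, type) clusters out-edges."""
    return (part_id, src, edge_type, rank, dst)


class ToyEdgeStore:
    """Sorted in-memory stand-in for an ordered KV store."""

    def __init__(self):
        self._keys = []    # sorted list of edge keys
        self._values = {}  # key -> edge properties

    def put(self, key, props):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = props

    def out_edges(self, part_id, src, edge_type):
        """Prefix scan: yield (dst, props) for every edge leaving src."""
        prefix = (part_id, src, edge_type)
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if key[:3] != prefix:
                break
            yield key[4], self._values[key]


store = ToyEdgeStore()
store.put(edge_key(1, "p1", "follow", 0, "p2"), {"degree": 90})
store.put(edge_key(1, "p1", "follow", 0, "p3"), {"degree": 75})
store.put(edge_key(1, "p2", "follow", 0, "p1"), {"degree": 60})

neighbors = [dst for dst, _ in store.out_edges(1, "p1", "follow")]
print(neighbors)
```

A one-hop expansion then becomes a single range scan inside one partition, which is the property that makes pushing traversal work down to storage attractive.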

## Self-Hosting & Configuration
- Install via RPM/DEB, Docker Compose, or on Kubernetes with NebulaGraph Operator (`nebula-operator`, installable via Helm).
- A typical prod cluster runs 3 meta + 3+ graph + 3+ storage services with Raft quorum.
- Tune `wal_ttl`, `rocksdb_block_cache`, and `num_io_threads` for workload characteristics.
- Enable authentication (`--enable_authorize=true`), TLS, and RBAC in `nebula-*.conf`.
- Move data to/from Hive, Neo4j, ClickHouse, CSV, and Parquet with Nebula Exchange.
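
The replica counts above follow from standard Raft majority arithmetic, sketched below. The function names are hypothetical, and the calculation is generic Raft reasoning rather than anything specific to NebulaGraph's implementation.

```python
def raft_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still remains."""
    return replicas - raft_quorum(replicas)


for n in (1, 3, 4, 5):
    print(f"replicas={n} quorum={raft_quorum(n)} tolerates={tolerated_failures(n)}")
```

Three replicas survive one failure, and a fourth adds cost without adding tolerance, which is why odd replica counts such as 3 or 5 are the usual choice.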

## Key Features
- Shared-nothing design lets you scale out by just adding storage nodes.
- nGQL + Cypher-compatible clauses lower the learning curve for Neo4j users.
- GeoSpatial types, full-text search via Elasticsearch, and built-in graph algorithms.
- Vertex/edge pushdown with RocksDB bloom filters makes multi-hop traversals fast.
- Visualization with NebulaGraph Studio and Explorer; Python / Java / Go SDKs.
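
A multi-hop traversal can be thought of as repeated frontier expansion: start from one vertex, follow outgoing edges, and repeat from the vertices just reached. The sketch below is a toy in-memory model of that idea, assuming a plain adjacency dict and set-based deduplication. It does not reproduce nGQL's exact result semantics or the storage-side optimizations mentioned above, and `go_n_steps` is a hypothetical name.

```python
def go_n_steps(adjacency, start, steps):
    """Return the set of vertices reached after exactly `steps` hops from `start`."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    frontier = {start}
    for _ in range(steps):
        # Expand every vertex in the current frontier by one hop.
        frontier = {dst for src in frontier for dst in adjacency.get(src, ())}
    return frontier


follows = {
    "p1": ["p2", "p3"],
    "p2": ["p4"],
    "p3": ["p4", "p1"],
}

print(sorted(go_n_steps(follows, "p1", 1)))
print(sorted(go_n_steps(follows, "p1", 2)))
```

Each hop touches only the out-edges of the current frontier, so the cost of a traversal tracks the size of the neighborhood explored rather than the size of the whole graph.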

## Comparison with Similar Tools
- **Neo4j** — Single-writer leader design; Nebula scales horizontally with multi-shard writes.
- **JanusGraph** — Depends on Cassandra/HBase; Nebula ships its own distributed storage.
- **Dgraph** — GraphQL-native; Nebula chooses nGQL for more flexible graph traversals.
- **TigerGraph** — Proprietary; Nebula is Apache-2.0 open source.
- **ArangoDB** — Multi-model; Nebula specializes purely in graph for lower latency.

## FAQ
**Q:** What is the largest graph it can handle?
**A:** Production deployments store trillions of edges across 100+ storage nodes with horizontal sharding.

**Q:** Is nGQL compatible with Cypher?
**A:** Nebula supports a subset of OpenCypher syntax, making migrations from Neo4j approachable.

**Q:** Can I run graph ML on it?
**A:** Yes, via nebula-algorithm (GraphX/Spark) and integrations with DGL and PyTorch Geometric.

**Q:** How do I back up a cluster?
**A:** Use `br` (Nebula Backup & Restore) to snapshot meta + storage to S3, GCS, or local disk.

## Sources
- https://github.com/vesoft-inc/nebula
- https://docs.nebula-graph.io

---
Source: https://tokrepo.com/en/workflows/9e58f35f-3931-11f1-9bc6-00163e2b0d79
Author: AI Open Source