What Neo4j Does
- Property graph — nodes, relationships, both with properties
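- Cypher — SQL-like declarative query language; a major basis for the ISO GQL standard (ISO/IEC 39075:2024)
- Index-free adjacency — each node points directly to its relationships, so the cost of a hop depends on that node's degree rather than on total graph size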
- ACID transactions — full consistency guarantees
- Clustering — Enterprise Edition clustering (called causal clustering before Neo4j 5) for high availability
- Graph algorithms — via Graph Data Science (GDS) library
- Vector search — Neo4j 5.11+ built-in vector indexes for RAG
- GraphQL — the Neo4j GraphQL Library generates a GraphQL API from type definitions
Architecture
Neo4j uses native graph storage: nodes and relationships are first-class records on disk, linked by direct pointers (index-free adjacency). The query engine compiles Cypher into an execution plan that walks the graph by following those pointers. Drivers talk to the server over the Bolt binary protocol.
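The pointer-following idea can be illustrated in a few lines of plain Python. This is a conceptual sketch only, not Neo4j's on-disk record format: the `Node` and `Relationship` classes and the `relate`, `neighbours`, and `friends_of_friends` helpers are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


# eq=False keeps identity semantics, which suits records that reference each other.
@dataclass(eq=False)
class Relationship:
    """A relationship record: type, properties, and direct references to both ends."""
    type: str
    start: "Node"
    end: "Node"
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass(eq=False)
class Node:
    """A node record that holds references to its own outgoing relationships."""
    labels: List[str]
    properties: Dict[str, Any] = field(default_factory=dict)
    outgoing: List[Relationship] = field(default_factory=list)


def relate(start: Node, rel_type: str, end: Node, **properties: Any) -> Relationship:
    """Create a relationship and attach it to the start node."""
    rel = Relationship(rel_type, start, end, dict(properties))
    start.outgoing.append(rel)
    return rel


def neighbours(node: Node, rel_type: str) -> Iterator[Node]:
    """Follow references held by the node; no global index is consulted."""
    for rel in node.outgoing:
        if rel.type == rel_type:
            yield rel.end


def friends_of_friends(person: Node) -> List[str]:
    """Two-hop walk, roughly MATCH (p)-[:KNOWS]->()-[:KNOWS]->(fof)."""
    names: List[str] = []
    for friend in neighbours(person, "KNOWS"):
        for fof in neighbours(friend, "KNOWS"):
            name = fof.properties["name"]
            if fof is not person and name not in names:
                names.append(name)
    return names


alice = Node(["Person"], {"name": "Alice"})
bob = Node(["Person"], {"name": "Bob"})
carol = Node(["Person"], {"name": "Carol"})
relate(alice, "KNOWS", bob, since=2020)
relate(bob, "KNOWS", carol, since=2022)
relate(bob, "KNOWS", alice, since=2020)

print(friends_of_friends(alice))
```

The work done by each hop is bounded by the number of relationships on the node being visited, which is why adding unrelated nodes elsewhere in the graph does not slow this walk down.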
Self-Hosting
# docker-compose.yml
version: "3"
services:
  neo4j:
    image: neo4j:5-community
    # 7474: HTTP (Neo4j Browser), 7687: Bolt (drivers)
    ports: ["7474:7474", "7687:7687"]
    volumes:
      - neo4j-data:/data
      - neo4j-logs:/logs
    environment:
      NEO4J_AUTH: neo4j/tokrepo12345
      # maps to server.memory.heap.max_size
      NEO4J_server_memory_heap_max__size: 2G
      NEO4J_PLUGINS: "[\"apoc\", \"graph-data-science\"]"
volumes:
  neo4j-data:
  neo4j-logs:
Key Features
- Native property graph storage
- Cypher query language
- ACID transactions
- Graph Data Science library (65+ algorithms)
- Vector indexes for AI/RAG
- Neo4j Browser UI
- Bolt binary protocol
- APOC stored procedures library
- Clustering (Enterprise; called causal clustering before Neo4j 5)
- Fabric / composite databases for queries across multiple databases
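The Bolt and Cypher items above meet when an application connects to the server. The following is a minimal sketch, assuming the official `neo4j` Python driver (5.8 or later, for `execute_query`) and the container from the Self-Hosting section running on localhost. The `Person` label, the `KNOWS` relationship type, and the `parse_auth` helper are made up for illustration.

```python
from typing import Tuple

BOLT_URI = "bolt://localhost:7687"   # Bolt port published by the compose file
NEO4J_AUTH = "neo4j/tokrepo12345"    # same value as NEO4J_AUTH in the compose file

# Parameterised Cypher: values travel as query parameters, not via string formatting.
CREATE_FRIENDSHIP = """
MERGE (a:Person {name: $a})
MERGE (b:Person {name: $b})
MERGE (a)-[:KNOWS]->(b)
"""

FRIENDS_OF = """
MATCH (:Person {name: $name})-[:KNOWS]->(friend:Person)
RETURN friend.name AS name
ORDER BY name
"""


def parse_auth(value: str) -> Tuple[str, str]:
    """Split the 'user/password' form used by NEO4J_AUTH (hypothetical helper)."""
    user, _, password = value.partition("/")
    return user, password


def main() -> None:
    # Imported here so parse_auth stays usable without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(BOLT_URI, auth=parse_auth(NEO4J_AUTH)) as driver:
        driver.verify_connectivity()  # fails fast if the server is unreachable
        # Each execute_query call runs in its own managed transaction.
        driver.execute_query(CREATE_FRIENDSHIP, a="Alice", b="Bob", database_="neo4j")
        records, _, _ = driver.execute_query(FRIENDS_OF, name="Alice", database_="neo4j")
        for record in records:
            print(record["name"])
```

Calling `main()` with the container up writes two `Person` nodes and one relationship, then lists Alice's friends. `MERGE` is used instead of `CREATE` so that running the sketch twice does not duplicate data.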
Comparison
| Graph DB | Query Lang | Type | Storage |
|---|---|---|---|
| Neo4j | Cypher | Property graph | Native |
| ArangoDB | AQL | Multi-model | Native |
| JanusGraph | Gremlin | Property graph | Cassandra/HBase |
| TigerGraph | GSQL | Property graph | Native |
| Dgraph | DQL + GraphQL | RDF-like | Native |
| Memgraph | Cypher | Property graph | In-memory |
| NebulaGraph | nGQL | Property graph | Distributed |
FAQ
Q: When to choose a graph database? A: When relationships are central to the queries: social networks, recommendation systems, fraud detection, knowledge graphs, dependency analysis, and identity and authorization chains.
Q: Cypher vs Gremlin? A: Cypher is declarative (SQL-like): you describe the pattern to match and the planner decides how to find it. Gremlin is a traversal language in which you spell out the steps, more like writing a program. Cypher is generally more readable and was a major input to the ISO GQL standard; Gremlin is more portable, since it runs on any Apache TinkerPop-compatible graph.
Q: Vector search? A: Neo4j 5.11+ supports native vector indexes. Because nodes can carry embeddings as properties, a query can combine vector similarity with graph structure, which is the basis of GraphRAG.
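A GraphRAG-style lookup of the kind described in the vector search answer might be sketched as follows. Everything specific here is an assumption for illustration: the `Chunk` and `Document` labels, the `PART_OF` relationship, the index name `chunk_embeddings`, and the 1536-dimension setting. The `CREATE VECTOR INDEX` form assumes a later 5.x release than 5.11, where the index was created through a procedure instead, so check the docs for the version in use. The `top_k` helper accepts any object exposing the Python driver's `execute_query` method.

```python
from typing import Any, List, Sequence, Tuple

# Run once to create the index (names and dimensions are hypothetical).
CREATE_INDEX = """
CREATE VECTOR INDEX chunk_embeddings IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {indexConfig: {
  `vector.dimensions`: 1536,
  `vector.similarity_function`: 'cosine'
}}
"""

# Vector lookup first, then one graph hop to fetch the source document.
QUERY_NEIGHBOURS = """
CALL db.index.vector.queryNodes('chunk_embeddings', $k, $embedding)
YIELD node, score
OPTIONAL MATCH (node)-[:PART_OF]->(doc:Document)
RETURN node.text AS text, doc.title AS source, score
ORDER BY score DESC
"""


def top_k(driver: Any, embedding: Sequence[float], k: int = 5) -> List[Tuple[str, float]]:
    """Return (text, score) pairs for the k chunks nearest to the embedding."""
    records, _, _ = driver.execute_query(
        QUERY_NEIGHBOURS,
        parameters_={"k": k, "embedding": list(embedding)},
        database_="neo4j",
    )
    return [(record["text"], record["score"]) for record in records]
```

The `OPTIONAL MATCH` step is where graph structure enters: the vector index finds similar chunks, and the traversal attaches context that a vector store alone would not hold.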
Sources
- Docs: https://neo4j.com/docs
- GitHub: https://github.com/neo4j/neo4j
- License: GPLv3 Community / commercial Enterprise