# JanusGraph: Distributed Open-Source Graph Database

> A scalable graph database optimized for storing and querying billions of vertices and edges, with pluggable storage backends and TinkerPop Gremlin query support.

## Quick Use

```bash
# Run JanusGraph in Docker (the container name "janusgraph" is an arbitrary choice)
docker run -d --name janusgraph -p 8182:8182 janusgraph/janusgraph:latest

# Open the Gremlin console inside the running container
docker exec -it janusgraph ./bin/gremlin.sh

# Inside the console: connect to the server, send later commands to it, add a vertex
gremlin> :remote connect tinkerpop.server conf/remote.yaml
gremlin> :remote console
gremlin> g.addV('person').property('name','Alice')
```

## Introduction

JanusGraph is a distributed, open-source graph database designed for large-scale relationship data. It supports the Apache TinkerPop Gremlin query language and scales horizontally by plugging into storage backends such as Apache Cassandra, HBase, or Google Cloud Bigtable.

## What JanusGraph Does

- Stores and traverses property graphs with billions of vertices and edges
- Supports ACID transactions on backends that provide them (such as BerkeleyDB); on Cassandra or HBase, distributed operations are eventually consistent
- Provides full-text, geo, and numeric indexing via Elasticsearch, Solr, or Lucene
- Exposes the standard Gremlin traversal API for queries and analytics
- Scales horizontally through distributed storage backends

## Architecture Overview

JanusGraph runs as a query layer on top of a pluggable storage engine. Graph data is stored as wide-row adjacency lists in the chosen backend: each vertex gets a row that holds its properties and its incident edges. An optional indexing layer enables global vertex lookups by property. The Gremlin Server component accepts remote traversals over WebSocket.

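To make the adjacency-list idea more concrete, here is a toy in-memory model in Python. It is only a sketch under simplifying assumptions: the class `ToyGraphStore` and its methods are hypothetical names invented for illustration, it keeps only outgoing edges per row, and it does not reflect JanusGraph's real storage format or API. It shows how a global index finds a starting vertex by property, after which traversal reads that vertex's row.

```python
from collections import defaultdict


class ToyGraphStore:
    """Hypothetical toy model; not JanusGraph's actual storage layout."""

    def __init__(self):
        # One "row" per vertex id, holding its properties and outgoing edges.
        self.rows = defaultdict(lambda: {"props": {}, "out": []})
        # Optional global index: (property key, value) -> set of vertex ids.
        self.index = defaultdict(set)
        self._next_id = 1

    def add_vertex(self, **props):
        vid = self._next_id
        self._next_id += 1
        self.rows[vid]["props"].update(props)
        for key, value in props.items():
            self.index[(key, value)].add(vid)
        return vid

    def add_edge(self, src, label, dst):
        # The edge is stored in the source vertex's row (adjacency list).
        self.rows[src]["out"].append((label, dst))

    def lookup(self, key, value):
        # Global lookup by property via the index, without scanning all rows.
        return sorted(self.index.get((key, value), set()))

    def out(self, vid, label):
        # Traversal step: read one row and follow edges with the given label.
        return [dst for (lbl, dst) in self.rows[vid]["out"] if lbl == label]


store = ToyGraphStore()
alice = store.add_vertex(name="Alice")
bob = store.add_vertex(name="Bob")
store.add_edge(alice, "knows", bob)

start = store.lookup("name", "Alice")[0]
friends = [store.rows[v]["props"]["name"] for v in store.out(start, "knows")]
print(friends)  # prints ['Bob']
```

In this model, finding Alice costs one index lookup and listing her neighbours costs one row read, which is roughly why an index matters for entry points while traversals stay local to a vertex's row.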
## Self-Hosting & Configuration

- Run via Docker: `docker run janusgraph/janusgraph:latest`
- Configure the storage backend in `janusgraph.properties` (Cassandra, HBase, BerkeleyDB)
- Enable search indexing by setting `index.search.backend=elasticsearch`
- Scale by adding more storage nodes; the storage backend distributes the data across them
- Deploy Gremlin Server for remote client access over WebSocket or HTTP

## Key Features

- Linear horizontal scalability via Cassandra or HBase backends
- ACID-compliant local transactions and eventual consistency for distributed operations
- Mixed index support combining exact match, full-text, range, and geo queries
- Compatible with the full Apache TinkerPop ecosystem and OLAP via Spark
- Schema-optional, with support for property types, edge labels, and vertex labels

## Comparison with Similar Tools

- **Neo4j**: more mature tooling and the Cypher language, but limited horizontal scaling in Community Edition
- **Amazon Neptune**: managed service, supports both Gremlin and SPARQL, no self-hosted option
- **ArangoDB**: multi-model (document + graph), uses AQL instead of Gremlin
- **Dgraph**: GraphQL-native distributed graph database with a different query paradigm
- **TigerGraph**: high-performance commercial graph database with a custom query language

## FAQ

**Q: Which storage backend should I choose?**
A: Cassandra for large-scale distributed deployments; BerkeleyDB for single-node development and testing.

**Q: Does JanusGraph support Cypher queries?**
A: Not natively. It uses Gremlin (TinkerPop). Third-party translators exist, but Gremlin is the primary interface.

**Q: Can I run graph analytics at scale?**
A: Yes. JanusGraph integrates with TinkerPop's SparkGraphComputer for OLAP-style bulk traversals.

**Q: How does it handle schema evolution?**
A: Schema changes (new property keys, edge labels) are additive and applied online without downtime.

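As a rough illustration of the settings named under Self-Hosting & Configuration, a `janusgraph.properties` file for a Cassandra (CQL) backend with Elasticsearch indexing might look like the fragment below. The hostnames and keyspace name are placeholder assumptions, not values from this document; check the configuration reference for your JanusGraph version before relying on any key.

```properties
# Sketch only: hostnames and keyspace below are placeholders.
gremlin.graph=org.janusgraph.core.JanusGraphFactory

# Storage backend: Cassandra via CQL
storage.backend=cql
storage.hostname=127.0.0.1
storage.cql.keyspace=janusgraph

# Optional mixed-index backend, registered under the index name "search"
index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
```

For single-node development, the storage lines would typically be swapped for `storage.backend=berkeleyje` together with a `storage.directory` path of your choosing.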
## Sources

- https://github.com/JanusGraph/janusgraph
- https://janusgraph.org/docs/

---

Source: https://tokrepo.com/en/workflows/asset-e7b23ffb

Author: Script Depot