# JanusGraph — Distributed Open-Source Graph Database

> A scalable graph database optimized for storing and querying billions of vertices and edges, with pluggable storage backends and TinkerPop Gremlin query support.

## Quick Use
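```bash
# Run JanusGraph (with its bundled Gremlin Server) in Docker
docker run -d -p 8182:8182 janusgraph/janusgraph:latest

# Open a Gremlin console inside the container
# (replace <container> with the name or ID shown by `docker ps`)
docker exec -it <container> bin/gremlin.sh

# At the gremlin> prompt, connect to the server and switch to remote mode
# so that traversals are evaluated on the server, where `g` is defined
gremlin> :remote connect tinkerpop.server conf/remote.yaml
gremlin> :remote console
gremlin> g.addV('person').property('name','Alice')
```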

## Introduction
JanusGraph is a distributed, open-source graph database designed for large-scale relationship data. It supports the Apache TinkerPop Gremlin query language and can scale horizontally by plugging into storage backends like Apache Cassandra, HBase, or Google Cloud Bigtable.

## What JanusGraph Does
- Stores and traverses property graphs with billions of vertices and edges
- Supports transactions for consistent graph mutations, with ACID guarantees available on backends such as BerkeleyDB
- Provides full-text, geo, and numeric indexing via Elasticsearch, Solr, or Lucene
- Exposes the standard Gremlin traversal API for queries and analytics
- Scales horizontally through distributed storage backends

## Architecture Overview
JanusGraph runs as a query layer on top of a pluggable storage engine. Graph data is stored as wide-row adjacency lists in the chosen backend. An optional indexing layer enables global vertex lookups by property. The Gremlin Server component accepts remote traversals over WebSocket.
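
Besides WebSocket, Gremlin Server can also be configured with an HTTP channelizer that accepts a JSON body containing a `gremlin` script. The sketch below only illustrates that request shape. The `build_payload` helper and the `GREMLIN_URL` variable are hypothetical names introduced here, and the request is sent only when `GREMLIN_URL` is set and the server actually has HTTP enabled.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: wrap a Gremlin script in the JSON body
# that Gremlin Server's HTTP endpoint accepts.
build_payload() {
  # Escape backslashes and double quotes so the script is a valid JSON string.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"gremlin": "%s"}' "$escaped"
}

payload=$(build_payload "g.V().has('name','Alice').count()")
echo "$payload"

# Contact a server only when GREMLIN_URL is set,
# for example GREMLIN_URL=http://localhost:8182 (an assumed address).
if [ -n "${GREMLIN_URL:-}" ]; then
  curl -s -X POST -H 'Content-Type: application/json' -d "$payload" "$GREMLIN_URL"
fi
```

Keeping payload construction separate from the `curl` call makes the escaping easy to check without a running server.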

## Self-Hosting & Configuration
- Run via Docker: `docker run janusgraph/janusgraph:latest`
- Configure storage backend in `janusgraph.properties` (Cassandra, HBase, BerkeleyDB)
- Enable search indexing by setting `index.search.backend=elasticsearch`
- Scale by adding more storage nodes; the storage backend distributes data across them
- Deploy Gremlin Server for remote client access over WebSocket or HTTP
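
As a minimal sketch of the configuration steps above, the script below writes an example `janusgraph.properties` that pairs the CQL (Cassandra) storage adapter with an Elasticsearch index backend. The output path and the `127.0.0.1` hostnames are assumptions for illustration; substitute the values for your own cluster.

```shell
#!/bin/sh
set -eu

# Output location is an assumption; point CONF_FILE at your real config path.
CONF_FILE="${CONF_FILE:-${TMPDIR:-/tmp}/janusgraph-example.properties}"

cat > "$CONF_FILE" <<'EOF'
# Storage backend: CQL adapter for Apache Cassandra (host is an assumption)
storage.backend=cql
storage.hostname=127.0.0.1

# Index backend registered under the name "search" (host is an assumption)
index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
EOF

echo "Wrote $CONF_FILE"
```

For single-node development, the storage lines could instead be `storage.backend=berkeleyje` together with a `storage.directory` path.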

## Key Features
- Linear horizontal scalability via Cassandra or HBase backends
- ACID transactions on BerkeleyDB; eventual consistency on distributed backends such as Cassandra and HBase
- Mixed index support combining exact match, full-text, range, and geo queries
- Compatible with the full Apache TinkerPop ecosystem and OLAP via Spark
- Schema-optional with support for property types, edge labels, and vertex labels

## Comparison with Similar Tools
- **Neo4j** — more mature tooling and Cypher language, but limited horizontal scaling in Community Edition
- **Amazon Neptune** — managed service, supports both Gremlin and SPARQL, no self-hosted option
- **ArangoDB** — multi-model (document + graph), uses AQL instead of Gremlin
- **Dgraph** — GraphQL-native distributed graph DB, different query paradigm
- **TigerGraph** — high-performance commercial graph DB with a custom query language

## FAQ
**Q: Which storage backend should I choose?**
A: Cassandra for large-scale distributed deployments; BerkeleyDB for single-node development and testing.

**Q: Does JanusGraph support Cypher queries?**
A: Not natively. It uses Gremlin (TinkerPop). Third-party translators exist but Gremlin is the primary interface.

**Q: Can I run graph analytics at scale?**
A: Yes. JanusGraph integrates with TinkerPop's SparkGraphComputer for OLAP-style bulk traversals.

**Q: How does it handle schema evolution?**
A: Schema changes (new property keys, edge labels) are additive and applied online without downtime.
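
One way to apply such an additive change is through the JanusGraph management API from a Groovy script. The sketch below generates that script as a file; the `email` property key, the `knows` edge label, the output path, and the `conf/janusgraph.properties` location are all hypothetical values chosen for illustration.

```shell
#!/bin/sh
set -eu

# Output location is an assumption; change SCHEMA_FILE as needed.
SCHEMA_FILE="${SCHEMA_FILE:-${TMPDIR:-/tmp}/add-schema.groovy}"

cat > "$SCHEMA_FILE" <<'EOF'
// Path to the properties file is an assumption; adjust for your install.
graph = JanusGraphFactory.open('conf/janusgraph.properties')
mgmt = graph.openManagement()

// Additive changes: create each element only if it does not exist yet.
if (!mgmt.containsPropertyKey('email')) {
    mgmt.makePropertyKey('email').dataType(String.class).make()
}
if (!mgmt.containsEdgeLabel('knows')) {
    mgmt.makeEdgeLabel('knows').make()
}

mgmt.commit()
graph.close()
EOF

echo "Wrote $SCHEMA_FILE; run it with: bin/gremlin.sh -e $SCHEMA_FILE"
```

Guarding each definition with a `contains...` check keeps the script safe to rerun against a graph that already has part of the schema.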

## Sources
- https://github.com/JanusGraph/janusgraph
- https://janusgraph.org/docs/

---
Source: https://tokrepo.com/en/workflows/asset-e7b23ffb
Author: Script Depot