How do I install Apache Cassandra — Distributed Wide-Column Database at Scale?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Apache Cassandra — Distributed Wide-Column Database at Scale

What Cassandra Does

Wide-column model — partition key + clustering keys + columns
CQL query language — SQL-like declarative syntax
Masterless peer-to-peer — all nodes equal (no single point of failure)
Tunable consistency — per-query consistency level (ONE, QUORUM, ALL)
Multi-DC replication — NetworkTopologyStrategy
Lightweight transactions (LWT) — Paxos-based compare-and-set
Materialized views — denormalized auto-maintained tables
TTL — per-cell time-to-live
Gossip protocol — peer state distribution
Compaction strategies — STCS, LCS, TWCS for different workloads

Architecture

Peer-to-peer ring: each node owns a range of partition keys determined by consistent hashing. Data is replicated to N nodes. Writes go to a memtable + commit log, flushed to SSTables. Reads merge SSTables + memtable + possibly bloom filters and row cache.

Self-Hosting

# 3-node cluster
version: "3"
services:
  cassandra-node1:
    image: cassandra:5
    environment:
      CASSANDRA_CLUSTER_NAME: tokrepo
      CASSANDRA_SEEDS: cassandra-node1
  cassandra-node2:
    image: cassandra:5
    environment:
      CASSANDRA_CLUSTER_NAME: tokrepo
      CASSANDRA_SEEDS: cassandra-node1
  cassandra-node3:
    image: cassandra:5
    environment:
      CASSANDRA_CLUSTER_NAME: tokrepo
      CASSANDRA_SEEDS: cassandra-node1

Key Features

Linear horizontal scaling
Masterless architecture
Tunable consistency
Multi-DC replication
CQL query language
Secondary indexes
Materialized views
Lightweight transactions
TTL for auto-expiration
Battle-tested at petabyte scale

Comparison

Database	Model	Consistency	Scale
Cassandra	Wide column	Tunable (AP)	Linear (masterless)
ScyllaDB	Wide column (CQL compatible)	Tunable	Linear (shard-per-core)
HBase	Wide column	Strong (CP)	Region servers
DynamoDB	Key-value + doc	Tunable	Managed
Bigtable	Wide column	Strong	Managed
MongoDB	Document	Tunable	Sharding

FAQ

Q: Cassandra vs ScyllaDB? A: Fully API-compatible. ScyllaDB is implemented in C++ with a shard-per-core architecture and performs several times better, but its commercial edition is proprietary. Cassandra is an Apache Foundation project with a more mature ecosystem.

Q: What scenarios is it good for? A: Large-scale time series data, event logs, IoT, message history, and recommendation systems. Not suitable for scenarios requiring complex JOINs or strong transactions (no multi-table JOINs).

Q: Data modeling principles? A: Query-driven. Decide on your query patterns first, then design your table schema. Denormalization and data duplication are the norm — one table per query pattern.

Sources

Docs: https://cassandra.apache.org/doc
GitHub: https://github.com/apache/cassandra
License: Apache 2.0

Apache Cassandra — Distributed Wide-Column Database at Scale

What Cassandra Does

Architecture

Self-Hosting

Key Features

Comparison

FAQ

Sources

Discusión

Activos relacionados

Unkey — Open-Source API Key Management Platform

Flagsmith — Open-Source Feature Flags and Remote Config

OpenStatus — Open-Source Monitoring and Status Page Platform