Introduction
Apache Kvrocks is a distributed key-value database that speaks the Redis wire protocol but stores data on disk using RocksDB. Originally created at Meitu, it entered the Apache Incubator and later graduated to a top-level Apache Software Foundation project. Kvrocks lets teams keep the Redis API they know while reducing memory costs for datasets too large to hold affordably in RAM.
What Apache Kvrocks Does
- Implements Redis commands (strings, hashes, lists, sets, sorted sets, streams) with on-disk storage
- Reduces infrastructure costs by storing data on SSDs instead of requiring everything in RAM
- Supports Redis Cluster protocol for horizontal scaling across multiple nodes
- Provides namespace-based multi-tenancy to isolate workloads on a shared cluster
- Handles asynchronous replication in a master-replica topology, configured with Redis-style directives
Architecture Overview
Kvrocks maps Redis data structures onto RocksDB column families. Strings map directly to key-value pairs, while complex types (hashes, lists, sorted sets) use a metadata entry per structure plus composite keys that combine the user key with each element. The server parses incoming commands with a Redis-compatible protocol parser, translates them into RocksDB read/write operations, and returns results in Redis format. Cluster mode uses hash-slot assignment compatible with Redis Cluster clients; unlike Redis Cluster, nodes do not exchange topology over a gossip protocol, and the slot map is instead pushed to each node with the CLUSTERX commands.
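The composite-key idea described above can be sketched in a few lines of Python. This is a simplified illustration only: the `M`/`S` prefixes, the 4-byte length prefix, and the function names are assumptions made for the example and do not reflect the actual Kvrocks on-disk encoding. A plain dict stands in for RocksDB.

```python
import struct

def metadata_key(user_key: bytes) -> bytes:
    # Hypothetical layout: one metadata entry per structure, holding its size.
    return b"M" + user_key

def element_prefix(user_key: bytes) -> bytes:
    # The length prefix keeps elements of b"user:1" apart from those of b"user:10".
    return b"S" + struct.pack(">I", len(user_key)) + user_key

def hset(store: dict, user_key: bytes, field: bytes, value: bytes) -> None:
    composite = element_prefix(user_key) + field
    if composite not in store:
        # New field: bump the element count kept in the metadata entry.
        raw = store.get(metadata_key(user_key), b"\x00" * 4)
        size = struct.unpack(">I", raw)[0]
        store[metadata_key(user_key)] = struct.pack(">I", size + 1)
    store[composite] = value

def hgetall(store: dict, user_key: bytes) -> dict:
    prefix = element_prefix(user_key)
    # A real engine would run a prefix range scan over sorted keys;
    # filtering the dict stands in for that here.
    return {k[len(prefix):]: v for k, v in store.items() if k.startswith(prefix)}
```

Each hash field becomes its own key-value pair, so reading or updating one field never requires loading the whole structure.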
Self-Hosting & Configuration
- Deploy via Docker or compile from source with CMake and a C++17 compiler
- Edit kvrocks.conf to set bind address, port, data directory, and RocksDB tuning parameters
- Configure rocksdb.write_buffer_size and rocksdb.max_write_buffer_number for write throughput
- Enable cluster mode with cluster-enabled yes and manage slots with the CLUSTERX commands
- Set up replication with slaveof or replicaof directives, same as Redis
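When planning slot assignments for cluster mode, it can help to know which slot a given key lands in. The sketch below follows the Redis Cluster key-to-slot rule (CRC16/XMODEM of the key, modulo 16384, with hash-tag handling), on the assumption that Kvrocks assigns slots the same way, as its compatibility with Redis Cluster clients implies. The function name `key_hash_slot` is chosen for this example.

```python
import binascii

SLOT_COUNT = 16384

def key_hash_slot(key: bytes) -> int:
    # If the key contains a non-empty {tag}, only the tag is hashed,
    # so related keys can be forced into the same slot.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16/XMODEM variant.
    return binascii.crc_hqx(key, 0) % SLOT_COUNT
```

For example, `key_hash_slot(b"{user1000}.following")` and `key_hash_slot(b"{user1000}.followers")` return the same slot, which keeps multi-key operations on those keys on one node.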
Key Features
- Redis protocol compatibility lets most existing Redis clients and libraries work unchanged
- Disk-based storage with RocksDB handles datasets far larger than available RAM at lower cost
- Namespace isolation lets multiple applications share one Kvrocks cluster securely
- Lua scripting support for server-side logic using the EVAL command family
- Slow log and performance statistics help identify bottleneck commands and optimize workloads
Comparison with Similar Tools
- Redis — In-memory with optional persistence; faster for hot data but requires RAM proportional to dataset size
- DragonflyDB — Modern multi-threaded in-memory store; higher throughput than Redis but still memory-bound
- KeyDB — Multi-threaded Redis fork staying in-memory; does not solve the cost problem for large datasets
- Valkey — Redis fork maintaining in-memory model with community governance; Kvrocks trades latency for storage cost savings
- Garnet — Microsoft's C# Redis-compatible cache; in-memory focused with different performance characteristics
FAQ
Q: Is Kvrocks slower than Redis? A: For hot data served from the RocksDB block cache or the OS page cache, Kvrocks latency approaches that of Redis. For cold data requiring disk reads, latency is higher (typically low single-digit milliseconds on NVMe SSDs), which is still acceptable for many applications.
Q: Which Redis commands are supported? A: Kvrocks supports the majority of Redis commands including strings, hashes, lists, sets, sorted sets, streams, pub/sub, Lua scripting, and transactions. Some cluster management commands differ slightly.
Q: Can I migrate from Redis to Kvrocks? A: Yes. Use redis-cli --pipe or a replication-based sync tool to transfer data. Since Kvrocks speaks the Redis protocol, applications typically need only a connection string change, provided every command they rely on is supported.
Q: How does Kvrocks handle large datasets? A: Kvrocks stores data on disk via RocksDB, so dataset size is limited by disk capacity rather than RAM. Clusters with terabytes of data are feasible on commodity SSD servers.
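For the migration question above, redis-cli --pipe expects its input as raw Redis protocol (RESP) commands. A minimal sketch of generating that input in Python follows; the helper names `resp_command` and `dump_pairs` and the sample keys are assumptions for illustration, and the output has not been validated against any particular Kvrocks version.

```python
def resp_command(*args) -> bytes:
    # Encode one command as a RESP array of bulk strings.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)

def dump_pairs(pairs) -> bytes:
    # Turn an iterable of (key, value) pairs into a stream of SET commands.
    return b"".join(resp_command("SET", key, value) for key, value in pairs)

if __name__ == "__main__":
    # Hypothetical sample data; a real migration would read from the source store.
    with open("load.resp", "wb") as out:
        out.write(dump_pairs([("user:1", "ada"), ("user:2", "bob")]))
```

The resulting file can then be piped to the target server, for example with `redis-cli -h <host> -p <port> --pipe < load.resp`, where host and port are placeholders for the Kvrocks instance.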