Apr 15, 2026 · 3 min read

BadgerDB — Fast Embeddable Key-Value Store in Pure Go

A pure-Go LSM-tree database built for SSDs, the storage engine behind Dgraph. No CGO, ACID transactions, encryption at rest, tiny binary — ideal for Go services that need embedded persistence.

Introduction

BadgerDB is an embeddable, persistent, fast key-value database written in pure Go by the Dgraph team. Unlike traditional LSM engines, Badger separates keys from values (WiscKey design) to minimise write amplification on SSDs, while providing serializable ACID transactions and a snappy Go-native API.

What Badger Does

  • Stores ordered byte keys with ACID transactions and MVCC snapshots.
  • Splits keys (LSM tree) from values (value log) for low write amplification.
  • Supports TTL on keys, useful for caches and session stores.
  • Offers streaming backup/restore and online compaction.
  • Provides iterators with prefetching for scan-heavy workloads.

Architecture Overview

Keys and small values live in a classic LSM tree with memtables and SSTables. Larger values are appended to a value log and referenced by pointers in the LSM, so compaction rewrites only the pointers, not the value bytes. The database runs entirely inside your Go process: reads check the memtables first, then Bloom-filtered SSTables, then follow the pointer into the value log. Garbage collection rewrites value-log files to reclaim space left by deletes and updates.

Self-Hosting & Configuration

  • Pure Go, no CGO: cross-compile to any target with GOOS=linux GOARCH=arm64 go build.
  • Tune ValueLogFileSize, MemTableSize and NumMemtables for write-heavy vs read-heavy loads.
  • Run db.RunValueLogGC(0.5) periodically to reclaim space in the value log.
  • Enable AES-GCM encryption at rest with WithEncryptionKey for sensitive data.
  • Use Badger in-memory mode (InMemory: true) for tests and ephemeral caches.

Key Features

  • Serializable snapshot isolation with optimistic concurrency.
  • Managed mode for custom timestamping (used by Dgraph for transactions).
  • Stream framework for parallel key-space scans into snapshots or external systems.
  • Prometheus metrics and structured logging out of the box.
  • Tiny dependency footprint: a couple hundred KB in your binary.

Comparison with Similar Tools

  • RocksDB / gorocksdb — more features & tuning knobs, but CGO and C++ build pain.
  • bbolt — B+tree, simpler, but single-writer and much slower writes.
  • Pebble — CockroachDB's RocksDB-compatible Go LSM, great alternative.
  • LevelDB / goleveldb — older design, and goleveldb is effectively unmaintained; Badger is faster and feature-richer.
  • SQLite — structured data, but not optimised for plain KV throughput.

FAQ

Q: Is Badger good for large values?
A: Yes: the value log design keeps write amplification low even for MB-sized blobs.

Q: Can I access Badger from multiple processes?
A: No. It is embedded and takes an exclusive lock on its directory; expose it via your own gRPC/HTTP service.

Q: How does Badger compare to Pebble?
A: Both are pure-Go LSMs. Pebble is RocksDB-compatible and used by CockroachDB; Badger powers Dgraph and has the WiscKey key/value split design.

Q: Is Badger production-ready?
A: Yes. It has been in production at Dgraph, FastJet and many other Go services since 2017.
