etcd — Distributed Reliable Key-Value Store for Critical Data
etcd is a strongly consistent, distributed key-value store for configuration, service discovery, and coordination. It uses the Raft consensus algorithm and powers Kubernetes, OpenShift, CoreOS, and many other distributed systems.
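Review before installing
This asset should be reviewed before installation. The copied instructions ask the agent to do a dry run and list the items it would write, then continue only after confirmation.
npx -y tokrepo@latest install 90135e1f-35f6-11f1-9bc6-00163e2b0d79 --target codex
Do a dry run first and confirm the items to be written before running this command.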
What it is
etcd is an open-source, strongly consistent, distributed key-value store written in Go. It uses the Raft consensus algorithm to ensure data reliability across a cluster of machines. etcd is the backbone of Kubernetes, storing all cluster state, configuration, and metadata.
Beyond Kubernetes, etcd serves as a building block for service discovery, distributed locking, leader election, and configuration management in any distributed system that needs a reliable source of truth.
How it saves time or tokens
Building a distributed consensus system from scratch is a multi-year effort. etcd provides a battle-tested implementation of Raft with a simple key-value API. Teams adopt etcd instead of implementing their own coordination layer, saving months of engineering. Its watch API enables reactive architectures in which services are notified of configuration changes as they are committed, rather than polling for them.
How to use
- Download the etcd binary from the official releases page or install via your package manager.
- Start a single-node cluster for development with etcd (no flags needed), or configure a multi-node cluster with peer URLs.
- Use etcdctl to read and write keys: etcdctl put mykey myvalue and etcdctl get mykey.
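For the multi-node case, every member is usually started with the same static list of peers. The sketch below builds that list and prints (without running) the start command for one member. The member names, IP addresses, and both helper functions are hypothetical, and plain HTTP on the default ports 2379 and 2380 is assumed for brevity; a production cluster would normally use TLS.

```shell
#!/bin/sh
# Build the value for etcd's --initial-cluster flag from name=ip pairs.
# Assumption: plain HTTP and the default peer port 2380 (illustrative only).
initial_cluster() {
    out=""
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first '='
        ip=${pair#*=}      # text after the first '='
        out="${out:+$out,}${name}=http://${ip}:2380"
    done
    printf '%s\n' "$out"
}

# Print (do not run) the start command for one member.
# Usage: print_member_command NAME IP NAME1=IP1 [NAME2=IP2 ...]
print_member_command() {
    member=$1; addr=$2; shift 2
    cat <<EOF
etcd --name ${member} \\
  --listen-peer-urls http://${addr}:2380 \\
  --initial-advertise-peer-urls http://${addr}:2380 \\
  --listen-client-urls http://${addr}:2379 \\
  --advertise-client-urls http://${addr}:2379 \\
  --initial-cluster $(initial_cluster "$@") \\
  --initial-cluster-state new
EOF
}
```

For example, print_member_command infra0 10.0.1.10 infra0=10.0.1.10 infra1=10.0.1.11 infra2=10.0.1.12 prints the command for the first of three members; the other two differ only in their own name and address.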
Example
# Start etcd locally
etcd --listen-client-urls http://localhost:2379 \
--advertise-client-urls http://localhost:2379
# Put and get a key
etcdctl put /config/db_host '192.168.1.100'
etcdctl get /config/db_host
# Output: /config/db_host
# 192.168.1.100
# Watch for changes
etcdctl watch /config/ --prefix
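The watch command above can feed a small handler so that a service reacts to changes instead of polling. This is a minimal sketch, assuming etcdctl's default plain-text watch output of three lines per event (event type, key, value, with an empty value line for a DELETE). The handle_events function and its reload/remove actions are hypothetical, and values containing newlines would break this simple parser.

```shell
#!/bin/sh
# Turn a stream of watch events into one action line per event.
# Assumed input: three lines per event (type, key, value).
handle_events() {
    while IFS= read -r kind && IFS= read -r key && IFS= read -r value; do
        case $kind in
            PUT)    printf 'reload %s=%s\n' "$key" "$value" ;;
            DELETE) printf 'remove %s\n' "$key" ;;
            *)      printf 'skip %s\n' "$kind" ;;
        esac
    done
}

# Against a live cluster (not executed here):
#   etcdctl watch /config/ --prefix | handle_events
```

For anything beyond a quick script, etcdctl's JSON output mode or a client library is a more robust way to consume events than line-based parsing.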
Related on TokRepo
- DevOps tools — Infrastructure and deployment automation
- Self-hosted solutions — Run critical services on your own infrastructure
Common pitfalls
- etcd is not a general-purpose database. It is designed for small amounts of critical metadata: the default request size limit is 1.5 MiB and the default storage quota is 2 GB. Do not store large blobs or high-volume transactional data.
- Cluster sizing matters. A 3-node cluster tolerates 1 failure, a 5-node cluster tolerates 2. Always run an odd number of nodes.
- Disk I/O latency directly affects etcd performance. Use SSDs for the data directory. Slow disks cause leader election timeouts and cluster instability.
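The size limit in the first pitfall can be checked before a write is attempted. The sketch below is an approximation: the fits_in_etcd helper is hypothetical, and the real limit applies to the whole request, so the key and protocol overhead count as well.

```shell
#!/bin/sh
# Default etcd request size limit: 1.5 MiB (the --max-request-bytes default).
MAX_REQUEST_BYTES=1572864

# Succeeds if the file is small enough to be stored as a single value.
# Approximation only: key length and request overhead are ignored.
fits_in_etcd() {
    size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding
    [ "$size" -le "$MAX_REQUEST_BYTES" ]
}

# Example guard before a write (not executed here):
#   fits_in_etcd app.yaml && etcdctl put /config/app "$(cat app.yaml)"
```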
FAQ
Kubernetes needs a reliable, consistent store for all cluster state: pod definitions, service endpoints, secrets, and config maps. etcd provides strong consistency guarantees via Raft consensus, which ensures that all Kubernetes API servers see the same data even during network partitions or node failures.
Run 3 nodes for most production deployments (tolerates 1 failure). Run 5 nodes if you need higher fault tolerance (tolerates 2 failures). More nodes increase write latency because every write must be acknowledged by a majority quorum. Avoid even node counts: a 4-node cluster still tolerates only 1 failure, so the extra node adds cost without adding fault tolerance.
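The sizing rule follows from majority arithmetic: a cluster of n members needs floor(n/2) + 1 of them to agree, and it can lose whatever is left over. A small sketch of that calculation (the function names are illustrative only):

```shell
#!/bin/sh
# Majority needed for a Raft cluster of n members: floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Members that can fail while a majority remains: n - quorum(n).
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

# Print quorum and fault tolerance for cluster sizes 1 through 7.
sizing_table() {
    printf 'members quorum tolerates\n'
    for n in 1 2 3 4 5 6 7; do
        printf '%d %d %d\n' "$n" "$(quorum "$n")" "$(fault_tolerance "$n")"
    done
}
```

Calling sizing_table shows that 4 members tolerate the same single failure as 3, and 6 the same two failures as 5, which is why even sizes bring no benefit.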
etcd and Redis serve different purposes. Redis is an in-memory data structure store optimized for speed and variety of data types. etcd is optimized for consistency and reliability of small configuration data. Consul overlaps more with etcd for service discovery but adds health checking and a service mesh.
If a minority of nodes fail, the cluster continues operating normally. Reads and writes proceed through the remaining majority. When a failed node comes back, it automatically catches up with the cluster. If a majority fails, the cluster loses quorum and stops accepting writes and linearizable reads until quorum is restored; surviving members can still answer serializable reads, which may return stale data.
Use etcdctl snapshot save to create a point-in-time snapshot of the entire datastore. Store these snapshots off-cluster. To restore, use etcdctl snapshot restore (in etcd 3.5 and later, this command is deprecated in favor of etcdutl snapshot restore). For Kubernetes clusters, this is the primary disaster recovery mechanism for cluster state.
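A backup routine along these lines might look like the following sketch. The function names, the file naming scheme, and the directory arguments are hypothetical. The restore step uses etcdutl, which recent releases (3.5 and later) provide for this purpose; on older releases, etcdctl snapshot restore takes the same arguments. Nothing runs when the file is loaded, and the backup and restore functions need a reachable cluster and the etcd tools on PATH.

```shell
#!/bin/sh
# Build a snapshot path such as /backups/etcd-20240101T000000Z.db.
# The second argument overrides the UTC timestamp (useful for testing).
snapshot_path() {
    dir=$1
    stamp=${2:-$(date -u +%Y%m%dT%H%M%SZ)}
    printf '%s/etcd-%s.db\n' "$dir" "$stamp"
}

# Save a snapshot from one endpoint into the given directory.
# Usage: backup_etcd http://localhost:2379 /backups
backup_etcd() {
    endpoint=$1
    file=$(snapshot_path "$2")
    etcdctl --endpoints="$endpoint" snapshot save "$file" && printf '%s\n' "$file"
}

# Restore a snapshot into a fresh data directory.
# Usage: restore_etcd /backups/etcd-....db /var/lib/etcd-restored
restore_etcd() {
    etcdutl snapshot restore "$1" --data-dir "$2"
}
```

After a restore, etcd is started with its data directory pointing at the restored location; copying the snapshot file off the cluster host remains a separate step.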
Related assets
MMKV — Efficient Mobile Key-Value Storage Framework by WeChat
MMKV is a high-performance key-value storage framework developed by WeChat. It uses memory-mapped files and protobuf encoding to deliver fast, reliable persistence on Android, iOS, macOS, Windows, and POSIX systems.
Apache Mesos — Distributed Systems Kernel for Data Center Resources
Apache Mesos abstracts CPU, memory, storage, and other compute resources across a cluster, enabling fault-tolerant distributed applications and frameworks to share infrastructure efficiently.
Apache Ignite — Distributed In-Memory Computing Platform
A distributed database and computing platform that combines in-memory speed with disk persistence, providing distributed SQL, key-value storage, and compute grid capabilities.
Apache ZooKeeper — Distributed Coordination Service for Reliable Systems
A centralized service for maintaining configuration, naming, synchronization, and group services across distributed applications in the Hadoop and Kafka ecosystems.