# Apache Kafka: Distributed Event Streaming Platform

> Apache Kafka is the open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, and mission-critical applications. Companies such as LinkedIn, Netflix, and Uber run it at a scale of trillions of messages per day.

## Quick Use

Start Kafka locally in KRaft mode (no ZooKeeper required; KRaft has been production-ready for new clusters since v3.3, and ZooKeeper mode was deprecated in v3.5):

```bash
# Download and unpack the release
# (older releases may be moved to https://archive.apache.org/dist/kafka/)
curl -O https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
tar xzf kafka_2.13-3.7.0.tgz && cd kafka_2.13-3.7.0

# Generate a cluster ID and format the storage directory
KAFKA_CLUSTER_ID=$(bin/kafka-storage.sh random-uuid)
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties

# Start the broker
bin/kafka-server-start.sh config/kraft/server.properties
```

Produce and consume:

```bash
# Create a topic
bin/kafka-topics.sh --create --topic orders --bootstrap-server localhost:9092

# Producer: type one message per line at the ">" prompt, Ctrl-C to exit
bin/kafka-console-producer.sh --topic orders --bootstrap-server localhost:9092
> { "id": 1, "amount": 49.99 }

# Consumer (in a second terminal): read the topic from the first offset
bin/kafka-console-consumer.sh --topic orders --from-beginning --bootstrap-server localhost:9092
```

## Intro

Apache Kafka is a distributed event streaming platform originally created at LinkedIn by Jay Kreps, Neha Narkhede, and Jun Rao. It was open-sourced in 2011 and is now a project of the Apache Software Foundation. Kafka powers data pipelines at thousands of companies, handling trillions of messages per day.

- **Repo**: https://github.com/apache/kafka
- **Stars**: 32K+
- **Language**: Java + Scala
- **License**: Apache 2.0

## What Kafka Does

- **Publish and subscribe**: producers write, consumers read
- **Topics and partitions**: horizontally scalable logs
- **Persistence**: durable disk storage with configurable retention
- **Replication**: per-partition replicas across brokers
- **Consumer groups**: parallel consumption with automatic rebalancing
- **Streams API**: stateful stream processing
- **Connect**: pre-built integrations (JDBC, S3, Elastic, etc.)
- **Exactly-once**: transactional semantics
- **KRaft**: Raft-based metadata management (replaces ZooKeeper)

## Architecture

Brokers form a cluster, and each broker holds replicas of some partitions. Producers append records to a topic's partitions; a record with a key is assigned to a partition by hashing the key, so records that share a key land in the same partition, in order, as long as the partition count does not change. Consumers pull from partitions and track their position as offsets. In KRaft mode (production-ready since v3.3), controller nodes manage cluster metadata through a Raft quorum instead of ZooKeeper.

## Self-Hosting

```yaml
# Docker Compose (single broker, KRaft mode)
version: "3"
services:
  kafka:
    image: bitnami/kafka:3.7
    ports:
      - "9092:9092"
    environment:
      KAFKA_CFG_NODE_ID: 1
      KAFKA_CFG_PROCESS_ROLES: controller,broker
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
```

## Key Features

- Distributed commit log
- Horizontal scale via partitions
- Replication for durability
- Consumer groups for parallelism
- Exactly-once transactional semantics
- Kafka Connect ecosystem
- Kafka Streams for stateful processing
- KRaft mode (no ZooKeeper)
- MirrorMaker for cross-cluster replication
- Schema Registry (a separate Confluent component)

## Comparison

| System | Model | Durability | Ecosystem |
|---|---|---|---|
| Kafka | Distributed log | Disk + replicas | Largest |
| Redpanda | Kafka-compatible (C++) | Disk + replicas | Kafka-compatible |
| Pulsar | Segmented storage | BookKeeper | Growing |
| NATS JetStream | Streaming | Disk | Simpler |
| RabbitMQ | Traditional MQ | Persistent queues | Mature |

## FAQ

**Q: How does Kafka differ from RabbitMQ?**
A: Kafka is a distributed log (durable storage, time-based retention, high throughput); RabbitMQ is a traditional message queue (FIFO, acks, routing). Choose Kafka for streaming data and event sourcing; choose RabbitMQ for asynchronous task queues.

**Q: Is ZooKeeper still required?**
A: No. KRaft mode has been production-ready since v3.3, and ZooKeeper mode was deprecated in v3.5. New clusters no longer need ZooKeeper, which simplifies deployment.

**Q: How does it perform?**
A: A single broker comfortably handles hundreds of thousands of messages per second. LinkedIn has reported processing about 7 trillion messages per day across its Kafka deployment.

## Sources

- Docs: https://kafka.apache.org/documentation
- GitHub: https://github.com/apache/kafka
- License: Apache 2.0

---

Source: https://tokrepo.com/en/workflows/a2aa8afb-35f3-11f1-9bc6-00163e2b0d79

Author: Script Depot
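
The key-based hashing described under Architecture can be sketched in a few lines of shell. This is a minimal illustration, not Kafka's actual partitioner: `partition_for_key` is a hypothetical helper, and `cksum` (CRC-32) stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key. It only shows the property that matters: for a fixed partition count, the same key always maps to the same partition.

```bash
#!/usr/bin/env bash
# Illustrative only: Kafka's built-in partitioner hashes the serialized key
# with murmur2; cksum (CRC-32) is a stand-in so the sketch runs anywhere.

# partition_for_key KEY NUM_PARTITIONS -> prints a partition index
# (hypothetical helper, not part of the Kafka CLI)
partition_for_key() {
  local key="$1" partitions="$2" hash
  # cksum prints "<checksum> <byte count>"; keep only the checksum
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $(( hash % partitions ))
}

# Route a few keyed records across 3 partitions;
# both customer-1 records get the same partition index
for key in customer-1 customer-2 customer-1; do
  echo "key=$key -> partition $(partition_for_key "$key" 3)"
done
```

Because the index is the hash modulo the partition count, changing the partition count changes the mapping. That is why adding partitions to an existing topic can break per-key ordering for keys that were already in use.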