Apache Pulsar — Cloud-Native Distributed Messaging and Streaming
Apache Pulsar is a cloud-native distributed messaging and streaming platform. It combines queue-style messaging (like RabbitMQ) with log-style event streaming (like Kafka), and provides multi-tenancy, geo-replication, and tiered storage in a single system.
What it is
Apache Pulsar is a cloud-native distributed messaging and streaming platform that combines the capabilities of traditional message queues (like RabbitMQ) with event streaming (like Kafka) in a single system. It provides multi-tenancy, geo-replication, and tiered storage as built-in features rather than add-ons.
Pulsar is designed for platform teams and backend engineers who need a unified messaging layer that scales from simple pub-sub to complex event streaming without deploying separate systems for each use case.
How it saves time or tokens
Pulsar's architecture separates serving (brokers) from storage (Apache BookKeeper), so throughput and storage capacity can be scaled independently. Because brokers hold no durable message data, adding or replacing a broker does not require copying partition data, which avoids much of the rebalancing pain common in systems that couple brokers to storage. Multi-tenancy is built in: a single Pulsar cluster can serve multiple teams with namespace-level isolation, reducing operational overhead.
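To make the namespace-level isolation concrete: Pulsar addresses every topic as persistent://tenant/namespace/topic (or non-persistent://...), and a short name such as my-topic resolves to the public tenant and default namespace. The sketch below builds and splits such names. The helper names and the payments/billing example are hypothetical; this is an illustration of the naming scheme, not part of the Pulsar client API.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str      # unit of isolation for a team or organisation
    namespace: str   # grouping of topics inside a tenant
    topic: str       # local topic name


def parse_topic(name: str) -> TopicName:
    """Split a Pulsar topic name into its parts (hypothetical helper)."""
    if "://" not in name:
        # Short names resolve to the default tenant and namespace.
        return TopicName("persistent", "public", "default", name)
    domain, _, rest = name.partition("://")
    parts = rest.split("/", 2)
    if (domain not in ("persistent", "non-persistent")
            or len(parts) != 3 or not all(parts)):
        raise ValueError(f"not a valid topic name: {name!r}")
    return TopicName(domain, *parts)


def full_name(t: TopicName) -> str:
    """Render the fully qualified form of a topic name."""
    return f"{t.domain}://{t.tenant}/{t.namespace}/{t.topic}"
```

For example, full_name(parse_topic("my-topic")) gives persistent://public/default/my-topic, while a team-owned topic might be written persistent://payments/billing/invoices.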
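The unified messaging model means you do not need to run separate Kafka clusters for streaming and RabbitMQ for queuing. One Pulsar cluster handles both patterns, and the choice is made per subscription rather than per cluster: for example, an exclusive subscription for ordered streaming or a shared subscription for work-queue consumption.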
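The queuing and streaming patterns differ mainly in how a subscription hands messages to its consumers. The toy model below contrasts the two: an exclusive subscription delivers everything, in order, to a single consumer, while a shared subscription spreads messages across consumers. It is a simplified sketch, not Pulsar's dispatcher; the function name is hypothetical, and real shared dispatch also depends on consumer availability rather than being strict round-robin.

```python
from itertools import cycle


def dispatch(messages, consumers, subscription_type):
    """Toy model of message delivery for one subscription.

    Returns a dict mapping each consumer name to the messages it receives.
    """
    delivered = {name: [] for name in consumers}
    if subscription_type == "exclusive":
        # Streaming style: exactly one consumer, which sees every message in order.
        if len(consumers) != 1:
            raise ValueError("an exclusive subscription allows one consumer")
        delivered[consumers[0]].extend(messages)
    elif subscription_type == "shared":
        # Queue style: messages are spread across consumers (round-robin here).
        for msg, name in zip(messages, cycle(consumers)):
            delivered[name].append(msg)
    else:
        raise ValueError(f"unknown subscription type: {subscription_type}")
    return delivered
```

With the Python client, the corresponding choice is made through the consumer_type argument of client.subscribe (pulsar.ConsumerType.Exclusive or pulsar.ConsumerType.Shared).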
How to use
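- Start Pulsar with Docker:
  docker run -d --name pulsar -p 6650:6650 -p 8080:8080 apachepulsar/pulsar:latest bin/pulsar standalone
- Produce a message (the CLI ships inside the container, so run it with docker exec):
  docker exec -it pulsar bin/pulsar-client produce my-topic --messages 'hello pulsar'
- Consume messages (--num-messages 0 keeps consuming until interrupted; a new subscription starts at the latest message by default, so start the consumer before producing):
  docker exec -it pulsar bin/pulsar-client consume my-topic -s my-subscription --num-messages 0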
Example
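# Start Pulsar standalone in Docker
docker run -d --name pulsar \
  -p 6650:6650 -p 8080:8080 \
  apachepulsar/pulsar:latest bin/pulsar standalone
# Produce messages (the CLI lives inside the container)
docker exec -it pulsar bin/pulsar-client produce my-topic --messages 'hello pulsar'
# Consume messages (--num-messages 0 runs until interrupted; a new subscription
# starts at the latest message, so start this before producing)
docker exec -it pulsar bin/pulsar-client consume my-topic -s my-subscription --num-messages 0
# Python client (runs on the host, connecting to the published port 6650)
pip install pulsar-client
import pulsar
# Connect to the standalone broker over the Pulsar binary protocol
client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer('my-topic')
# Payloads are bytes, so encode the string before sending
producer.send('hello from python'.encode())
# Closing the client also closes its producers
client.close()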
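The Python example above only produces. A matching consumer might look like the sketch below, assuming the same standalone broker on localhost:6650 and the topic and subscription names used earlier. The function name receive_one and its defaults are hypothetical; the client calls (subscribe, receive, acknowledge) are from the pulsar-client package.

```python
def receive_one(service_url="pulsar://localhost:6650",
                topic="my-topic",
                subscription="my-subscription",
                timeout_millis=5000):
    """Receive, acknowledge, and return a single message as text."""
    # Imported here so this module still loads where pulsar-client is absent.
    import pulsar

    client = pulsar.Client(service_url)
    try:
        consumer = client.subscribe(
            topic,
            subscription,
            # Start from the oldest retained message if the subscription is new;
            # the default is to start at the latest message.
            initial_position=pulsar.InitialPosition.Earliest,
        )
        # Raises an exception if nothing arrives within the timeout.
        msg = consumer.receive(timeout_millis=timeout_millis)
        # Acknowledge so the broker does not redeliver this message.
        consumer.acknowledge(msg)
        return msg.data().decode()
    finally:
        # Closing the client also closes the consumer.
        client.close()
```

Calling receive_one() after the producer has run should return the text that was sent, provided the broker is reachable.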
Related on TokRepo
- DevOps tools -- infrastructure and operations tooling
- Self-hosted solutions -- open-source self-hosted platforms
Common pitfalls
- Pulsar standalone mode is for development only; production deployments run brokers, a BookKeeper ensemble, and a metadata store (ZooKeeper by default) as separate clusters, which adds operational complexity.
- The broker-storage separation is powerful but means more moving parts to monitor; invest in observability (Prometheus metrics are built in) from day one.
- Client library support varies by language; Java and Python clients are most mature, while Go and Node.js clients may lag in feature parity.
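As a starting point for the observability advice above, the broker serves metrics in Prometheus text format over its HTTP port. The sketch below parses that format into a dictionary and, optionally, fetches it from a standalone broker. The URL assumes the standalone defaults used earlier (port 8080, path /metrics/), and both function names are hypothetical; in production a Prometheus server would scrape the endpoint instead.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition into {series: value}.

    A simplified parser: it skips comment lines and ignores optional timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        if "}" in line:
            # Series with labels: keep everything up to the closing brace.
            series, _, rest = line.rpartition("}")
            series += "}"
        else:
            # Series without labels: the name ends at the first space.
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if fields:
            samples[series] = float(fields[0])
    return samples


def fetch_metrics(url="http://localhost:8080/metrics/"):
    """Fetch and parse metrics from a running broker (assumed URL)."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With a standalone broker running, fetch_metrics() returns the current samples, which is enough for a quick look before a full Prometheus setup is in place.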
Frequently Asked Questions
How does Pulsar differ from Kafka?
Pulsar separates compute (brokers) from storage (BookKeeper), enabling independent scaling. Kafka couples brokers and storage, so scaling typically involves rebalancing partitions across brokers. Pulsar also provides built-in multi-tenancy and geo-replication, for which Kafka requires additional tooling.
How does multi-tenancy work in Pulsar?
Pulsar supports tenant and namespace isolation at the cluster level. Different teams or applications can share a single Pulsar cluster with independent topic namespaces, access controls, and resource quotas.
Does Pulsar support exactly-once delivery?
Yes. Pulsar supports exactly-once semantics through transactional messaging. Producers can send messages within a transaction and consumers can acknowledge messages within the same transaction, so a consume-process-produce step either commits as a whole or not at all, without duplicates or losses.
What is tiered storage?
Tiered storage automatically offloads older messages from BookKeeper to cheaper object storage (S3, GCS, Azure Blob). This lets you retain months or years of data without the cost of keeping it all on fast storage.
Does Pulsar support stream processing?
Yes. Pulsar Functions is a lightweight compute framework for processing messages in-flight. Functions can transform, route, or enrich messages without deploying a separate stream processing framework.
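As a rough illustration of what such a function looks like, Pulsar Functions accepts a plain Python function named process that takes one input message and returns the value to publish. The sketch below enriches each message with an uppercase copy; the transformation itself is a made-up example, and deployment details (input and output topics) are left to the Pulsar Functions documentation.

```python
def process(input):
    """Transform one incoming message; the return value is published to the output topic."""
    # Hypothetical enrichment: append an uppercase copy of the payload.
    return "{} ({})".format(input, input.upper())
```

Because the function is plain Python, it can be unit-tested locally before it is deployed.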