Configs · Apr 16, 2026 · 3 min read

Debezium — Real-Time Change Data Capture Platform

A distributed platform for streaming database changes into event logs, capturing row-level inserts, updates, and deletes from MySQL, PostgreSQL, MongoDB, and more.

AI Open Source · Community

TL;DR

Debezium captures row-level database changes and streams them to Kafka for real-time data pipelines.

§01

What it is

Debezium is a distributed platform for change data capture (CDC). It monitors database transaction logs and streams row-level inserts, updates, and deletes into Apache Kafka topics. Debezium supports MySQL, PostgreSQL, MongoDB, SQL Server, Oracle, Cassandra, and Db2.

Debezium targets data engineers and platform teams building real-time data pipelines, event-driven architectures, cache invalidation systems, and data warehouse synchronization.

§02

How it saves time or tokens

Debezium replaces polling-based data synchronization. Instead of querying databases on an interval to detect changes, it reads the transaction log and emits changes as they are committed. This reduces load on the database, captures intermediate updates and deletes that a poll would miss, and typically delivers events with low latency. Because Debezium runs on Kafka Connect, you configure connectors declaratively in JSON rather than writing capture code.
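To make the difference concrete, here is a minimal in-memory sketch, not Debezium code. The `Change` type, the `poll_diff` helper, and the sample rows are all hypothetical; the poller is modeled generously as a full snapshot comparison, whereas many real pollers rely on a timestamp column and cannot see deletes at all.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    op: str               # "c" = insert, "u" = update, "d" = delete
    key: int              # primary key of the row
    value: Optional[str]  # None for deletes


def apply(table: Dict[int, str], change: Change) -> None:
    """Apply one change to an in-memory 'table' keyed by primary key."""
    if change.op == "d":
        table.pop(change.key, None)
    else:
        table[change.key] = change.value


def poll_diff(previous: Dict[int, str], current: Dict[int, str]) -> List[Change]:
    """What an interval-based poller can infer by comparing two snapshots."""
    seen = []
    for key, value in current.items():
        if key not in previous:
            seen.append(Change("c", key, value))
        elif previous[key] != value:
            seen.append(Change("u", key, value))
    for key in previous:
        if key not in current:
            seen.append(Change("d", key, None))
    return seen


# Changes committed between two polls, in transaction-log order.
log = [
    Change("c", 1, "alice@old.com"),
    Change("u", 1, "alice@new.com"),
    Change("c", 2, "bob@example.com"),
    Change("d", 2, None),
]

table: Dict[int, str] = {}
before_poll = dict(table)
for change in log:
    apply(table, change)

polled = poll_diff(before_poll, table)  # what the poller can reconstruct
captured = list(log)                    # what a log reader sees
```

Here the log reader sees all four changes, while the poller reconstructs a single insert of row 1 with its final value; row 2 never appears because it was created and deleted between polls.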

§03

How to use

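Start the required infrastructure. The connector configuration in the next step expects a MySQL server reachable from the Connect container under the hostname mysql, so this sketch also starts the MySQL image from Debezium's tutorial. It also sets the group ID and storage-topic variables that the Connect image needs at startup (the topic names are arbitrary):

docker run -d --name zookeeper -p 2181:2181 quay.io/debezium/zookeeper
docker run -d --name kafka -p 9092:9092 \
  --link zookeeper quay.io/debezium/kafka
docker run -d --name mysql -p 3306:3306 \
  -e MYSQL_ROOT_PASSWORD=debezium quay.io/debezium/example-mysql
docker run -d --name connect -p 8083:8083 \
  -e GROUP_ID=1 -e CONFIG_STORAGE_TOPIC=connect_configs \
  -e OFFSET_STORAGE_TOPIC=connect_offsets -e STATUS_STORAGE_TOPIC=connect_statuses \
  --link kafka --link zookeeper --link mysql quay.io/debezium/connect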

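Once Kafka Connect is listening on port 8083, register a MySQL connector through its REST API. The database.server.id value must be unique among the servers and replication clients in the MySQL cluster:

curl -X POST http://localhost:8083/connectors -H 'Content-Type: application/json' -d '{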
  "name": "mysql-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "1",
    "topic.prefix": "dbserver1",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changes"
  }
}'

Consume change events from Kafka topics named dbserver1.<database>.<table>, where dbserver1 is the topic.prefix value from the connector configuration.
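The registration call and the topic naming rule can also be scripted. The following is a sketch using only the Python standard library; the function names (`connector_payload`, `topic_for`, `register`) are hypothetical, the config values simply mirror the curl example, and the `inventory.customers` table is borrowed from the event example further down the page.

```python
import json
import urllib.request


def connector_payload(name: str, topic_prefix: str, server_id: int) -> dict:
    """Build the same registration body as the curl call above."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": "mysql",
            "database.port": "3306",
            "database.user": "debezium",
            "database.password": "dbz",
            "database.server.id": str(server_id),
            "topic.prefix": topic_prefix,
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": "schema-changes",
        },
    }


def topic_for(topic_prefix: str, database: str, table: str) -> str:
    """Name of the topic that carries change events for one table."""
    return f"{topic_prefix}.{database}.{table}"


def register(connect_url: str, payload: dict) -> int:
    """POST the payload to Kafka Connect and return the HTTP status code."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = connector_payload("mysql-connector", "dbserver1", 1)
customers_topic = topic_for(payload["config"]["topic.prefix"], "inventory", "customers")
# register("http://localhost:8083", payload)  # needs the containers to be running
```

Building the payload in one place keeps the topic prefix used for registration and the topic names used by consumers from drifting apart.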

§04

Example

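A simplified Debezium change event for an update. A real event's source block carries more fields, and depending on converter settings the event may be wrapped in a schema/payload envelope: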

{
  "before": {"id": 1, "name": "Alice", "email": "alice@old.com"},
  "after": {"id": 1, "name": "Alice", "email": "alice@new.com"},
  "source": {"db": "inventory", "table": "customers"},
  "op": "u",
  "ts_ms": 1713000000000
}
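A consumer usually branches on the op code and compares before with after. The sketch below assumes the event arrives as the bare payload shown above, without a schema envelope; the `describe` helper and its return shape are hypothetical.

```python
import json

# Operation codes used in Debezium change events.
OPS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def describe(event: dict) -> dict:
    """Summarize one change event: table, operation, and changed columns."""
    before = event.get("before") or {}  # null for inserts
    after = event.get("after") or {}    # null for deletes
    changed = {
        column: (before.get(column), after.get(column))
        for column in sorted(set(before) | set(after))
        if before.get(column) != after.get(column)
    }
    source = event["source"]
    return {
        "table": f"{source['db']}.{source['table']}",
        "operation": OPS.get(event["op"], "unknown"),
        "changed": changed,
    }


event = json.loads("""
{
  "before": {"id": 1, "name": "Alice", "email": "alice@old.com"},
  "after": {"id": 1, "name": "Alice", "email": "alice@new.com"},
  "source": {"db": "inventory", "table": "customers"},
  "op": "u",
  "ts_ms": 1713000000000
}
""")
summary = describe(event)
```

For the sample event, `summary` reports an update on `inventory.customers` in which only the `email` column changed.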

§05

Related on TokRepo

Database tools — database utilities and connectors
DevOps tools — infrastructure and data pipeline resources

§06

Common pitfalls

MySQL requires binlog_format=ROW and binlog_row_image=FULL. Without these settings, Debezium cannot capture complete change events.
Initial snapshots of large tables can take hours and put load on the source database. Schedule initial snapshots during low-traffic periods.
Kafka topic retention must outlast your downstream consumer lag. If consumers fall further behind than the retention period, they lose events when Kafka deletes the expired log segments.
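The first and third pitfalls lend themselves to a preflight check. This is a sketch under assumptions: `variables` stands for whatever your tooling returns from MySQL's `SHOW VARIABLES`, and the function names and the safety `margin` are hypothetical choices, not Debezium or Kafka defaults.

```python
from typing import Dict, List

# Settings named in the pitfalls above.
REQUIRED_MYSQL_SETTINGS = {"binlog_format": "ROW", "binlog_row_image": "FULL"}


def mysql_problems(variables: Dict[str, str]) -> List[str]:
    """Return one message per MySQL setting that would break full change capture."""
    problems = []
    for name, expected in REQUIRED_MYSQL_SETTINGS.items():
        actual = str(variables.get(name, "")).upper()
        if actual != expected:
            problems.append(f"{name} is {actual or 'unset'}, expected {expected}")
    return problems


def retention_is_safe(retention_ms: int, worst_lag_ms: int, margin: float = 2.0) -> bool:
    """True if topic retention covers the worst observed consumer lag with headroom."""
    return retention_ms >= worst_lag_ms * margin


problems = mysql_problems({"binlog_format": "MIXED", "binlog_row_image": "FULL"})
week_ms = 7 * 24 * 3600 * 1000
six_hours_ms = 6 * 3600 * 1000
safe = retention_is_safe(week_ms, six_hours_ms)
```

With the sample input, `problems` contains a single message about `binlog_format`, and a one-week retention comfortably covers a six-hour worst-case lag.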

Frequently Asked Questions

Does Debezium require Kafka?

The primary deployment uses Kafka Connect. However, Debezium Server provides a standalone runtime that can send events to Amazon Kinesis, Google Pub/Sub, Apache Pulsar, and other messaging systems without Kafka.

Which databases does Debezium support?

Debezium supports MySQL, PostgreSQL, MongoDB, SQL Server, Oracle, Db2, Cassandra, and Vitess. Each database has a dedicated connector that reads its specific transaction log format.

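How does CDC differ from polling?

CDC reads the database transaction log, so it captures every committed change in order, including deletes and intermediate updates, usually with low latency. Polling queries the database on an interval: it sees only the state at poll time, misses changes that were overwritten or deleted between polls, and adds query load to the database.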

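Can Debezium handle schema changes?

Yes. Connectors such as MySQL read DDL statements from the transaction log and record them in a schema history topic, which the connector uses internally to interpret older log entries. Downstream consumers can detect added, removed, or modified columns from the schema carried with each change event, or from the connector's optional schema change topic.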

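What happens if the connector goes down?

Debezium stores its position in the transaction log as Kafka Connect offsets. When the connector restarts, it resumes from the last committed offset, so no changes are skipped as long as the database still retains that part of its log. Delivery is at-least-once: after an unclean shutdown, events emitted since the last offset commit can be delivered again, so consumers should tolerate duplicates.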
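Kafka Connect commits offsets periodically, so events emitted after the last commit can be replayed following an unclean restart. Below is a minimal sketch of a consumer that tolerates such replays. It assumes MySQL change events whose source block includes the binlog `file`, `pos`, and `row` fields, and that binlog file names sort in log order; the `IdempotentApplier` class is hypothetical.

```python
from typing import List, Optional, Tuple

Position = Tuple[str, int, int]


def position(event: dict) -> Position:
    """Log position of an event: (binlog file, byte offset, row within the event)."""
    source = event["source"]
    return (source["file"], source["pos"], source.get("row", 0))


class IdempotentApplier:
    """Applies each change at most once by skipping already-seen positions."""

    def __init__(self) -> None:
        self.last: Optional[Position] = None
        self.applied: List[dict] = []

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a replay."""
        current = position(event)
        if self.last is not None and current <= self.last:
            return False
        self.applied.append(event)  # stand-in for the real downstream write
        self.last = current
        return True


def make_event(pos: int) -> dict:
    return {"op": "u", "source": {"file": "mysql-bin.000003", "pos": pos, "row": 0}}


applier = IdempotentApplier()
# The event at position 200 is delivered twice, as it could be after a crash.
results = [applier.handle(make_event(p)) for p in (100, 200, 200, 300)]
```

In a real consumer the last applied position would need to be stored atomically with the downstream write; otherwise the consumer's own restart reintroduces the duplicate problem.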
