Alibaba Canal — MySQL Binlog Incremental Subscription and Consumption

Introduction

Canal is an open-source incremental data subscription and consumption platform developed by Alibaba. It parses MySQL binlog events in real time, acting as a MySQL replica, and delivers row-level change events to downstream consumers for cache invalidation, search index updates, data warehousing, and cross-database synchronization.

What Canal Does

Captures MySQL row-level changes (INSERT, UPDATE, DELETE) by parsing binary logs in real time
Simulates a MySQL slave to subscribe to binlog streams, adding little load to the source database beyond that of an ordinary replica
Delivers change events to Kafka, RocketMQ, or RabbitMQ as message queue sinks, and to Elasticsearch through Canal Adapter
Supports filtering by database, table, and column to reduce unnecessary event processing
Provides a client API for building custom change data capture consumers in Java
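
The database and table filtering listed above can be pictured as matching each event's schema.table name against a list of regular expressions. The sketch below is a minimal illustration in plain Java, assuming a comma-separated pattern list similar in spirit to Canal's filter setting. The TableFilter class and its methods are hypothetical and are not part of Canal's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: decides whether a change on schema.table should be processed.
class TableFilter {
    private final List<Pattern> patterns = new ArrayList<>();

    // rules: comma-separated regular expressions, each matched against "schema.table".
    TableFilter(String rules) {
        for (String rule : rules.split(",")) {
            String trimmed = rule.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed));
            }
        }
    }

    // Returns true when the full "schema.table" name matches at least one rule.
    boolean accepts(String schema, String table) {
        String fullName = schema + "." + table;
        for (Pattern p : patterns) {
            if (p.matcher(fullName).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Accept every table in "shop" plus the single table "crm.customers".
        TableFilter filter = new TableFilter("shop\\..*, crm\\.customers");
        System.out.println(filter.accepts("shop", "orders"));    // true
        System.out.println(filter.accepts("crm", "customers"));  // true
        System.out.println(filter.accepts("crm", "leads"));      // false
    }
}
```

Filtering early like this keeps unrelated tables from ever reaching the consumer's business logic.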

Architecture Overview

Canal Server connects to MySQL as a replication slave, receiving binlog events through the MySQL replication protocol. The server parses these binary events into structured row-change objects. Canal instances are organized by destination (one per source database). A Canal Client or Canal Adapter connects to the server and consumes parsed events. The Admin component provides a web UI for managing instances and monitoring lag.
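
To make the structured row-change objects more concrete, here is a minimal sketch of what a consumer might do with one, using cache invalidation as the use case. The RowChange and CacheInvalidator types are simplified stand-ins invented for this example, and the primary key column is assumed to be named id. Canal's real client classes carry more metadata and are not reproduced here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for one parsed row-level change.
class RowChange {
    enum Type { INSERT, UPDATE, DELETE }

    final Type type;
    final String schema;
    final String table;
    final Map<String, String> before; // column values before the change (empty for INSERT)
    final Map<String, String> after;  // column values after the change (empty for DELETE)

    RowChange(Type type, String schema, String table,
              Map<String, String> before, Map<String, String> after) {
        this.type = type;
        this.schema = schema;
        this.table = table;
        this.before = before;
        this.after = after;
    }
}

// Hypothetical consumer: keeps a cache keyed by "table:id" in step with row changes.
class CacheInvalidator {
    final Map<String, Map<String, String>> cache = new HashMap<>();

    void apply(RowChange change) {
        boolean isDelete = change.type == RowChange.Type.DELETE;
        // A DELETE only has the old row image, so the key is read from `before`.
        Map<String, String> image = isDelete ? change.before : change.after;
        String key = change.table + ":" + image.get("id"); // assumes a primary key column "id"
        if (isDelete) {
            cache.remove(key);
        } else {
            cache.put(key, new LinkedHashMap<>(image));
        }
    }

    public static void main(String[] args) {
        CacheInvalidator consumer = new CacheInvalidator();
        consumer.apply(new RowChange(RowChange.Type.INSERT, "shop", "orders",
                Map.of(), Map.of("id", "7", "status", "new")));
        consumer.apply(new RowChange(RowChange.Type.UPDATE, "shop", "orders",
                Map.of("id", "7", "status", "new"), Map.of("id", "7", "status", "paid")));
        System.out.println(consumer.cache.get("orders:7").get("status")); // paid
        consumer.apply(new RowChange(RowChange.Type.DELETE, "shop", "orders",
                Map.of("id", "7", "status", "paid"), Map.of()));
        System.out.println(consumer.cache.containsKey("orders:7")); // false
    }
}
```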

Self-Hosting & Configuration

Enable binlog on MySQL with ROW format and create a replication user for Canal
Deploy Canal Server and configure instance.properties with MySQL connection details
Use Canal Admin for web-based instance management and monitoring
Configure Canal Adapter to sink changes directly to Elasticsearch, HBase, or RDB targets
Deploy Canal in cluster mode with ZooKeeper for high availability and failover
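
As a rough illustration of the instance.properties step, the sketch below loads an instance.properties-style text with java.util.Properties and reports which MySQL connection settings are missing. The three key names (canal.instance.master.address, canal.instance.dbUsername, canal.instance.dbPassword) are believed to match Canal's sample instance configuration but should be checked against the deployed version. The sample values and the InstanceConfigCheck class are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check for an instance.properties file; not part of Canal.
class InstanceConfigCheck {
    // Assumed key names for the MySQL connection details.
    static final String[] REQUIRED_KEYS = {
        "canal.instance.master.address",
        "canal.instance.dbUsername",
        "canal.instance.dbPassword"
    };

    // Returns the required keys that are absent or blank in the given properties text.
    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample values only; the replication user must already exist on MySQL.
        String sample =
            "canal.instance.master.address=127.0.0.1:3306\n" +
            "canal.instance.dbUsername=canal\n" +
            "canal.instance.dbPassword=\n";
        // Prints: Missing settings: [canal.instance.dbPassword]
        System.out.println("Missing settings: " + missingKeys(sample));
    }
}
```

A check like this only validates the file's shape. Whether the account actually holds replication privileges and whether binlog is in ROW format still has to be confirmed on the MySQL side.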

Key Features

Low-latency change data capture, with binlog events parsed as they stream in, typically in under a second
Cluster mode with ZooKeeper-based HA for automatic failover between Canal instances
Built-in adapters for Elasticsearch, RDB, and HBase, plus direct delivery to Kafka, all without custom code
Position tracking and resumption to handle restarts without data loss
Support for MySQL, MariaDB, and PolarDB-X as source databases
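
Position tracking and resumption can be illustrated with a small model: the consumer fetches a batch, processes it, and only then acknowledges it, so a failure before the acknowledgement leads to redelivery rather than loss. The EventLog class below is a hypothetical in-memory stand-in written for this explanation. It does not reproduce Canal's storage or client API, and real binlog positions are file-and-offset pairs rather than list indexes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of "fetch, process, then acknowledge" delivery.
class EventLog {
    private final List<String> events = new ArrayList<>();
    private int ackedPosition = 0;   // everything before this index is confirmed consumed
    private int fetchedPosition = 0; // everything before this index has been handed out

    void append(String event) {
        events.add(event);
    }

    // Hands out up to batchSize events after the last fetched position, without confirming them.
    List<String> fetch(int batchSize) {
        int end = Math.min(events.size(), fetchedPosition + batchSize);
        List<String> batch = new ArrayList<>(events.subList(fetchedPosition, end));
        fetchedPosition = end;
        return batch;
    }

    // Confirms everything fetched so far; a restart resumes after this point.
    void ack() {
        ackedPosition = fetchedPosition;
    }

    // Simulates a consumer failure or restart: unconfirmed events are delivered again.
    void rollback() {
        fetchedPosition = ackedPosition;
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        for (int i = 1; i <= 5; i++) {
            log.append("event-" + i);
        }
        System.out.println(log.fetch(2)); // [event-1, event-2]
        log.ack();                        // first batch processed successfully
        System.out.println(log.fetch(2)); // [event-3, event-4], consumer fails before ack
        log.rollback();
        System.out.println(log.fetch(2)); // [event-3, event-4] again
    }
}
```

Because an unacknowledged batch can be delivered twice, this style of tracking gives at-least-once delivery, and consumers are safest when their handling of each event is idempotent.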

Comparison with Similar Tools

Debezium — Kafka Connect-based CDC; Canal is standalone and lighter for MySQL-only workloads
Maxwell — MySQL CDC to Kafka; Canal offers more sinks, clustering, and a management UI
MySQL Replication — Native replication syncs whole databases; Canal enables selective, event-driven consumption
AWS DMS — Managed migration service; Canal is self-hosted with no cloud vendor dependency
Flink CDC — Stream processing CDC; Canal focuses on capture and delivery, pairs well with Flink downstream

FAQ

Q: Does Canal modify the source MySQL database? A: No. Canal connects as a read-only replication slave. It only reads binlog events and never writes to the source.

Q: What MySQL binlog format does Canal require? A: Canal requires ROW format binlog. STATEMENT and MIXED formats do not provide the row-level detail Canal needs.

Q: Can Canal handle schema changes (DDL)? A: Yes. Canal parses DDL events and updates its internal schema cache. Consumers receive DDL events alongside DML changes.

Q: How does Canal compare to Debezium for MySQL CDC? A: Canal is lighter and standalone for MySQL-focused use cases. Debezium supports more databases and integrates deeply with Kafka Connect for broader ecosystems.
