Apr 21, 2026 · 3 min read

Alibaba Canal — MySQL Binlog Incremental Subscription and Consumption

A guide to Canal, the open-source platform that captures MySQL binlog changes in real time for data synchronization, caching, and search index updates.

Introduction

Canal is an open-source incremental data subscription and consumption platform developed by Alibaba. It parses MySQL binlog events in real time, acting as a MySQL replica, and delivers row-level change events to downstream consumers for cache invalidation, search index updates, data warehousing, and cross-database synchronization.

What Canal Does

  • Captures MySQL row-level changes (INSERT, UPDATE, DELETE) by parsing binary logs in real time
  • Simulates a MySQL slave to subscribe to binlog streams without impacting the source database
  • Delivers change events to Kafka, RocketMQ, RabbitMQ, or Elasticsearch as downstream sinks
  • Supports filtering by database, table, and column to reduce unnecessary event processing
  • Provides a client API for building custom change data capture consumers in Java
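Canal's client API is Java, but the shape of a downstream change-data consumer is easy to sketch. The following is a minimal, hypothetical Python sketch of a cache-invalidation handler reacting to Canal-style row-change events; the event dicts are illustrative stand-ins for Canal's parsed row-change objects, not its actual wire format:

```python
# Minimal sketch of a downstream CDC consumer that invalidates cache
# entries on row-level changes. The event dicts are illustrative, not
# Canal's exact message format.

def handle_event(event, cache):
    """Evict the cached row affected by a change event; return its key."""
    key = f"{event['database']}.{event['table']}:{event['row']['id']}"
    if event["type"] in ("UPDATE", "DELETE"):
        cache.pop(key, None)   # stale entry must go
    # INSERT: nothing is cached yet, so there is nothing to evict
    return key

# Simulated events, in the order a client might deliver them
events = [
    {"type": "INSERT", "database": "shop", "table": "orders", "row": {"id": 1}},
    {"type": "UPDATE", "database": "shop", "table": "orders", "row": {"id": 1}},
]

cache = {"shop.orders:1": {"id": 1, "status": "new"}}
for ev in events:
    handle_event(ev, cache)

print(cache)  # → {} : the UPDATE evicted the stale entry
```

The same skeleton works for search-index updates: replace the eviction with a reindex call keyed on the same database/table/row identity.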

Architecture Overview

Canal Server connects to MySQL as a replication slave, receiving binlog events through the MySQL replication protocol. The server parses these binary events into structured row-change objects. Canal instances are organized by destination (one per source database). A Canal Client or Canal Adapter connects to the server and consumes parsed events. The Admin component provides a web UI for managing instances and monitoring lag.
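The instance-per-destination layout above is declared in the server's canal.properties; each listed destination name maps to its own instance configuration directory (names here are illustrative):

```properties
# conf/canal.properties — each destination is one instance, one source database
canal.destinations = example,orders_db
# each name maps to conf/<name>/instance.properties holding that source's details
```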

Self-Hosting & Configuration

  • Enable binlog on MySQL with ROW format and create a replication user for Canal
  • Deploy Canal Server and configure instance.properties with MySQL connection details
  • Use Canal Admin for web-based instance management and monitoring
  • Configure Canal Adapter to sink changes directly to Elasticsearch, HBase, or RDB targets
  • Deploy Canal in cluster mode with ZooKeeper for high availability and failover
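The first two steps above can be sketched as follows; all hostnames, credentials, and the `example` instance name are placeholder values:

```ini
# my.cnf on the source MySQL — binlog must be enabled and in ROW format
[mysqld]
log-bin=mysql-bin
binlog-format=ROW
server-id=1          ; must be unique among the source and its replicas
```

```sql
-- Replication user for Canal (username and password are placeholders)
CREATE USER 'canal'@'%' IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;
```

```properties
# conf/example/instance.properties on the Canal Server (illustrative values)
canal.instance.master.address=127.0.0.1:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
# subscribe to every table in every schema
canal.instance.filter.regex=.*\\..*
```

Changing `log-bin` or `server-id` requires a MySQL restart, so plan binlog enablement before pointing Canal at a production source.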

Key Features

  • Near-zero latency change data capture with sub-second binlog parsing
  • Cluster mode with ZooKeeper-based HA for automatic failover between Canal instances
  • Built-in adapters for Elasticsearch, RDB, HBase, and Kafka without custom code
  • Position tracking and resumption to handle restarts without data loss
  • Support for MySQL, MariaDB, and PolarDB-X as source databases
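Position tracking, the fourth feature above, can be modeled simply: Canal persists the last acknowledged binlog position (to ZooKeeper in cluster mode, or local meta files) and resumes from it after a restart. A simplified sketch of that resume logic, with an in-memory dict standing in for the persisted store:

```python
# Simplified model of binlog position tracking and resumption.
# Canal persists the last *acknowledged* position; this sketch keeps it
# in a dict purely to illustrate the resume behavior.

class PositionStore:
    """Stands in for Canal's persisted meta store (illustrative)."""
    def __init__(self):
        self._saved = {}  # destination -> (binlog file, offset)

    def ack(self, destination, position):
        # Only acknowledged events advance the cursor, so an unacked
        # batch is redelivered after a crash rather than lost.
        self._saved[destination] = position

    def resume_from(self, destination, default=("mysql-bin.000001", 4)):
        return self._saved.get(destination, default)

store = PositionStore()
store.ack("example", ("mysql-bin.000007", 1543))

# After a restart, consumption picks up at the last acked position:
print(store.resume_from("example"))  # → ('mysql-bin.000007', 1543)
```

The trade-off is at-least-once delivery: events after the last acknowledged position may be redelivered, so downstream consumers should be idempotent.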

Comparison with Similar Tools

  • Debezium — Kafka Connect-based CDC; Canal is standalone and lighter for MySQL-only workloads
  • Maxwell — MySQL CDC to Kafka; Canal offers more sinks, clustering, and a management UI
  • MySQL Replication — Native replication syncs whole databases; Canal enables selective, event-driven consumption
  • AWS DMS — Managed migration service; Canal is self-hosted with no cloud vendor dependency
  • Flink CDC — Stream processing CDC; Canal focuses on capture and delivery, pairs well with Flink downstream

FAQ

Q: Does Canal modify the source MySQL database? A: No. Canal connects as a read-only replication slave. It only reads binlog events and never writes to the source.

Q: What MySQL binlog format does Canal require? A: Canal requires ROW format binlog. STATEMENT and MIXED formats do not provide the row-level detail Canal needs.
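Before pointing Canal at a source, you can confirm the binlog configuration directly on MySQL:

```sql
-- Run on the source MySQL; Canal needs ROW here and log_bin = ON
SHOW VARIABLES LIKE 'binlog_format';
SHOW VARIABLES LIKE 'log_bin';
```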

Q: Can Canal handle schema changes (DDL)? A: Yes. Canal parses DDL events and updates its internal schema cache. Consumers receive DDL events alongside DML changes.

Q: How does Canal compare to Debezium for MySQL CDC? A: Canal is lighter and standalone for MySQL-focused use cases. Debezium supports more databases and integrates deeply with Kafka Connect for broader ecosystems.
