Scripts · Apr 21, 2026 · 3 min read

Alibaba Canal — MySQL Binlog Incremental Subscription and Consumption

A guide to Canal, the open-source platform that captures MySQL binlog changes in real time for data synchronization, caching, and search index updates.

Introduction

Canal is an open-source incremental data subscription and consumption platform developed by Alibaba. It parses MySQL binlog events in real time, acting as a MySQL replica, and delivers row-level change events to downstream consumers for cache invalidation, search index updates, data warehousing, and cross-database synchronization.

What Canal Does

  • Captures MySQL row-level changes (INSERT, UPDATE, DELETE) by parsing binary logs in real time
  • Simulates a MySQL slave to subscribe to binlog streams without impacting the source database
  • Delivers change events to Kafka, RocketMQ, RabbitMQ, or Elasticsearch as downstream sinks
  • Supports filtering by database, table, and column to reduce unnecessary event processing
  • Provides a client API for building custom change data capture consumers in Java
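The client API mentioned above can be used roughly as follows. This is a minimal sketch using the `canal.client` Java library; the server address, the destination name `example`, and the batch size are assumptions for illustration, and it requires a running Canal Server plus the `com.alibaba.otter:canal.client` dependency.

```java
import java.net.InetSocketAddress;

import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.protocol.CanalEntry;
import com.alibaba.otter.canal.protocol.Message;

public class CanalClientSketch {
    public static void main(String[] args) throws Exception {
        // Connect to a Canal Server instance (destination "example" is the default).
        CanalConnector connector = CanalConnectors.newSingleConnector(
                new InetSocketAddress("127.0.0.1", 11111), "example", "", "");
        try {
            connector.connect();
            connector.subscribe(".*\\..*"); // all schemas and tables
            while (true) {
                Message message = connector.getWithoutAck(100); // fetch without auto-ack
                long batchId = message.getId();
                if (batchId == -1 || message.getEntries().isEmpty()) {
                    Thread.sleep(1000); // no new events yet
                    continue;
                }
                for (CanalEntry.Entry entry : message.getEntries()) {
                    if (entry.getEntryType() == CanalEntry.EntryType.ROWDATA) {
                        CanalEntry.RowChange rowChange =
                                CanalEntry.RowChange.parseFrom(entry.getStoreValue());
                        System.out.printf("%s.%s %s%n",
                                entry.getHeader().getSchemaName(),
                                entry.getHeader().getTableName(),
                                rowChange.getEventType()); // INSERT / UPDATE / DELETE
                    }
                }
                connector.ack(batchId); // confirm so the consume position advances
            }
        } finally {
            connector.disconnect();
        }
    }
}
```

The `getWithoutAck`/`ack` pair is what gives at-least-once delivery: if the consumer crashes before acking, the batch is redelivered on reconnect.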

Architecture Overview

Canal Server connects to MySQL as a replication slave, receiving binlog events through the MySQL replication protocol. The server parses these binary events into structured row-change objects. Canal instances are organized by destination (one per source database). A Canal Client or Canal Adapter connects to the server and consumes parsed events. The Admin component provides a web UI for managing instances and monitoring lag.

Self-Hosting & Configuration

  • Enable binlog on MySQL with ROW format and create a replication user for Canal
  • Deploy Canal Server and configure instance.properties with MySQL connection details
  • Use Canal Admin for web-based instance management and monitoring
  • Configure Canal Adapter to sink changes directly to Elasticsearch, HBase, or RDB targets
  • Deploy Canal in cluster mode with ZooKeeper for high availability and failover
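As a sketch of the second step, an `instance.properties` for a single instance might look like the following. The address, credentials, and filter regex are placeholders to adapt to your environment.

```properties
# conf/example/instance.properties — one instance per source database
canal.instance.master.address=127.0.0.1:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset=UTF-8

# Subscribe only to the tables you need (regex on schema.table)
canal.instance.filter.regex=app\\..*

# Optional: resume from a fixed binlog position instead of the latest
# canal.instance.master.journal.name=mysql-bin.000001
# canal.instance.master.position=4
```

Leaving the journal name and position commented out lets Canal start from the current binlog position and then track its own consume offset across restarts.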

Key Features

  • Near-zero latency change data capture with sub-second binlog parsing
  • Cluster mode with ZooKeeper-based HA for automatic failover between Canal instances
  • Built-in adapters for Elasticsearch, RDB, HBase, and Kafka without custom code
  • Position tracking and resumption to handle restarts without data loss
  • Support for MySQL, MariaDB, and PolarDB-X as source databases

Comparison with Similar Tools

  • Debezium — Kafka Connect-based CDC; Canal is standalone and lighter for MySQL-only workloads
  • Maxwell — MySQL CDC to Kafka; Canal offers more sinks, clustering, and a management UI
  • MySQL Replication — Native replication syncs whole databases; Canal enables selective, event-driven consumption
  • AWS DMS — Managed migration service; Canal is self-hosted with no cloud vendor dependency
  • Flink CDC — Stream processing CDC; Canal focuses on capture and delivery, pairs well with Flink downstream

FAQ

Q: Does Canal modify the source MySQL database? A: No. Canal connects as a read-only replication slave. It only reads binlog events and never writes to the source.
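The read-only guarantee follows from the privileges the replication user needs. A grant along these lines (the username and password are placeholders) gives Canal binlog access but no write permissions:

```sql
-- SELECT is used for reading table schemas; REPLICATION SLAVE and
-- REPLICATION CLIENT authorize the binlog dump protocol. No INSERT,
-- UPDATE, or DELETE privileges are granted.
CREATE USER 'canal'@'%' IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;
```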

Q: What MySQL binlog format does Canal require? A: Canal requires ROW format binlog. STATEMENT and MIXED formats do not provide the row-level detail Canal needs.
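You can verify the prerequisite directly on the source database; the `my.cnf` values below are the usual minimal settings, with `server-id` chosen arbitrarily:

```sql
-- Both must be satisfied before Canal can subscribe:
SHOW VARIABLES LIKE 'log_bin';        -- must be ON
SHOW VARIABLES LIKE 'binlog_format';  -- must be ROW

-- If not, set in my.cnf and restart MySQL:
--   [mysqld]
--   log-bin=mysql-bin
--   binlog-format=ROW
--   server-id=1
```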

Q: Can Canal handle schema changes (DDL)? A: Yes. Canal parses DDL events and updates its internal schema cache. Consumers receive DDL events alongside DML changes.

Q: How does Canal compare to Debezium for MySQL CDC? A: Canal is lighter and standalone for MySQL-focused use cases. Debezium supports more databases and integrates deeply with Kafka Connect for broader ecosystems.
