# Zero-Downtime DB Migration — Expand Contract Checklist > Expand-contract database migration checklist for agents. Covers additive schema changes, batched backfills, rollback, and contract gates. ## Install Copy the content below into your project: --- title: Zero-Downtime DB Migration — Expand Contract Checklist asset_kind: knowledge target_tools: [codex, claude_code, cursor, gemini_cli] install_mode: single entrypoint: README.md --- # Zero-Downtime DB Migration — Expand Contract Checklist Use this checklist when an agent needs to change a production database without taking the app down. The safe pattern is expand, backfill, dual-read or dual-write when needed, switch reads, then contract after the old path has no traffic. ## Quick Use Never start with a destructive migration. Start with an additive change: ```sql ALTER TABLE tb_example ADD COLUMN new_status VARCHAR(32) NULL; ``` Deploy application code that can tolerate both old and new shapes: ```text release 1: write old column, read old column, tolerate new column backfill: copy old column into new column in small batches release 2: write both columns, read new column with old fallback release 3: read new column only after verification contract: drop old column in a later maintenance window ``` ## Agent Checklist Before proposing or running a migration, verify: 1. The first schema change is additive: new nullable column, new table, new index, or new enum shadow table. 2. Old code can still run after the new schema exists. 3. New code can run before the backfill is complete. 4. Backfill runs in bounded batches and can resume safely. 5. There is a rollback path that does not require restoring the whole database. 6. Monitoring includes error rate, lock wait, replication lag, row count, and backfill progress. 7. The contract step is delayed until logs show no reads or writes to the old path. ## Batch Backfill Shape Use small batches with an id cursor: ```sql UPDATE tb_example SET new_status = old_status WHERE id > ? AND id <= ? AND new_status IS NULL; ``` For MySQL, keep transactions short and watch lock waits. For PostgreSQL, avoid long table rewrites and validate constraints separately when possible. For both, make the job idempotent. ## Red Flags - `DROP COLUMN`, `RENAME COLUMN`, or non-null constraint in the first release. - A migration that assumes every app server restarts at the same second. - Backfill that scans the whole table in one transaction. - Rollback plan that says "restore backup" for an ordinary deploy. - No verification query for migrated row counts. ## Source & Thanks This is an original TokRepo checklist by William Wang. It follows the widely used expand-contract migration pattern documented by [Martin Fowler](https://martinfowler.com/bliki/ParallelChange.html) and operational guidance from database migration tools such as [gh-ost](https://github.com/github/gh-ost). # 零停机数据库迁移:Expand Contract 清单 Agent 要改生产数据库时,不要从破坏性 SQL 开始。安全路径是:扩展 schema、批量回填、必要时双写/双读、切读新字段,最后等旧路径无流量后再收缩。 ## 快速使用 第一步只做 additive change,例如新增 nullable 字段、新表、新索引。不要第一步就 `DROP COLUMN`、`RENAME COLUMN` 或加非空约束。 ## 判定清单 1. 旧代码在新 schema 下还能运行。 2. 新代码在回填未完成时也能运行。 3. 回填可分批、可重试、可恢复。 4. rollback 不依赖整库恢复。 5. 监控包含错误率、锁等待、复制延迟、行数和回填进度。 6. 删除旧字段必须推迟到日志证明旧读写路径已经没有流量。 --- Source: https://tokrepo.com/en/workflows/zero-downtime-db-migration-expand-contract-checklist-84ab1d5e Author: henuwangkai