Introduction
CloudNative-PG (CNPG) is a Kubernetes operator that manages the full lifecycle of PostgreSQL clusters. Developed by EDB and now a CNCF Sandbox project, it treats PostgreSQL as a first-class Kubernetes workload with automated failover, continuous WAL archiving, and declarative configuration.
What CloudNative-PG Does
- Deploys and manages multi-instance PostgreSQL clusters as Kubernetes custom resources
- Performs automated failover by promoting a replica within seconds when the primary fails
- Streams WAL files continuously to S3, GCS, or Azure Blob for point-in-time recovery
- Handles rolling updates of PostgreSQL minor versions with a brief, controlled switchover; major-version upgrades are supported as a separate operation (see FAQ)
- Monitors cluster health via built-in Prometheus metrics and Kubernetes events
Architecture Overview
CNPG runs a single operator pod that watches Cluster custom resources. Each PostgreSQL instance runs in its own pod with a dedicated PVC, avoiding shared storage. Streaming replication connects replicas to the primary. The operator manages the primary election using Kubernetes lease objects rather than external consensus tools. Backup and WAL archiving use Barman Cloud under the hood, writing directly to object storage without an intermediate backup server.
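The architecture above is driven entirely by a single custom resource. As a minimal sketch (the cluster name and storage size are illustrative), a Cluster definition that yields one primary and two streaming replicas, each in its own pod with a dedicated PVC, looks like this:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-example        # illustrative name
spec:
  instances: 3            # one primary + two streaming replicas, one pod each
  storage:
    size: 10Gi            # a dedicated PVC per instance; no shared storage
```

Applying this with kubectl is all that is needed; the operator creates the pods, PVCs, and replication setup, and holds the primary election via a Kubernetes lease.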
Self-Hosting & Configuration
- Install the operator via kubectl apply, Helm chart, or OLM (Operator Lifecycle Manager) on OpenShift
- Define clusters declaratively in YAML with instance count, storage size, resource limits, and PostgreSQL parameters
- Configure continuous backup by pointing the Cluster spec to an S3/GCS/Azure bucket with credentials stored in a Kubernetes Secret
- Set up connection pooling with the built-in PgBouncer integration, defined declaratively as a separate Pooler custom resource that targets the Cluster
- Use the cnpg kubectl plugin (kubectl cnpg status, kubectl cnpg promote) for operational commands
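Putting the backup configuration from the steps above into practice, a hedged sketch of a Cluster spec with continuous WAL archiving to S3 might look like the following (the bucket path and the `aws-creds` Secret with its key names are assumptions for illustration):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-example
spec:
  instances: 3
  storage:
    size: 10Gi
  backup:
    retentionPolicy: "30d"              # prune backups and WAL older than 30 days
    barmanObjectStore:
      destinationPath: s3://my-backups/pg-example   # illustrative bucket path
      s3Credentials:
        accessKeyId:                    # references a pre-created Secret
          name: aws-creds               # assumed Secret name
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: ACCESS_SECRET_KEY
```

Once applied, `kubectl cnpg status pg-example` shows the archiving status alongside replication health.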
Key Features
- Fencing and self-healing automatically restart or recreate failed instances without manual intervention
- Declarative tablespace support lets you separate data, indexes, and WAL on different storage classes
- Native support for PostgreSQL's synchronous replication modes for zero data loss configurations
- Replica clusters enable cross-region disaster recovery by streaming WAL from a primary cluster
- Volume snapshot backups leverage Kubernetes CSI snapshots for fast, storage-level backup and restore
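The synchronous replication and tablespace features listed above are also expressed declaratively in the Cluster spec. As a rough sketch (the tablespace name and the `fast-ssd` storage class are assumptions, and field names may vary between CNPG releases):

```yaml
spec:
  instances: 3
  minSyncReplicas: 1      # require at least one synchronous standby (zero data loss)
  maxSyncReplicas: 1
  tablespaces:
    - name: idx_space     # illustrative tablespace for indexes
      storage:
        size: 5Gi
        storageClass: fast-ssd   # assumed storage class name
```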
Comparison with Similar Tools
- Zalando Postgres Operator — uses Patroni for HA; CNPG handles failover natively through Kubernetes primitives without Patroni or etcd
- CrunchyData PGO — mature operator with pgBackRest; CNPG is lighter weight and uses Barman Cloud for backup
- Bitnami PostgreSQL Helm Chart — simple StatefulSet deployment; CNPG adds automated failover, backup, and day-2 operations
- Amazon RDS — fully managed but cloud-locked; CNPG runs on any Kubernetes cluster with portable backup to any S3-compatible store
- Patroni — standalone HA tool; CNPG integrates HA directly into the Kubernetes control loop
FAQ
Q: Does CNPG support PostgreSQL major version upgrades? A: Yes. Recent CNPG releases can perform offline in-place major upgrades using pg_upgrade, or you can create a new cluster and import data from the old one.
Q: How fast is failover? A: Automated failover typically completes within 5-10 seconds, depending on readiness probe configuration and replica lag.
Q: Can I use CNPG with existing PostgreSQL data? A: Yes. CNPG supports importing databases from pg_dump, external PostgreSQL servers via streaming replication, or object storage backups.
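As a hedged sketch of the import path mentioned above, a new cluster can bootstrap by logically importing a database from an external PostgreSQL server (the `legacy-pg` host, database name, and credentials Secret are illustrative assumptions):

```yaml
spec:
  instances: 3
  bootstrap:
    initdb:
      import:
        type: microservice        # import a single database into the new cluster
        databases: [app]          # assumed database name
        source:
          externalCluster: legacy-pg
  externalClusters:
    - name: legacy-pg
      connectionParameters:
        host: legacy.example.com  # assumed hostname
        user: postgres
        dbname: app
      password:
        name: legacy-creds        # assumed Secret holding the password
        key: password
```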
Q: Is CNPG a CNCF project? A: Yes. CloudNative-PG joined the CNCF Sandbox in 2024, with backing from EDB and a growing contributor community.