Configs · May 13, 2026 · 1 min read

Patroni — High Availability PostgreSQL Cluster Manager

Patroni is a Python-based template for creating and managing highly available PostgreSQL clusters with automatic failover, using distributed consensus stores like etcd, Consul, or ZooKeeper.

Introduction

Patroni automates PostgreSQL high availability by managing streaming replication, leader election, and automatic failover. It uses a distributed consensus store (etcd, Consul, or ZooKeeper) to coordinate cluster state, ensuring one primary and zero or more synchronous or asynchronous replicas.

What Patroni Does

  • Bootstraps new PostgreSQL clusters or takes over existing instances
  • Performs automatic leader election and failover via distributed consensus
  • Manages streaming replication configuration between primary and replicas
  • Provides a REST API and patronictl CLI for cluster operations and switchover
  • Supports scheduled and on-demand switchovers with no data loss

Architecture Overview

Each PostgreSQL node runs a Patroni agent that registers itself with a DCS (distributed configuration store). The DCS holds the leader lock, a key with a time-to-live: the agent that holds the lock runs its PostgreSQL instance as the primary, and every other agent configures its instance as a replica. If the leader fails to renew the lock before it expires, the healthy replica with the most advanced WAL position acquires the lock and is promoted. HAProxy or PgBouncer sits in front to route client connections to the current primary.
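The lock-based election described above can be illustrated with a toy model. This is a minimal sketch, not Patroni's actual code: `ToyDCS`, `role_after_cycle`, the node names, and the WAL numbers are all hypothetical, and a real DCS provides atomic compare-and-set operations that this in-memory stand-in only imitates.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyDCS:
    """In-memory stand-in for a DCS that stores one leader key with a TTL."""
    ttl: float = 30.0
    leader: Optional[str] = None
    expires_at: float = 0.0

    def lock_is_free(self, now: float) -> bool:
        return self.leader is None or now >= self.expires_at

    def acquire_or_renew(self, node: str, now: float) -> bool:
        """Take the lock if it is free or expired, or renew it if we hold it."""
        if self.lock_is_free(now) or self.leader == node:
            self.leader = node
            self.expires_at = now + self.ttl
            return True
        return False


def role_after_cycle(dcs: ToyDCS, node: str, wal: Dict[str, int], now: float) -> str:
    """One pass of a node's HA loop; returns 'primary' or 'replica'."""
    if dcs.leader == node and not dcs.lock_is_free(now):
        dcs.acquire_or_renew(node, now)  # a healthy leader keeps renewing
        return "primary"
    # Only the most advanced healthy member competes for a free lock.
    most_advanced = wal[node] == max(wal.values())
    if dcs.lock_is_free(now) and most_advanced and dcs.acquire_or_renew(node, now):
        return "primary"
    return "replica"


dcs = ToyDCS(ttl=30.0)
wal = {"pg1": 100, "pg2": 100, "pg3": 90}
roles = {n: role_after_cycle(dcs, n, wal, now=0.0) for n in ("pg1", "pg2", "pg3")}

# pg1 crashes and stops renewing; the lock expires at t=30.
survivors = {"pg2": 120, "pg3": 110}
roles_after = {n: role_after_cycle(dcs, n, survivors, now=31.0) for n in ("pg3", "pg2")}
print(roles, roles_after, dcs.leader)
```

In this sketch pg3 runs its loop first after the lock expires but stays a replica because pg2 has replayed more WAL, so pg2 takes the lock.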

Self-Hosting & Configuration

  • Requires a running DCS (etcd, Consul, or ZooKeeper) for consensus
  • Configure patroni.yml with PostgreSQL data directory, replication settings, and DCS endpoints
  • Deploy one Patroni agent per PostgreSQL node, managed by systemd
  • Place HAProxy or a connection pooler in front for transparent client routing
  • Tune ttl, loop_wait, and retry_timeout to balance failover speed against stability
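When tuning the timing settings, it helps to check that they are mutually consistent. The sketch below assumes the rule given in the Patroni documentation, loop_wait + 2 * retry_timeout <= ttl, and the documented defaults of 30, 10, and 10 seconds. The helper name `timing_is_consistent` and the "aggressive" values are hypothetical.

```python
from typing import Dict


def timing_is_consistent(settings: Dict[str, int]) -> bool:
    """Return True when loop_wait + 2 * retry_timeout fits within ttl."""
    return settings["loop_wait"] + 2 * settings["retry_timeout"] <= settings["ttl"]


defaults = {"ttl": 30, "loop_wait": 10, "retry_timeout": 10}
aggressive = {"ttl": 15, "loop_wait": 5, "retry_timeout": 10}  # hypothetical values

print(timing_is_consistent(defaults))    # True: 10 + 2 * 10 <= 30
print(timing_is_consistent(aggressive))  # False: 5 + 2 * 10 > 15
```

A smaller ttl shortens the time before a failed leader's lock expires, but it leaves less room for transient DCS or network hiccups before a healthy primary loses the lock.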

Key Features

  • Automatic failover with configurable data loss tolerance (synchronous mode available)
  • REST API for health checks, switchover, and reinitializing failed nodes
  • Supports custom bootstrap and replica-creation methods, including pg_basebackup, WAL-E, and pgBackRest
  • Watchdog integration for split-brain prevention
  • Used in production by Zalando and many other organizations

Comparison with Similar Tools

  • Stolon — Go-based PostgreSQL HA manager; similar architecture but less actively maintained
  • repmgr — replication manager with manual or automatic failover; Patroni offers tighter DCS integration
  • pg_auto_failover (Citus) — built-in monitor node for HA; simpler setup but less flexible than Patroni
  • CloudNativePG — Kubernetes-native PostgreSQL operator; Patroni is infrastructure-agnostic

FAQ

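Q: What happens if the DCS goes down? A: No failover can occur until the DCS recovers. By default, a primary that cannot renew its leader lock demotes itself to read-only to avoid split-brain. If DCS failsafe mode is enabled (failsafe_mode: true), the current primary keeps serving writes as long as it can still reach every other cluster member through the REST API.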

Q: Can I use Patroni with existing PostgreSQL instances? A: Yes. Patroni can adopt running PostgreSQL instances without reinitializing them.

Q: How fast is automatic failover? A: Typically 10-30 seconds, depending on TTL and loop_wait configuration.

Q: Does Patroni handle connection routing? A: Patroni exposes health endpoints. Pair it with HAProxy, PgBouncer, or a service mesh for automatic connection routing.
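The health endpoints are what a load balancer polls to find the primary. As a rough sketch of the same idea in Python, the code below assumes Patroni's REST API on its default port 8008, where GET /primary answers 200 only on the node holding the leader lock. The function names and host names are hypothetical, and the demo uses a fake probe so it runs without a cluster.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(host: str, port: int = 8008, timeout: float = 2.0) -> int:
    """Return the HTTP status of GET /primary on a node, or 0 if unreachable."""
    url = f"http://{host}:{port}/primary"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx answer, e.g. 503 from a replica
    except OSError:
        return 0  # connection refused, timeout, DNS failure


def find_primary(hosts: Iterable[str],
                 probe: Callable[[str], int] = http_probe) -> Optional[str]:
    """Return the first host whose /primary check answers 200, else None."""
    for host in hosts:
        if probe(host) == 200:
            return host
    return None


# Offline demo with a fake probe; replace it with http_probe against real nodes.
fake_status = {"pg1": 503, "pg2": 200, "pg3": 0}
primary = find_primary(["pg1", "pg2", "pg3"], probe=lambda h: fake_status[h])
print(primary)
```

HAProxy applies the same check per backend with an HTTP health check, which is why no client-side logic is needed once a proxy is in place.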
