Introduction
Flagger turns Kubernetes Deployments into safe, automated rollouts. Instead of "kubectl set image and pray", Flagger runs the new version as a canary alongside the stable primary, shifts a small slice of traffic to it through the service mesh or ingress controller, queries Prometheus (or Datadog, New Relic, CloudWatch) on a schedule, and promotes only if success rate and latency meet their thresholds. If the analysis fails, it rolls back. It's part of the Flux family of projects, which graduated within the CNCF.
What Flagger Does
- Runs canary analysis with configurable steps, intervals, and metric thresholds.
- Supports traffic shifting via Istio, Linkerd, App Mesh, Gloo, Contour, NGINX, Traefik, Kuma, and OSM.
- Queries Prometheus, Datadog, New Relic, Dynatrace, CloudWatch, or Graphite for promotion signals.
- Runs webhook-driven load generation, acceptance tests, and post-promotion hooks.
- Handles Deployment, DaemonSet, StatefulSet, and custom resources with primaryScalerReplicas.
Architecture Overview
Flagger is a Kubernetes operator that watches Canary custom resources. When the pod spec of the referenced target changes (for example, a new container image), it runs the new revision as the canary next to the current primary and, depending on strategy, creates the mesh resources (VirtualServices, TrafficSplits, etc.) needed to split traffic between the two. A reconcile loop queries the metric providers at each interval, steps the canary weight up on success, and either promotes (copies the canary spec into the primary and scales the canary to zero) or rolls back. Deployment strategies include canary, A/B, blue/green, and traffic mirroring.
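The weight-stepping loop described above can be sketched as a small simulation. This is a simplified model, not Flagger's actual controller code: the Analysis class, the run_canary function, and the check callback are hypothetical names, and the real controller waits one interval between checks instead of looping immediately.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Analysis:
    step_weight: int = 10  # percent of traffic added after each passing check
    max_weight: int = 50   # promote once the canary holds this much traffic
    threshold: int = 3     # failed checks tolerated before rolling back


def run_canary(analysis: Analysis,
               check: Callable[[int], bool]) -> Tuple[str, List[int]]:
    """Simulate one rollout; return the outcome and every weight attempted."""
    weight, failures, history = 0, 0, []
    while weight < analysis.max_weight:
        candidate = min(weight + analysis.step_weight, analysis.max_weight)
        history.append(candidate)
        if check(candidate):
            # Metrics were healthy at this weight, so keep the new split.
            weight = candidate
        else:
            # Hold the previous weight and count the failed check.
            failures += 1
            if failures >= analysis.threshold:
                return "rolled-back", history
    return "promoted", history
```

Passing a check that always succeeds walks the weight from 10 to 50 and promotes; a check that starts failing at 30 percent retries that step until the threshold is reached and then rolls back.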
Self-Hosting & Configuration
- Install via the official Helm chart; one Flagger per mesh provider, not per namespace.
- Point metricsServer at your Prometheus, or configure a Datadog / New Relic provider with secrets.
- Reference each target Deployment from a Canary resource (spec.targetRef) to opt it into Flagger management; others are untouched.
- Use webhooks to call k6, Locust, or smoke tests before each weight step.
- Keep alertmanager wired in; Flagger sends Slack, Microsoft Teams, or webhook alerts on canary events.
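To make the configuration points above concrete, here is a minimal sketch that builds a Canary resource as a Python dict and prints it as JSON (which kubectl apply -f accepts). The field names follow the flagger.app/v1beta1 Canary API as commonly documented, but should be checked against the CRD installed in your cluster. The canary_manifest helper, the podinfo name, the test namespace, the port, and the load-tester URL are all illustrative assumptions.

```python
import json


def canary_manifest(name: str, namespace: str, port: int,
                    loadtester_url: str) -> dict:
    """Build a Canary object for a Deployment called `name` (hypothetical helper)."""
    return {
        "apiVersion": "flagger.app/v1beta1",
        "kind": "Canary",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The workload Flagger should manage.
            "targetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
            "service": {"port": port},
            "analysis": {
                "interval": "1m",   # how often checks run
                "threshold": 5,     # failed checks before rollback
                "maxWeight": 50,    # canary traffic cap, in percent
                "stepWeight": 10,   # traffic added per successful step
                "metrics": [
                    {"name": "request-success-rate",
                     "thresholdRange": {"min": 99}, "interval": "1m"},
                    {"name": "request-duration",
                     "thresholdRange": {"max": 500}, "interval": "1m"},
                ],
                "webhooks": [
                    # Called during analysis, e.g. to generate load.
                    {"name": "load-test", "type": "rollout",
                     "url": loadtester_url, "timeout": "30s"},
                ],
            },
        },
    }


if __name__ == "__main__":
    manifest = canary_manifest("podinfo", "test", 9898,
                               "http://flagger-loadtester.test/")
    print(json.dumps(manifest, indent=2))
```

Generating the manifest from code is only one option; the same structure written directly as YAML and committed to Git fits a Flux or Argo CD workflow equally well.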
Key Features
- Multi-mesh: one control plane that speaks to Istio, Linkerd, App Mesh, and more.
- Progressive traffic shifting or session-affinity A/B testing via HTTP headers/cookies.
- MetricTemplates as first-class CRDs for reusable PromQL/NRQL/Datadog queries.
- Prometheus Operator integration for shipping standard dashboards.
- GitOps-friendly — canaries are declarative YAML managed by Flux or Argo CD.
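As an illustration of the reusable MetricTemplates mentioned in the list above, the sketch below builds a template and the metric entry that a Canary's analysis would use to reference it. It assumes the flagger.app/v1beta1 MetricTemplate layout (spec.provider, spec.query, and templateRef on the Canary side) and query variables such as {{ namespace }} and {{ interval }}. The helper names, the http_requests_total metric, and the Prometheus address are hypothetical.

```python
def metric_template(name: str, namespace: str,
                    prometheus_address: str, query: str) -> dict:
    """Build a MetricTemplate object (hypothetical helper)."""
    return {
        "apiVersion": "flagger.app/v1beta1",
        "kind": "MetricTemplate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "provider": {"type": "prometheus", "address": prometheus_address},
            "query": query,
        },
    }


def metric_reference(template: dict, max_value: float,
                     interval: str = "1m") -> dict:
    """Build the entry that goes under spec.analysis.metrics in a Canary."""
    meta = template["metadata"]
    return {
        "name": meta["name"],
        "templateRef": {"name": meta["name"], "namespace": meta["namespace"]},
        "thresholdRange": {"max": max_value},
        "interval": interval,
    }


# Percentage of requests that returned a 5xx status; the metric name is assumed.
ERROR_RATE_QUERY = (
    'sum(rate(http_requests_total{namespace="{{ namespace }}",status=~"5.*"}'
    "[{{ interval }}])) / "
    'sum(rate(http_requests_total{namespace="{{ namespace }}"}'
    "[{{ interval }}])) * 100"
)

template = metric_template("error-rate", "flagger-system",
                           "http://prometheus.monitoring:9090", ERROR_RATE_QUERY)
reference = metric_reference(template, max_value=1)
```

Because the query lives in its own resource, several Canary objects can share it and vary only the threshold.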
Comparison with Similar Tools
- Argo Rollouts: can run without a service mesh by shifting traffic through Deployment replica counts, and also supports traffic providers; similar feature set.
- Spinnaker + Kayenta: heavier multi-cloud CD; Flagger is lighter and Kubernetes-only.
- Istio Gateway API manual rollouts: DIY; Flagger automates the traffic shifting and analysis.
- Helm + feature flags: app-level toggles; they don't provide traffic-shifting canaries.
- Managed Delivery (Keel): Spinnaker-centric; Flagger fits GitOps stacks already using Flux or Argo.
FAQ
Q: Do I need a service mesh?
A: Strongly recommended for canary weight shifting. Without a mesh, use the NGINX or Contour providers.
Q: What if my metric provider has a blip?
A: Flagger tolerates failed checks up to the threshold configured in the canary analysis before it rolls back, which avoids flapping on a transient blip.
Q: Can I do blue/green with manual gate?
A: Yes. Use a confirm-promotion webhook to pause the rollout until an operator approves promotion.
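A manual gate of this kind boils down to an HTTP endpoint that Flagger calls before promoting. The sketch below assumes Flagger's confirm-promotion webhook type, which holds promotion until the hook returns HTTP 200. The /approve path, the port, and the in-memory flag are hypothetical choices for illustration; a real gate would need authentication and persistent state.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def gate_status(approved: bool) -> int:
    """Status sent back to Flagger: 200 lets promotion proceed, 403 holds it."""
    return 200 if approved else 403


class GateHandler(BaseHTTPRequestHandler):
    approved = False  # shared flag, flipped when an operator approves

    def do_POST(self):
        # Read and discard the request body (Flagger posts a JSON payload).
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/approve":   # hypothetical operator endpoint
            GateHandler.approved = True
            status = 200
        else:                         # any other path is treated as the gate check
            status = gate_status(GateHandler.approved)
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the gate until interrupted."""
    HTTPServer(("", port), GateHandler).serve_forever()
```

With serve() running, the Canary's webhook URL would point at this service, and an operator would approve with a POST to /approve.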
Q: Does it work with Argo CD?
A: Yes. Flagger manages the canary and Argo CD manages the desired state, so there is no direct conflict.