Introduction
KEDA (Kubernetes Event-driven Autoscaling) is a CNCF graduated project that extends the Kubernetes Horizontal Pod Autoscaler (HPA) with 70+ external event sources, including queues, streams, databases, and cloud services, and adds the ability to scale workloads all the way down to zero when there is no work. It is additive: HPA, VPA, and cluster autoscalers continue to do their jobs.
What KEDA Does
- Scales any Deployment, StatefulSet, Job, or custom resource based on external metrics.
- Supports 70+ scalers: Kafka, RabbitMQ, SQS, Azure Service Bus, Prometheus, Postgres, Redis, Pulsar, NATS, GCP Pub/Sub, Kubernetes workload, CPU/memory, cron, and more.
- Scales to and from zero — pods are fully terminated until work appears.
- Exposes those metrics through the Kubernetes External Metrics API, so the HPA that KEDA manages for each target can consume them.
- Runs ScaledJobs for queue-consuming batch work where each message becomes a Job.
Architecture Overview
KEDA has two core components: the Operator and the Metrics Server. The Operator watches ScaledObject and ScaledJob custom resources and manages an HPA for each ScaledObject target. The Metrics Server implements the Kubernetes external metrics API: the HPA reads metrics from KEDA, which in turn polls the scalers (Kafka lag, Prometheus queries, etc.). When every trigger reports no activity and cooldownPeriod has elapsed, KEDA sets replicas to 0; when a poll shows activity again, KEDA scales the workload to one replica (or to minReplicaCount, if that is higher) and lets the HPA take over from there.
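The split between KEDA's zero/non-zero decision and the HPA's proportional scaling can be sketched as a small model. This is a simplification under stated assumptions: the ScalingConfig class and desired_replicas function are hypothetical names, "active" is reduced to "metric above zero", and HPA tolerance and stabilization windows are ignored.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingConfig:
    # Names loosely mirror ScaledObject settings; the values are illustrative.
    min_replicas: int = 0
    max_replicas: int = 10
    cooldown_seconds: int = 300
    target_per_replica: float = 20.0  # e.g. queued messages per pod


def desired_replicas(cfg, current, metric, seconds_inactive):
    """Replica count this simplified model would choose."""
    active = metric > 0

    # KEDA's part: waking a workload that is currently at zero.
    if current == 0:
        return max(1, cfg.min_replicas) if active else 0

    # KEDA's part: going to zero once the cooldown has fully elapsed.
    if not active and cfg.min_replicas == 0 and seconds_inactive >= cfg.cooldown_seconds:
        return 0

    # HPA's part: proportional scaling between 1..max on an average-value target.
    wanted = math.ceil(metric / cfg.target_per_replica)
    floor = max(1, cfg.min_replicas)
    return min(cfg.max_replicas, max(floor, wanted))


if __name__ == "__main__":
    cfg = ScalingConfig()
    for current, metric, idle in [(0, 5, 0), (1, 95, 0), (5, 0, 100), (5, 0, 300)]:
        print(current, metric, idle, "->", desired_replicas(cfg, current, metric, idle))
```

In this model the HPA never goes below one replica on its own; only the cooldown branch reaches zero, which is the behavior the paragraph above attributes to KEDA rather than to the HPA.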
Self-Hosting & Configuration
- Official Helm chart covers CRDs, RBAC, and the two pods.
- Use TriggerAuthentication and ClusterTriggerAuthentication for secrets; never inline them.
- pollingInterval, cooldownPeriod, and idleReplicaCount are the three knobs that matter most.
- For private Kafka/RabbitMQ, pair with secretTargetRef so credentials stay in cluster secrets.
- KEDA HTTP Add-On (separate install) lets you scale HTTP services to zero with a traffic interceptor.
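To make the authentication and tuning settings above concrete, here is a minimal sketch that builds a TriggerAuthentication and a ScaledObject as plain Python dictionaries and prints them as JSON, which kubectl accepts. The resource names, the Secret name and key, the queue name, and all numeric values are assumptions for illustration; check the field names against the scaler documentation for your KEDA version.

```python
import json


def trigger_authentication(name, secret_name, secret_key):
    """TriggerAuthentication that pulls the broker URL from a Kubernetes Secret."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "TriggerAuthentication",
        "metadata": {"name": name},
        "spec": {
            # The credential stays in the Secret; nothing is inlined here.
            "secretTargetRef": [
                {"parameter": "host", "name": secret_name, "key": secret_key}
            ]
        },
    }


def scaled_object(name, deployment, queue, auth_name):
    """ScaledObject for a RabbitMQ consumer Deployment (illustrative values)."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"name": deployment},
            "pollingInterval": 15,   # seconds between scaler polls
            "cooldownPeriod": 120,   # seconds of inactivity before going idle
            "idleReplicaCount": 0,   # replicas while idle; must be below minReplicaCount
            "minReplicaCount": 2,    # replicas as soon as there is work
            "maxReplicaCount": 20,
            "triggers": [
                {
                    "type": "rabbitmq",
                    "metadata": {"queueName": queue, "mode": "QueueLength", "value": "20"},
                    "authenticationRef": {"name": auth_name},
                }
            ],
        },
    }


if __name__ == "__main__":
    items = [
        trigger_authentication("rabbit-auth", "rabbit-secret", "amqp-url"),
        scaled_object("orders-scaler", "orders-worker", "orders", "rabbit-auth"),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

Saved under a hypothetical name such as manifests.py, the output can be piped straight into the cluster with `python manifests.py | kubectl apply -f -`.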
Key Features
- Scale-to-zero for deployment workloads, not just Jobs.
- 70+ first-class scalers plus a generic Prometheus/Metrics API scaler.
- CNCF graduated — production proven at Microsoft, Alibaba, Shopify, and others.
- Non-invasive: does not replace HPA, it produces metrics for it.
- ScaledJobs enable one-pod-per-message batch processing.
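How a queue length turns into a number of Jobs can be modeled in a few lines. The sketch below follows the default ScaledJob scaling strategy as described in the KEDA documentation (target count capped at the maximum, minus Jobs already running); the function and parameter names are hypothetical, and the real controller has additional strategies and edge cases that are not shown.

```python
import math


def jobs_to_create(queue_length, running_jobs, max_replicas=100, target_per_job=1):
    """Jobs a simplified 'default strategy' model would start on one poll."""
    # With target_per_job=1, every queued message maps to one Job.
    max_scale = min(max_replicas, math.ceil(queue_length / target_per_job))
    # Jobs that are already running count against the target.
    return max(0, max_scale - running_jobs)


if __name__ == "__main__":
    print(jobs_to_create(queue_length=5, running_jobs=2))
    print(jobs_to_create(queue_length=500, running_jobs=0))
```

Subtracting the running Jobs only makes sense when their messages are still counted in the queue length; if the broker hides locked or unacknowledged messages from the metric, a different strategy is likely a better fit.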
Comparison with Similar Tools
- Kubernetes HPA: scales on CPU/memory out of the box and needs a metrics adapter for anything else; KEDA supplies that adapter.
- Knative Serving: scales HTTP workloads to zero, but is a full serverless stack.
- OpenFaaS: function platform with built-in scaling; KEDA is workload-agnostic.
- Karpenter: node autoscaler, complementary to KEDA (which scales pods).
- Prometheus Adapter: exposes Prometheus metrics to HPA; KEDA covers that plus many other sources.
FAQ
Q: Does KEDA replace HPA? A: No. KEDA creates the HPA for you and feeds it external metrics.
Q: Can it scale StatefulSets? A: Yes, and custom resources that implement the scale subresource.
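Q: How does scale-to-zero handle in-flight work? A: KEDA does not track individual in-flight items. It waits until every trigger has been inactive for cooldownPeriod before scaling to zero, and pods are then terminated through the normal Kubernetes shutdown path, so consumers should handle SIGTERM gracefully; long-running work is usually a better fit for ScaledJobs.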
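Because scale-down ends pods through the normal Kubernetes termination sequence (SIGTERM, then SIGKILL once the grace period expires), a consumer can protect in-flight work by finishing its current message before exiting. A minimal sketch of that pattern follows; the GracefulWorker class and its method names are hypothetical and not part of KEDA.

```python
import signal


class GracefulWorker:
    """Processes messages one at a time and stops between messages on request."""

    def __init__(self, handler):
        self.handler = handler
        self.stopping = False

    def request_stop(self, signum=None, frame=None):
        # Only set a flag; the message being handled is allowed to finish.
        self.stopping = True

    def install_signal_handler(self):
        # Call once from the main thread at process start-up.
        signal.signal(signal.SIGTERM, self.request_stop)

    def run(self, messages):
        """Handle messages until they run out or a stop is requested."""
        done = 0
        for message in messages:
            if self.stopping:
                break
            self.handler(message)
            done += 1
        return done


if __name__ == "__main__":
    worker = GracefulWorker(lambda m: print("processed", m))
    worker.install_signal_handler()
    worker.run(["a", "b", "c"])
```

The pod's terminationGracePeriodSeconds should be at least as long as the slowest message takes to process, otherwise the kubelet will kill the container before the loop exits.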
Q: Is there a way to scale HTTP services to zero? A: Yes, the KEDA HTTP Add-On adds an interceptor that wakes pods on request.