KEDA — Kubernetes Event-Driven Autoscaling
CNCF-graduated autoscaler that scales Kubernetes workloads to and from zero using 70+ event sources like Kafka, SQS, Prometheus and Redis.
Ready-to-run agent install
This asset can be installed after the agent chooses its runtime, checks the plan, and runs the matching command.
npx -y tokrepo@latest install 298b2541-38d7-11f1-9bc6-00163e2b0d79 --target codex
Run after dry-run confirms the install plan.
What it is
KEDA (Kubernetes Event-Driven Autoscaling) is a CNCF-graduated project that extends the native Kubernetes Horizontal Pod Autoscaler (HPA) with event-driven triggers. Instead of scaling only on CPU or memory, KEDA lets you scale workloads based on queue depth, stream lag, cron schedules, database row counts, or metrics from 70+ sources.
KEDA is built for platform engineers, DevOps teams, and backend developers who run event-driven microservices on Kubernetes. If your workloads are bursty or idle most of the time, KEDA can scale them to zero when there is no work and spin them up within seconds when events arrive.
How it saves time or tokens
KEDA eliminates the need to write custom autoscaling controllers. Without KEDA, scaling a Kafka consumer based on consumer group lag requires a custom metrics adapter, a Prometheus query, and an HPA configuration. KEDA wraps all of that into a single ScaledObject YAML manifest. The result is fewer moving parts, less custom code to maintain, and faster iteration on scaling policies.
For cost optimization, KEDA's scale-to-zero capability means idle workloads consume no compute resources. In a multi-tenant cluster running dozens of event consumers, this can reduce node count significantly during off-peak hours.
How to use
- Install KEDA via Helm into your cluster:
helm repo add kedacore https://kedacore.github.io/charts
helm upgrade --install keda kedacore/keda -n keda --create-namespace
- Define a ScaledObject that targets your Deployment and specifies triggers:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaler
spec:
  scaleTargetRef:
    name: consumer
  minReplicaCount: 0
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-group
        topic: orders
        lagThreshold: '10'
- Apply the manifest and KEDA handles the rest -- it creates an HPA under the hood, polls the trigger source, and adjusts replica count accordingly.
Example
Scale an AWS SQS consumer to zero when the queue is empty:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-scaler
spec:
  scaleTargetRef:
    name: sqs-worker
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456/orders
        queueLength: '5'
        awsRegion: us-east-1  # region taken from the queue URL; the scaler requires it
      authenticationRef:
        name: aws-credentials
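When the queue holds more than 5 messages per pod (the queueLength target), KEDA adds replicas, up to maxReplicaCount. When the queue drains, the workload scales to zero once the cooldown period (300 seconds by default) has passed.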
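The authenticationRef in the example points at a TriggerAuthentication named aws-credentials, which the example itself does not show. A minimal sketch of what it might look like, assuming static credentials stored in a Kubernetes Secret; the Secret name aws-secrets and its key names are hypothetical, and pod identity is an alternative that avoids static keys:

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-credentials              # must match authenticationRef.name in the ScaledObject
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID      # scaler parameter to populate
      name: aws-secrets              # hypothetical Secret name
      key: AWS_ACCESS_KEY_ID         # hypothetical key inside that Secret
    - parameter: awsSecretAccessKey
      name: aws-secrets
      key: AWS_SECRET_ACCESS_KEY
```

A TriggerAuthentication is namespaced, so it has to live in the same namespace as the ScaledObject that references it.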
Related on TokRepo
- DevOps AI tools -- more tools for infrastructure automation and CI/CD
- Automation tools -- event-driven and workflow automation resources
Common pitfalls
- Setting minReplicaCount: 0 means cold starts. If your workload cannot tolerate startup latency, set the minimum to 1.
- KEDA polls trigger sources at a configurable interval (default 30s). To react faster, lower pollingInterval (specified in whole seconds) and accept higher API call volume against the trigger source.
- Mixing KEDA ScaledObjects with manually created HPAs on the same Deployment causes conflicts. Let KEDA own the HPA entirely.
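The first two pitfalls map to fields on the ScaledObject spec. A sketch that reuses the Kafka example from the "How to use" section; the values 5 and 1 are illustrative assumptions, not recommendations:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaler
spec:
  scaleTargetRef:
    name: consumer
  pollingInterval: 5     # seconds between trigger checks (default 30)
  minReplicaCount: 1     # keep one pod warm to avoid cold starts
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-group
        topic: orders
        lagThreshold: '10'
```

With minReplicaCount set to 1 the workload never scales to zero, so the saving from idle pods is traded for lower startup latency.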
Frequently Asked Questions
Which Kubernetes versions does KEDA support?
KEDA supports Kubernetes 1.24 and above. Each KEDA release documents a compatibility matrix in its GitHub repository. Older clusters may work but are not officially tested.
Can KEDA scale Jobs as well as Deployments?
Yes. KEDA provides a ScaledJob resource that creates Kubernetes Jobs in response to events. This is useful for batch processing where each event should spawn an independent, run-to-completion Job rather than a long-running pod.
How is KEDA different from the native HPA?
The native HPA scales on CPU, memory, or custom and external metrics that you expose through a metrics adapter. KEDA adds 70+ built-in trigger types and supports scaling to zero, which the native HPA does not do by default.
How does KEDA relate to Knative?
KEDA and Knative solve overlapping problems but are independent projects. KEDA focuses on scaling Deployments and Jobs, while Knative provides a broader serverless platform. Some teams use KEDA for worker scaling and Knative for HTTP-serving workloads.
Is KEDA production-ready?
KEDA is a CNCF-graduated project, which is the highest maturity level in the Cloud Native Computing Foundation. It is used in production by organizations running large Kubernetes clusters.
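The ScaledJob resource mentioned in the FAQ can be sketched as follows, reusing the SQS queue from the earlier example. The resource name, container name, and image are hypothetical, and the history limits are illustrative values:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: sqs-job-scaler                # hypothetical name
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: worker              # hypothetical container
            image: example.com/orders-worker:latest   # hypothetical image
        restartPolicy: Never          # Jobs require Never or OnFailure
  pollingInterval: 30                 # seconds between queue checks
  maxReplicaCount: 20                 # caps how many Jobs KEDA creates
  successfulJobsHistoryLimit: 5       # completed Jobs to keep around
  failedJobsHistoryLimit: 5           # failed Jobs to keep around
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456/orders
        queueLength: '5'
        awsRegion: us-east-1
      authenticationRef:
        name: aws-credentials
```

Unlike a ScaledObject, there is no scaleTargetRef here: the Job template is embedded in the resource, and each Job is expected to pull its work from the queue and exit.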