Kubernetes Cluster Autoscaler — Node-Level Autoscaling for K8s
Official Kubernetes project that adds and removes nodes from cloud or Cluster API node groups to match scheduling pressure across AWS, Azure, GCP, and on-prem.
Agent-ready installation
This asset can be installed after you choose the runtime, review the plan, and run the corresponding command.
npx -y tokrepo@latest install 5c1a9982-38d7-11f1-9bc6-00163e2b0d79 --target codex
Run the command after confirming the plan with a dry run.
What it is
Kubernetes Cluster Autoscaler is the official project that automatically adjusts the number of nodes in your cluster to match workload demand. When pods cannot be scheduled due to insufficient resources, it adds nodes. When nodes are underutilized, it removes them. It supports AWS (EKS/ASG), Azure (AKS/VMSS), GCP (GKE/MIG), and on-premises Cluster API providers.
The autoscaler targets platform engineering teams running Kubernetes clusters that experience variable workload patterns. It ensures you have enough capacity for peak demand without paying for idle nodes during quiet periods.
How it saves time or tokens
Cluster Autoscaler removes the need for manual node pool management. Instead of pre-provisioning for peak load and wasting resources during off-peak, the autoscaler right-sizes your cluster automatically. For teams running batch jobs, CI/CD workloads, or applications with traffic patterns that vary by time of day, this translates directly to infrastructure cost savings. Scale-down eliminates idle node costs; scale-up prevents scheduling failures.
How to use
- Install with Helm (AWS EKS example):
  helm repo add autoscaler https://kubernetes.github.io/autoscaler
  helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
    -n kube-system \
    --set autoDiscovery.clusterName=my-cluster \
    --set awsRegion=us-east-1
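- Tag your Auto Scaling Groups so auto-discovery can find them. By default the Helm chart looks for the tag keys k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/my-cluster (the second key ends with your cluster name).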
- The autoscaler watches for unschedulable pods and adjusts node count accordingly.
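The same Helm settings can be kept in a values file instead of --set flags, which is easier to review and version. This is a minimal sketch: the file name values.yaml is an assumption, and my-cluster and us-east-1 are the placeholder values from the command above.

```yaml
# values.yaml (hypothetical file name) for the autoscaler/cluster-autoscaler chart.
# Apply with:
#   helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
#     -n kube-system -f values.yaml
cloudProvider: aws          # the chart's default provider, stated explicitly here
awsRegion: us-east-1        # placeholder region from the example above
autoDiscovery:
  clusterName: my-cluster   # placeholder; must match the cluster name in your ASG tags
```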
Example
Scale-up behavior when pods cannot be scheduled:
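# Deploy a workload that requires more resources than currently available
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 50
  selector:                 # required by apps/v1; must match the template labels
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      containers:
        - name: worker
          image: registry.k8s.io/pause:3.9   # placeholder image; substitute your worker image
          resources:
            requests:
              cpu: '2'
              memory: '4Gi'
# Cluster Autoscaler detects the unschedulable pods
# It adds nodes to the node group until all 50 pods are scheduled or the group's maximum size is reached
# Scale-up typically takes 2-5 minutes depending on cloud provider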
Scale-down happens automatically when nodes are underutilized for a configurable duration.
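Scale-down timing is controlled by Cluster Autoscaler command-line flags, which the Helm chart accepts under extraArgs. The fragment below is a sketch that spells out the 50% threshold and 10-minute duration this page describes as defaults. Check the flag names and defaults against the Cluster Autoscaler version you actually run.

```yaml
# Fragment of a Helm values file for the cluster-autoscaler chart.
# Each key under extraArgs is passed to the autoscaler as --<key>=<value>.
extraArgs:
  # A node is a scale-down candidate when the sum of its pods' requests
  # falls below this fraction of the node's allocatable capacity.
  scale-down-utilization-threshold: 0.5
  # How long a node must stay unneeded before it is removed.
  scale-down-unneeded-time: 10m
  # How long to wait after a scale-up before considering scale-down again.
  scale-down-delay-after-add: 10m
```

Lowering the threshold or lengthening the durations makes scale-down more conservative, trading some idle cost for fewer node churn events.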
Related on TokRepo
- AI Tools for DevOps — DevOps tools for Kubernetes management and deployment
- AI Tools for Monitoring — Monitoring tools for tracking cluster resource utilization
Common pitfalls
- Scale-up takes 2-5 minutes because cloud providers need to provision new VMs. For latency-sensitive workloads, maintain a small buffer of over-provisioned capacity.
- Pod disruption budgets (PDBs) can prevent scale-down. Nodes with pods protected by strict PDBs will not be drained, even when underutilized.
- The autoscaler does not consider actual resource usage or external metrics (CPU utilization, request rate). It reacts only to scheduling pressure and to the resource requests of the pods on each node. Use Horizontal Pod Autoscaler (HPA) for application-level scaling.
- Always check the official documentation for the latest version-specific changes and migration guides before upgrading in production environments.
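Two of the pitfalls above can be sketched as manifests. The first two documents show one common way to hold an over-provisioned buffer: low-priority placeholder pods reserve capacity, regular pods preempt them when they arrive, and the evicted placeholders then become pending and trigger a scale-up in the background. The third is a PDB that protects availability while still letting a node drain. All names, the replica count, and the app: batch-processor label are assumptions for illustration.

```yaml
# Placeholder capacity: low-priority pods that regular pods can preempt.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning        # hypothetical name
value: -1                       # below the default priority of 0, but not below the
                                # autoscaler's expendable-pod cutoff (-10 by default),
                                # so pending placeholders still trigger scale-up
globalDefault: false
description: Priority class for placeholder pods that hold spare capacity.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-buffer         # hypothetical name
spec:
  replicas: 2                   # buffer size; an assumption to tune per workload
  selector:
    matchLabels:
      app: capacity-buffer
  template:
    metadata:
      labels:
        app: capacity-buffer
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # does nothing; only its requests matter
          resources:
            requests:
              cpu: '2'          # sized like one worker pod from the example
              memory: '4Gi'
---
# A PDB that allows one pod at a time to be evicted, so nodes can still drain.
# Assumes the worker pods carry the label app: batch-processor.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-processor
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: batch-processor
```

A PDB with maxUnavailable: 0, or minAvailable equal to the replica count, is the strict case that blocks scale-down entirely.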
Frequently asked questions
How does Cluster Autoscaler differ from the Horizontal Pod Autoscaler (HPA)?
Cluster Autoscaler adds and removes nodes (infrastructure scaling). HPA adds and removes pod replicas (application scaling). They work together: HPA increases pod count based on metrics, and if the new pods cannot be scheduled, Cluster Autoscaler adds nodes to accommodate them.
How long does scale-up take?
Scale-up typically takes 2-5 minutes. The autoscaler notices unschedulable pods within its scan interval (10 seconds by default), but provisioning new cloud VMs takes time. The exact duration depends on your cloud provider and instance type.
When does scale-down happen?
Scale-down occurs when a node has been underutilized for a configurable duration (default 10 minutes). A node counts as underutilized when the sum of its pods' resource requests is below a configurable fraction of the node's allocatable capacity (default 50%). Before removing a node, the autoscaler checks that all of its pods can be moved to other nodes.
Does it work with spot or preemptible instances?
Yes. You can configure node groups with spot or preemptible instances. The autoscaler manages these groups like regular node groups, adding and removing instances as needed. Spot interruptions are handled by Kubernetes pod rescheduling.
Which providers are supported?
Cluster Autoscaler supports AWS (EKS with Auto Scaling Groups), Azure (AKS with VMSS), GCP (GKE with Managed Instance Groups), and on-premises clusters using Cluster API. Each provider has a dedicated implementation.
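As a sketch of how HPA and Cluster Autoscaler work together, the manifest below scales a Deployment on CPU utilization. It assumes a Deployment named batch-processor and a metrics source such as metrics-server in the cluster. The replica bounds and the 70% target are illustrative values, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: batch-processor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: batch-processor       # assumed target Deployment
  minReplicas: 2                # illustrative lower bound
  maxReplicas: 50               # illustrative upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of each pod's CPU request
```

When HPA raises the replica count past what the current nodes can hold, the extra pods stay pending, and that pending state is what Cluster Autoscaler reacts to.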
Related assets
Karpenter — Just-in-Time Kubernetes Node Autoscaler for AWS
AWS-origin cluster autoscaler that launches the right EC2 instance shape and size for pending pods in ~30 seconds.
KEDA — Kubernetes Event-Driven Autoscaling
CNCF-graduated autoscaler that scales Kubernetes workloads to and from zero using 70+ event sources like Kafka, SQS, Prometheus and Redis.
Cluster API — Declarative Kubernetes Cluster Lifecycle Management
A Kubernetes sub-project that uses declarative CRDs to provision, upgrade, and manage the lifecycle of Kubernetes clusters across multiple infrastructure providers.
kube-state-metrics — Kubernetes Cluster State Metrics Exporter
kube-state-metrics is a Kubernetes add-on that listens to the API server and generates Prometheus metrics about the state of Kubernetes objects like deployments, nodes, and pods.