Kubernetes Cluster Autoscaler — Node-Level Autoscaling for K8s
Official Kubernetes project that adds and removes nodes from cloud or Cluster API node groups to match scheduling pressure across AWS, Azure, GCP, and on-prem.
What it is
Kubernetes Cluster Autoscaler is the official project that automatically adjusts the number of nodes in your cluster to match workload demand. When pods cannot be scheduled due to insufficient resources, it adds nodes. When nodes are underutilized, it removes them. It supports AWS (EKS/ASG), Azure (AKS/VMSS), GCP (GKE/MIG), and on-premises Cluster API providers.
The autoscaler targets platform engineering teams running Kubernetes clusters that experience variable workload patterns. It ensures you have enough capacity for peak demand without paying for idle nodes during quiet periods.
How it saves time or tokens
Cluster Autoscaler removes the need for manual node pool management. Instead of pre-provisioning for peak load and wasting resources during off-peak, the autoscaler right-sizes your cluster automatically. For teams running batch jobs, CI/CD workloads, or applications with traffic patterns that vary by time of day, this translates directly to infrastructure cost savings. Scale-down eliminates idle node costs; scale-up prevents scheduling failures.
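The savings claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below compares a fleet statically sized for peak against an autoscaled fleet; every number (node price, fleet sizes, peak hours) is hypothetical and chosen only for illustration.

```python
# Back-of-the-envelope cost comparison: static peak-sized fleet vs. autoscaled.
# All figures are hypothetical assumptions, not real cloud prices.
HOURLY_NODE_COST = 0.20          # assumed on-demand price per node-hour
PEAK_NODES, OFF_PEAK_NODES = 20, 4
PEAK_HOURS_PER_DAY = 8

# Static fleet: pay for peak capacity around the clock.
static_daily = PEAK_NODES * 24 * HOURLY_NODE_COST

# Autoscaled fleet: peak capacity only during peak hours.
autoscaled_daily = (PEAK_NODES * PEAK_HOURS_PER_DAY
                    + OFF_PEAK_NODES * (24 - PEAK_HOURS_PER_DAY)) * HOURLY_NODE_COST

print(f"static: ${static_daily:.2f}/day, autoscaled: ${autoscaled_daily:.2f}/day")
# → static: $96.00/day, autoscaled: $44.80/day
```

With this (made-up) load profile, the autoscaled cluster costs less than half as much per day; the real ratio depends entirely on how spiky your workload is.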
How to use
- Install with Helm (AWS EKS example):
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  -n kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1
- Tag your Auto Scaling Groups with the auto-discovery tags k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster-name> so the autoscaler can find them.
- The autoscaler watches for unschedulable pods and adjusts node count accordingly.
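The auto-discovery tags can be attached however you manage your ASGs. A minimal sketch, assuming the cluster name my-cluster from the Helm example above and an eksctl/CloudFormation-style tag block (names are illustrative):

```yaml
# Hypothetical ASG tag block (e.g. in a CloudFormation or eksctl nodegroup definition).
# The <cluster-name> suffix must match autoDiscovery.clusterName.
Tags:
  - Key: k8s.io/cluster-autoscaler/enabled
    Value: "true"
    PropagateAtLaunch: true
  - Key: k8s.io/cluster-autoscaler/my-cluster
    Value: owned
    PropagateAtLaunch: true
```

With these tags in place, the autoscaler discovers the group automatically; no per-group flags are needed on the deployment.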
Example
Scale-up behavior when pods cannot be scheduled:
# Deploy a workload that requests more resources than the cluster currently has
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 50
  selector:
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      containers:
        - name: worker
          image: batch-worker:latest   # placeholder image
          resources:
            requests:
              cpu: '2'
              memory: '4Gi'
# Cluster Autoscaler detects the unschedulable pods and adds nodes to the
# node group until all 50 replicas are scheduled. Scale-up typically takes
# 2-5 minutes, depending on the cloud provider.
Scale-down happens automatically when nodes are underutilized for a configurable duration.
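The scale-down rule described above can be sketched as a simple predicate. This is an illustrative model only, not the autoscaler's actual code; the thresholds mirror the documented defaults (scale-down-utilization-threshold=0.5, scale-down-unneeded-time=10m), and the Node type is invented for the sketch.

```python
from dataclasses import dataclass

# Defaults from the Cluster Autoscaler flags (simplified to CPU only).
UTILIZATION_THRESHOLD = 0.5   # --scale-down-utilization-threshold
UNNEEDED_MINUTES = 10         # --scale-down-unneeded-time

@dataclass
class Node:
    requested_cpu: float          # sum of pod CPU requests on the node
    allocatable_cpu: float        # node allocatable CPU
    minutes_underutilized: float  # how long the node has been below threshold
    pods_movable: bool            # all pods reschedulable elsewhere (no strict PDB, etc.)

def eligible_for_scale_down(node: Node) -> bool:
    """A node is removable only if it is underutilized long enough
    AND every pod on it can land somewhere else."""
    utilization = node.requested_cpu / node.allocatable_cpu
    return (utilization < UTILIZATION_THRESHOLD
            and node.minutes_underutilized >= UNNEEDED_MINUTES
            and node.pods_movable)

# 30% utilized for 15 minutes, pods movable -> removable:
print(eligible_for_scale_down(Node(1.2, 4.0, 15, True)))   # True
# Same node, but a strict PodDisruptionBudget pins a pod -> kept:
print(eligible_for_scale_down(Node(1.2, 4.0, 15, False)))  # False
```

The third condition is why the PDB pitfall below matters: utilization alone never triggers a node removal.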
Related on TokRepo
- AI Tools for DevOps — DevOps tools for Kubernetes management and deployment
- AI Tools for Monitoring — Monitoring tools for tracking cluster resource utilization
Common pitfalls
- Scale-up takes 2-5 minutes because cloud providers need to provision new VMs. For latency-sensitive workloads, maintain a small buffer of over-provisioned capacity.
- Pod disruption budgets (PDBs) can prevent scale-down. Nodes with pods protected by strict PDBs will not be drained, even when underutilized.
- The autoscaler does not act on utilization metrics such as CPU usage or request rate; it reacts only to scheduling pressure (pending pods) and node emptiness. Use the Horizontal Pod Autoscaler (HPA) for metric-driven, application-level scaling.
- Always check the official documentation for the latest version-specific changes and migration guides before upgrading in production environments.
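The over-provisioned buffer mentioned in the first pitfall is commonly implemented with low-priority placeholder pods: real workloads preempt them instantly, and the autoscaler then adds a node to reschedule the evicted placeholders. A minimal sketch, with assumed names and sizes (the negative priority value and the pause image follow the pattern described in the Cluster Autoscaler FAQ):

```yaml
# Sketch of the "overprovisioning" buffer pattern; names and sizes are illustrative.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10            # lower than any real workload, so real pods preempt these
globalDefault: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-buffer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: capacity-buffer
  template:
    metadata:
      labels:
        app: capacity-buffer
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # does nothing; only reserves capacity
          resources:
            requests:
              cpu: '1'
              memory: '2Gi'
```

Sizing the buffer is a cost/latency trade-off: larger placeholder requests absorb bigger bursts without waiting 2-5 minutes for a new VM.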
Frequently Asked Questions
How is Cluster Autoscaler different from the Horizontal Pod Autoscaler (HPA)?
Cluster Autoscaler adds/removes nodes (infrastructure scaling). HPA adds/removes pod replicas (application scaling). They work together: HPA increases pod count based on metrics, and if pods cannot be scheduled, Cluster Autoscaler adds nodes to accommodate them.
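A minimal sketch of that interplay, assuming the batch-processor Deployment from the example above (names and targets are illustrative): the HPA scales replicas on CPU, and Cluster Autoscaler supplies nodes whenever new replicas go pending.

```yaml
# Hypothetical HPA paired with Cluster Autoscaler: HPA scales pods,
# Cluster Autoscaler scales the nodes those pods land on.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: batch-processor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: batch-processor
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```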
How long does scale-up take?
Scale-up typically takes 2-5 minutes. The autoscaler detects unschedulable pods immediately, but provisioning new cloud VMs takes time. The exact duration depends on your cloud provider and instance type.
When does scale-down happen?
Scale-down occurs when a node has been underutilized (below a configurable threshold, default 50%) for a configurable duration (default 10 minutes). The autoscaler checks that all pods on the node can be moved to other nodes before removing it.
Does it work with spot or preemptible instances?
Yes. You can configure node groups with spot or preemptible instances. The autoscaler manages these groups like regular node groups, adding and removing instances as needed. Spot interruptions are handled by Kubernetes pod rescheduling.
Which providers are supported?
Cluster Autoscaler supports AWS (EKS with Auto Scaling Groups), Azure (AKS with VMSS), GCP (GKE with Managed Instance Groups), and on-premises clusters using Cluster API. Each provider has a dedicated implementation.
Citations (3)
- Cluster Autoscaler GitHub — Kubernetes Cluster Autoscaler is the official node-level autoscaler
- Cluster Autoscaler FAQ — AWS, Azure, GCP, and Cluster API provider support
- Kubernetes Documentation — Kubernetes autoscaling concepts