kOps — Production-Grade Kubernetes Cluster Management
Create, upgrade, and manage production Kubernetes clusters on AWS, GCE, and other clouds with kOps, the official Kubernetes operations tool.
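Agent-installable
This asset can be installed by an agent: the agent first selects the current runtime, checks the install plan, and then runs the matching command.
npx -y tokrepo@latest install 7a39c111-3997-11f1-9bc6-00163e2b0d79 --target codex
Confirm the install plan with a dry run first, then run this command.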
What it is
kOps (Kubernetes Operations) is the official tool for provisioning and managing production-grade Kubernetes clusters on cloud infrastructure. It automates the full lifecycle -- creation, upgrades, rolling updates, and teardown -- while following best practices for high availability and security.
It is built for platform engineers and DevOps teams who need to run Kubernetes on AWS, GCE, DigitalOcean, or other cloud providers without relying on managed services like EKS or GKE.
How it saves time or tokens
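kOps replaces dozens of manual steps (VPC setup, IAM roles, etcd configuration, node provisioning) with a single declarative spec. Rolling updates replace nodes one at a time, so workloads stay available as long as the control plane is highly available and applications run more than one replica. Running kops update cluster without --yes previews pending changes, and the validate command checks that nodes and system components are healthy, which shortens the debugging cycle.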
How to use
- Install kOps: brew install kops
- Create a state store: export KOPS_STATE_STORE=s3://my-kops-state
- Create a cluster: kops create cluster --name=k8s.example.com --zones=us-east-1a
- Apply the cluster: kops update cluster --name=k8s.example.com --yes
- Validate: kops validate cluster
Example
# Install kOps
brew install kops
# Set up state store
export KOPS_STATE_STORE=s3://my-kops-state
# Create a high-availability cluster
kops create cluster \
  --name=k8s.example.com \
  --zones=us-east-1a,us-east-1b,us-east-1c \
  --master-count=3 \
  --node-count=5 \
  --node-size=t3.large \
  --master-size=t3.medium \
  --networking=calico
# Preview changes
kops update cluster --name=k8s.example.com
# Apply
kops update cluster --name=k8s.example.com --yes
# Validate cluster health
kops validate cluster
# Generate Terraform output for GitOps
kops update cluster --name=k8s.example.com --target=terraform --out=./tf
Common pitfalls
- The S3 state store should have versioning enabled; without it, an accidental deletion or overwrite of the cluster spec cannot be rolled back
- DNS setup is required before cluster creation; kOps needs a real domain or gossip-based DNS (.k8s.local suffix)
- Upgrading Kubernetes versions requires running both kops update and kops rolling-update in sequence
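The first two pitfalls can be checked before any cluster is created. The sketch below is illustrative only: the bucket name and cluster names are placeholders, the run helper and the PRINT_ONLY switch are hypothetical conveniences (not part of kOps), and by default the script only prints the AWS CLI command instead of executing it.

```shell
#!/bin/sh
# Sketch only. PRINT_ONLY=1 (default) prints commands; PRINT_ONLY=0 executes them.
PRINT_ONLY="${PRINT_ONLY:-1}"
BUCKET="${BUCKET:-my-kops-state}"   # placeholder bucket name

# Hypothetical helper: print or execute a command.
run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

# Turn on object versioning for the state-store bucket.
enable_versioning() {
  run aws s3api put-bucket-versioning \
    --bucket "$1" \
    --versioning-configuration Status=Enabled
}

# Succeeds when the cluster name ends in the gossip suffix .k8s.local.
is_gossip_name() {
  case "$1" in
    *.k8s.local) return 0 ;;
    *) return 1 ;;
  esac
}

enable_versioning "$BUCKET"

for name in k8s.example.com demo.k8s.local; do
  if is_gossip_name "$name"; then
    echo "$name: gossip-based DNS"
  else
    echo "$name: needs a real DNS domain set up first"
  fi
done
```

Setting PRINT_ONLY=0 runs the command for real, which assumes the AWS CLI is installed and has credentials that can modify the bucket.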
FAQ
kOps officially supports AWS (most mature), GCE, DigitalOcean, Hetzner, and OpenStack. AWS has the most complete feature set including private topology, bastion hosts, and Terraform output generation.
kOps gives you full control over the Kubernetes control plane, which managed services abstract away. Use kOps when you need custom configurations, specific Kubernetes versions, or want to avoid vendor lock-in. Use managed services when you want less operational overhead.
kOps can generate Terraform. Run kops update cluster --target=terraform --out=./tf to generate Terraform HCL files. This enables GitOps workflows where infrastructure changes go through pull request reviews before they are applied.
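One way the Terraform path might look end to end is sketched below. The cluster name and output directory are placeholders, the run helper and PRINT_ONLY switch are hypothetical, and the terraform steps assume a Terraform version that supports the -chdir option. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. PRINT_ONLY=1 (default) prints commands; PRINT_ONLY=0 executes them.
PRINT_ONLY="${PRINT_ONLY:-1}"
CLUSTER="${CLUSTER:-k8s.example.com}"   # placeholder cluster name
OUT_DIR="${OUT_DIR:-./tf}"              # placeholder output directory

# Hypothetical helper: print or execute a command.
run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

terraform_workflow() {
  # Render Terraform files instead of changing cloud resources directly.
  run kops update cluster --name="$1" --target=terraform --out="$2" || return 1
  # Review and apply through the usual Terraform steps.
  run terraform -chdir="$2" init || return 1
  run terraform -chdir="$2" plan -out=kops.tfplan || return 1
  run terraform -chdir="$2" apply kops.tfplan
}

terraform_workflow "$CLUSTER" "$OUT_DIR"
```

In a pull-request workflow, the generated files would typically be committed and the plan output attached to the review before the apply step runs.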
kOps drains nodes one at a time, replaces them with updated instances, and waits for the new nodes to become ready before proceeding to the next. With a highly available control plane and workloads that run more than one replica, this keeps the cluster serving traffic throughout upgrades of both control plane and worker nodes.
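The update, rolling-update, and validate steps can be chained in a small script. This is a sketch under assumptions: the cluster name is a placeholder, the cluster spec is assumed to already name the target Kubernetes version, the 10-minute wait is an arbitrary choice, and the run helper and PRINT_ONLY switch are hypothetical. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. PRINT_ONLY=1 (default) prints commands; PRINT_ONLY=0 executes them.
PRINT_ONLY="${PRINT_ONLY:-1}"
CLUSTER="${CLUSTER:-k8s.example.com}"   # placeholder cluster name

# Hypothetical helper: print or execute a command.
run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

upgrade_cluster() {
  # 1. Apply the changed spec to the cloud resources.
  run kops update cluster --name="$1" --yes || return 1
  # 2. Replace instances so they pick up the new configuration.
  run kops rolling-update cluster --name="$1" --yes || return 1
  # 3. Wait for the cluster to report healthy (10m is an assumed timeout).
  run kops validate cluster --name="$1" --wait 10m
}

upgrade_cluster "$CLUSTER"
```

Running kops rolling-update cluster without --yes first shows which instance groups would be replaced, which is a cheap check before committing to the rollout.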
kOps supports multiple CNI plugins including Calico, Cilium, Flannel, and others. You choose the networking provider at cluster creation time. Private topology with bastion hosts is also supported on AWS.
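As an illustration of choosing the CNI and topology at creation time, the sketch below builds a create command for a private-topology cluster with a bastion host. The cluster name, zones, and the choice of Cilium are placeholders, KOPS_STATE_STORE is assumed to be exported already, and the run helper and PRINT_ONLY switch are hypothetical. The command also passes --dry-run -o yaml so that, even when executed, kOps prints the resulting manifest rather than creating anything.

```shell
#!/bin/sh
# Sketch only. PRINT_ONLY=1 (default) prints commands; PRINT_ONLY=0 executes them.
PRINT_ONLY="${PRINT_ONLY:-1}"
CLUSTER="${CLUSTER:-private.example.com}"   # placeholder cluster name
CNI="${CNI:-cilium}"                        # placeholder networking provider

# Hypothetical helper: print or execute a command.
run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

# Build a private-topology cluster spec with a bastion and the chosen CNI.
create_private_cluster() {
  run kops create cluster \
    --name="$1" \
    --zones=us-east-1a,us-east-1b,us-east-1c \
    --networking="$2" \
    --topology=private \
    --bastion \
    --dry-run -o yaml
}

create_private_cluster "$CLUSTER" "$CNI"
```

Dropping --dry-run -o yaml registers the cluster spec in the state store; the usual kops update cluster --yes step then creates the resources.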