LitmusChaos — Cloud-Native Chaos Engineering for Kubernetes

Introduction

LitmusChaos is a CNCF incubating project that brings chaos engineering to Kubernetes. It provides a framework for running controlled failure experiments (pod kills, network delays, CPU stress, and more) so that teams can verify their applications recover gracefully under adverse conditions.

What LitmusChaos Does

Runs chaos experiments as Kubernetes CRDs with a declarative YAML workflow
Offers a ChaosHub with 50+ prebuilt experiments for pods, nodes, and infrastructure
Provides a web-based ChaosCenter for designing, scheduling, and observing experiments
Supports steady-state hypothesis checks to validate resilience automatically
Integrates with CI/CD pipelines to run chaos tests as part of deployment validation

Architecture Overview

LitmusChaos consists of a control plane (ChaosCenter) and an execution plane. ChaosCenter is a web application backed by MongoDB that manages experiment definitions and schedules. The execution plane runs in each target cluster as a set of operators: the Chaos Operator watches ChaosEngine resources and launches experiment pods that inject the specified failure. Results are reported back to ChaosCenter for analysis and visualization.
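
To make the execution-plane flow above more concrete, the sketch below builds a ChaosEngine resource as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The field names follow the litmuschaos.io/v1alpha1 ChaosEngine schema as it is commonly documented and may differ between Litmus versions. The engine name, namespace, label, and service account are hypothetical placeholders, and the pod-delete experiment is assumed to be installed in the target namespace already.

```python
import json


def build_chaos_engine(name, namespace, app_label, duration_seconds=30):
    """Return a ChaosEngine manifest (as a dict) for a pod-delete experiment.

    Nothing here talks to a cluster; the function only assembles the document.
    """
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "engineState": "active",
            # Which workload the experiment is allowed to target.
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": "deployment",
            },
            # Hypothetical service account; it must exist with suitable RBAC.
            "chaosServiceAccount": "pod-delete-sa",
            "experiments": [
                {
                    "name": "pod-delete",
                    "spec": {
                        "components": {
                            "env": [
                                # Total time, in seconds, to keep injecting the fault.
                                {"name": "TOTAL_CHAOS_DURATION", "value": str(duration_seconds)},
                                # Seconds between successive pod deletions.
                                {"name": "CHAOS_INTERVAL", "value": "10"},
                                # Share of matching pods to affect (a blast radius control).
                                {"name": "PODS_AFFECTED_PERC", "value": "50"},
                            ]
                        }
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    engine = build_chaos_engine("nginx-chaos", "default", "app=nginx")
    print(json.dumps(engine, indent=2))
```

Saving the output to a file and passing it to kubectl apply -f would hand the resource to the Chaos Operator, which then launches the experiment pod.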

Self-Hosting & Configuration

Deploy ChaosCenter via Helm chart or kubectl manifests into a management cluster
Register target clusters as Chaos Delegates through the ChaosCenter UI
Browse the ChaosHub to select and customize experiments
Define ChaosWorkflows combining multiple experiments with steady-state checks
Schedule recurring chaos tests via cron expressions in the workflow definition
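
The scheduling step above can be sketched as follows. The source does not name the underlying resource, so this example assumes the recurring workflow is expressed as an Argo CronWorkflow, the format Litmus workflows are generally built on. The workflow name, namespace, template name, and container image are hypothetical placeholders; a real workflow would contain the steps generated by ChaosCenter.

```python
import json


def build_cron_workflow(name, namespace, schedule):
    """Return an Argo CronWorkflow manifest (as a dict) with a placeholder step.

    `schedule` is a standard five-field cron expression, e.g. "0 2 * * 1".
    """
    if len(schedule.split()) != 5:
        raise ValueError("expected a five-field cron expression, got %r" % schedule)
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "CronWorkflow",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": schedule,
            # Skip a run if the previous one is still in progress.
            "concurrencyPolicy": "Forbid",
            "workflowSpec": {
                "entrypoint": "run-chaos",
                "templates": [
                    {
                        "name": "run-chaos",
                        "container": {
                            # Hypothetical image standing in for the real chaos steps.
                            "image": "example.invalid/chaos-runner:latest",
                            "args": ["placeholder"],
                        },
                    }
                ],
            },
        },
    }


if __name__ == "__main__":
    # Every Monday at 02:00 (cluster controller time zone).
    workflow = build_cron_workflow("weekly-pod-delete", "litmus", "0 2 * * 1")
    print(json.dumps(workflow, indent=2))
```

Keeping the cron expression in the workflow definition itself means the schedule is versioned together with the experiment steps.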

Key Features

CNCF incubating project with an active community and vendor-neutral governance
50+ prebuilt experiments covering pod, node, network, DNS, and cloud provider faults
GitOps-native experiment management with version-controlled workflow definitions
Observability integration with Prometheus metrics and Grafana dashboards
Multi-cluster chaos orchestration from a single ChaosCenter instance

Comparison with Similar Tools

Chaos Mesh — CNCF project offering similar Kubernetes-native chaos experiments; LitmusChaos offers a richer web UI and ChaosHub marketplace
Gremlin — Commercial SaaS chaos platform; LitmusChaos is fully open-source and self-hosted
AWS Fault Injection Simulator — AWS-only managed service; LitmusChaos works on any Kubernetes cluster
Pumba — Docker-level chaos tool; LitmusChaos operates at the Kubernetes abstraction layer with CRD-driven workflows

FAQ

Q: Can LitmusChaos cause production outages? A: It can if an experiment is targeted too broadly, because the injected faults are real. Experiments are scoped by namespace, labels, and blast radius controls, so start with non-production clusters and keep targeting narrow to reduce risk.

Q: Does it require ChaosCenter to run experiments? A: No. You can run experiments directly via ChaosEngine CRDs and kubectl without ChaosCenter, though the UI simplifies workflow management.
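
When experiments are run this way, the outcome has to be read from the cluster rather than from the UI. The sketch below assumes the result is recorded in a ChaosResult resource whose status.experimentStatus.verdict field holds a value such as Pass, Fail, or Awaited, and that its JSON has been fetched with something like kubectl get chaosresult <name> -n <namespace> -o json. The helper names are illustrative and not part of any Litmus client library.

```python
import json


def chaos_verdict(chaos_result):
    """Return the verdict string from a ChaosResult dict, or "Unknown" if absent."""
    status = chaos_result.get("status") or {}
    experiment_status = status.get("experimentStatus") or {}
    return experiment_status.get("verdict", "Unknown")


def experiment_passed(chaos_result):
    """True only when the experiment finished with a Pass verdict."""
    return chaos_verdict(chaos_result) == "Pass"


if __name__ == "__main__":
    # Trimmed-down sample document; a real ChaosResult carries more fields.
    sample = json.loads(
        '{"kind": "ChaosResult",'
        ' "status": {"experimentStatus": {"phase": "Completed", "verdict": "Pass"}}}'
    )
    print(chaos_verdict(sample))
    # A CI job could exit non-zero here to fail the pipeline on anything but Pass.
    raise SystemExit(0 if experiment_passed(sample) else 1)
```

Treating every verdict other than Pass as a failure, including a missing one, is a deliberately conservative choice for a deployment gate.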

Q: How do I create a custom chaos experiment? A: Write a Go or shell-based experiment, package it as a container image, and register it in a custom ChaosHub or inline in your workflow.

Q: What steady-state hypothesis checks are supported? A: Built-in probes support HTTP endpoints, command output, Kubernetes resource conditions, and Prometheus queries.
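
As an illustration of the HTTP case, the sketch below assembles a probe definition as a Python dictionary. It assumes the httpProbe schema as commonly documented, where the probe is listed under an experiment's spec.probe field in the ChaosEngine. The probe name and URL are hypothetical, and the format of the runProperties values (plain integers versus strings with units) has changed between Litmus releases, so check the version in use.

```python
import json

# Probe modes as commonly documented: start of test, end of test,
# both edges, continuously, or only while chaos is being injected.
VALID_MODES = {"SOT", "EOT", "Edge", "Continuous", "OnChaos"}


def build_http_probe(name, url, expected_code=200, mode="Continuous"):
    """Return an httpProbe definition that expects `expected_code` from a GET on `url`."""
    if mode not in VALID_MODES:
        raise ValueError("unsupported probe mode: %r" % mode)
    return {
        "name": name,
        "type": "httpProbe",
        "mode": mode,
        "httpProbe/inputs": {
            "url": url,
            "method": {
                "get": {"criteria": "==", "responseCode": str(expected_code)}
            },
        },
        # Assumed string-with-unit format; older releases use bare integers.
        "runProperties": {"probeTimeout": "5s", "interval": "2s", "retry": 1},
    }


if __name__ == "__main__":
    probe = build_http_probe(
        "frontend-availability",
        "http://frontend.default.svc:8080/healthz",  # hypothetical service URL
    )
    print(json.dumps(probe, indent=2))
```

With a Continuous probe, the experiment verdict reflects whether the endpoint kept returning the expected status code throughout the fault injection.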
