Scripts · Apr 16, 2026 · 3 min read

LitmusChaos — Cloud-Native Chaos Engineering for Kubernetes

Inject controlled failures into your Kubernetes workloads to test resilience. A CNCF incubating project with a library of 50+ chaos experiments.

Introduction

LitmusChaos is a CNCF incubating project that brings chaos engineering to Kubernetes. It provides a framework for running controlled failure experiments, such as pod kills, network delays, and CPU stress, so teams can verify that their applications recover gracefully under adverse conditions.

What LitmusChaos Does

  • Runs chaos experiments as Kubernetes CRDs with a declarative YAML workflow
  • Offers a ChaosHub with 50+ prebuilt experiments for pods, nodes, and infrastructure
  • Provides a web-based ChaosCenter for designing, scheduling, and observing experiments
  • Supports steady-state hypothesis checks to validate resilience automatically
  • Integrates with CI/CD pipelines to run chaos tests as part of deployment validation

Architecture Overview

LitmusChaos consists of a control plane (ChaosCenter) and an execution plane. ChaosCenter is a web application backed by MongoDB that manages experiment definitions and schedules. The execution plane runs in each target cluster as a set of operators: the Chaos Operator watches ChaosEngine CRDs and launches experiment pods that inject the specified failure. Results are reported back to ChaosCenter for analysis and visualization.
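As a rough illustration of the kind of object the Chaos Operator watches, the sketch below builds a ChaosEngine manifest in Python and prints it as JSON, which kubectl accepts as well as YAML. The field layout follows the `litmuschaos.io/v1alpha1` ChaosEngine schema as commonly documented for the `pod-delete` experiment. The engine name, namespace, label, and service account are hypothetical placeholders, and field names may differ between Litmus versions, so treat this as a starting point rather than a reference manifest.

```python
import json


def build_chaos_engine(name, namespace, app_label, service_account,
                       duration_s=30, interval_s=10, pods_affected_perc=50):
    """Return a ChaosEngine manifest (as a dict) for the pod-delete experiment."""
    env = [
        # Total time (seconds) during which chaos is injected.
        {"name": "TOTAL_CHAOS_DURATION", "value": str(duration_s)},
        # Pause (seconds) between successive pod deletions.
        {"name": "CHAOS_INTERVAL", "value": str(interval_s)},
        # Share of matching pods to target; keeps the blast radius small.
        {"name": "PODS_AFFECTED_PERC", "value": str(pods_affected_perc)},
    ]
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "engineState": "active",
            # appinfo selects the workload under test by namespace, label, kind.
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": "deployment",
            },
            "chaosServiceAccount": service_account,
            "experiments": [
                {"name": "pod-delete", "spec": {"components": {"env": env}}}
            ],
        },
    }


# Hypothetical values: an nginx deployment in the "default" namespace.
engine = build_chaos_engine(
    name="nginx-chaos",
    namespace="default",
    app_label="app=nginx",
    service_account="pod-delete-sa",
)
print(json.dumps(engine, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` would hand the engine to the operator, assuming the Litmus CRDs, the experiment definition, and the service account already exist in the cluster.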

Self-Hosting & Configuration

  • Deploy ChaosCenter via Helm chart or kubectl manifests into a management cluster
  • Register target clusters as Chaos Delegates through the ChaosCenter UI
  • Browse the ChaosHub to select and customize experiments
  • Define ChaosWorkflows combining multiple experiments with steady-state checks
  • Schedule recurring chaos tests via cron expressions in the workflow definition
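The scheduling step above can be sketched in Python. This assumes the workflow is stored as an Argo `CronWorkflow`, which is the resource type Litmus has used for scheduled workflows; the exact manifest that ChaosCenter generates may look different. The workflow name, the `litmus` namespace, and the template contents are hypothetical, and the helper only checks that the cron expression has five fields.

```python
import json


def schedule_workflow(name, namespace, cron, workflow_spec):
    """Wrap an Argo workflow spec in a CronWorkflow that runs on `cron`."""
    # Standard cron expressions have five fields:
    # minute, hour, day-of-month, month, day-of-week.
    if len(cron.split()) != 5:
        raise ValueError(f"expected a 5-field cron expression, got {cron!r}")
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "CronWorkflow",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": cron,
            # Skip a run if the previous one is still in progress.
            "concurrencyPolicy": "Forbid",
            "workflowSpec": workflow_spec,
        },
    }


# Hypothetical single-step workflow; real templates would apply a
# ChaosEngine and run the steady-state checks.
workflow_spec = {
    "entrypoint": "run-chaos",
    "templates": [
        {
            "name": "run-chaos",
            "container": {
                "image": "alpine:3",
                "command": ["sh", "-c"],
                "args": ["echo 'apply the ChaosEngine here'"],
            },
        }
    ],
}

# Every Monday at 02:00.
cron_workflow = schedule_workflow(
    "weekly-pod-delete", "litmus", "0 2 * * 1", workflow_spec
)
print(json.dumps(cron_workflow, indent=2))
```

Keeping the schedule inside the manifest means the cadence is versioned together with the experiment definition.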

Key Features

  • CNCF incubating project with an active community and vendor-neutral governance
  • 50+ prebuilt experiments covering pod, node, network, DNS, and cloud provider faults
  • GitOps-native experiment management with version-controlled workflow definitions
  • Observability integration with Prometheus metrics and Grafana dashboards
  • Multi-cluster chaos orchestration from a single ChaosCenter instance

Comparison with Similar Tools

  • Chaos Mesh — CNCF project with similar Kubernetes-native chaos; LitmusChaos offers a richer web UI and ChaosHub marketplace
  • Gremlin — Commercial SaaS chaos platform; LitmusChaos is fully open-source and self-hosted
  • AWS Fault Injection Simulator — AWS-only managed service; LitmusChaos works on any Kubernetes cluster
  • Pumba — Docker-level chaos tool; LitmusChaos operates at the Kubernetes abstraction layer with CRD-driven workflows

FAQ

Q: Can LitmusChaos cause production outages?
A: It can if an experiment is scoped too broadly, because the faults it injects are real. Experiments are scoped by namespace, labels, and blast radius controls, so start with non-production clusters and narrow the targeting to reduce risk.

Q: Does it require ChaosCenter to run experiments?
A: No. You can run experiments directly via ChaosEngine CRDs and kubectl without ChaosCenter, though the UI simplifies workflow management.

Q: How do I create a custom chaos experiment?
A: Write a Go or shell-based experiment, package it as a container image, and register it in a custom ChaosHub or reference it inline in your workflow.

Q: What steady-state hypothesis checks are supported?
A: Built-in probes support HTTP endpoints, command output, Kubernetes resource conditions, and Prometheus queries.
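To make the HTTP probe concrete, here is a minimal Python sketch that builds an `httpProbe` entry and attaches it to an experiment under `spec.probe`. The key names follow the Litmus probe schema as commonly documented. The probe name and URL are hypothetical, and the `runProperties` keys and units have changed between Litmus releases, so the values shown are assumptions to check against the version you run.

```python
import json

# Probe modes defined by Litmus: start of test, end of test, both edges,
# continuously, or only while chaos is being injected.
VALID_MODES = {"SOT", "EOT", "Edge", "Continuous", "OnChaos"}


def http_probe(name, url, expected_code="200", mode="Continuous",
               run_properties=None):
    """Return an httpProbe that passes while `url` answers `expected_code`."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown probe mode: {mode!r}")
    return {
        "name": name,
        "type": "httpProbe",
        "mode": mode,
        "httpProbe/inputs": {
            "url": url,
            "method": {
                "get": {"criteria": "==", "responseCode": expected_code}
            },
        },
        # Timeouts and retries; key names and units depend on the version.
        "runProperties": run_properties or {},
    }


# Hypothetical experiment entry, as it would appear in a ChaosEngine's
# spec.experiments list.
experiment = {
    "name": "pod-delete",
    "spec": {
        "probe": [
            http_probe(
                "frontend-is-up",
                "http://frontend.default.svc:80/healthz",
                run_properties={"probeTimeout": "5s", "interval": "2s", "retry": 1},
            )
        ]
    },
}
print(json.dumps(experiment, indent=2))
```

With a `Continuous` probe, the experiment verdict depends on the endpoint staying healthy for the whole run, not only on the fault being injected successfully.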
