ChaosBlade — Cloud-Native Chaos Engineering Toolkit by Alibaba

Introduction

ChaosBlade is an open-source chaos engineering toolkit created by Alibaba that helps teams verify the resilience of distributed systems. It provides a unified CLI and a Kubernetes operator for injecting faults at the OS, container, pod, and application layers without writing custom failure scripts.

What ChaosBlade Does

Injects network faults (delay, loss, corruption, partition) at the OS and container level
Simulates CPU, memory, disk, and process failures on bare-metal and virtual hosts
Targets Kubernetes pods, nodes, and containers through a CRD-based operator
Injects application-level faults into JVM processes (method delay, thrown exceptions, thread pool exhaustion)
Provides a destroy command to cleanly roll back any active experiment

Architecture Overview

ChaosBlade is built on a plugin model. The core CLI (blade) dispatches each experiment command to an executor specific to the target: chaosblade-exec-os for host-level faults, chaosblade-exec-docker for containers, and chaosblade-operator for Kubernetes resources. JVM faults are injected by a Java agent attached to the target process. Each experiment is assigned a unique ID so that it can be queried or destroyed independently. The Kubernetes operator watches ChaosBlade custom resources and translates them into fault injections on the selected pods.
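
As a rough illustration of the ID-based tracking described above, the following Python sketch assembles a blade create command and reads the experiment ID back from the CLI's reply. The helper names are hypothetical. The reply shape (a JSON object with success and result fields) and the flags shown (--time, --interface, --local-port) are assumptions that should be checked against the help output of the installed blade version.

```python
import json
import shlex


def build_create_command(target, action, **flags):
    """Return argv for `blade create <target> <action> --flag value ...`.

    Keyword names are turned into CLI flags (local_port -> --local-port).
    """
    argv = ["blade", "create", target, action]
    for name, value in flags.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


def parse_experiment_id(stdout):
    """Extract the experiment ID from the CLI reply.

    Assumed reply shape: {"code": 200, "success": true, "result": "<id>"}.
    """
    reply = json.loads(stdout)
    if not reply.get("success"):
        raise RuntimeError(f"experiment was not created: {reply}")
    return reply["result"]


def build_destroy_command(experiment_id):
    """Return argv that rolls back one experiment by its ID."""
    return ["blade", "destroy", experiment_id]


if __name__ == "__main__":
    create = build_create_command(
        "network", "delay", time=3000, interface="eth0", local_port=8080
    )
    print(shlex.join(create))

    # Illustrative reply only; the ID is made up.
    sample_reply = '{"code": 200, "success": true, "result": "example-id"}'
    print(shlex.join(build_destroy_command(parse_experiment_id(sample_reply))))
```

Keeping command construction separate from execution makes it possible to log or review the exact fault before anything is injected.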

Self-Hosting & Configuration

Download the prebuilt binary for Linux or macOS from the GitHub releases page
Deploy the chaosblade-operator via Helm for Kubernetes chaos experiments
Define experiments in YAML CRDs specifying target scope, fault type, and duration
Use the blade CLI directly for ad-hoc host or container experiments without Kubernetes
Integrate with the ChaosBlade Box web platform for visual experiment orchestration and scheduling
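
To make the CRD-based workflow more concrete, here is a minimal Python sketch that builds a ChaosBlade custom resource as a dictionary and prints it as JSON, which kubectl apply -f accepts as well as YAML. The apiVersion chaosblade.io/v1alpha1 and the scope/target/action/matchers layout follow the operator's published examples as best recalled. The function names, the resource name, and the specific matcher names (namespace, labels, interface, time) are illustrative assumptions and should be verified against the operator version in use.

```python
import json


def matcher(name, *values):
    """One matcher entry; the operator expects values as a list of strings."""
    return {"name": name, "value": [str(v) for v in values]}


def chaosblade_manifest(name, scope, target, action, matchers):
    """Build a ChaosBlade custom resource holding a single experiment."""
    return {
        "apiVersion": "chaosblade.io/v1alpha1",  # assumed API group/version
        "kind": "ChaosBlade",
        "metadata": {"name": name},
        "spec": {
            "experiments": [
                {
                    "scope": scope,      # e.g. pod, container, node
                    "target": target,    # e.g. network, cpu
                    "action": action,    # e.g. delay, fullload
                    "matchers": list(matchers),
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = chaosblade_manifest(
        name="delay-demo-pods",  # hypothetical resource name
        scope="pod",
        target="network",
        action="delay",
        matchers=[
            matcher("namespace", "default"),
            matcher("labels", "app=demo"),  # hypothetical label selector
            matcher("interface", "eth0"),
            matcher("time", 3000),          # delay in milliseconds
        ],
    )
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with kubectl would start the experiment, and deleting the resource would be the corresponding rollback.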

Key Features

Unified experiment model covers hosts, Docker containers, Kubernetes pods, and JVM applications with the same CLI syntax
Atomic experiment design ensures every fault has a matching destroy command for safe rollback
Kubernetes label selectors and namespace scoping limit blast radius to specific pods or services
JVM sandbox engine injects faults at the bytecode level without restarting the target application
Experiment history and status tracking via the blade status command for audit and debugging
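
The create/destroy pairing lends itself to a try/finally pattern. The sketch below wraps an experiment in a Python context manager so that destroy runs even if the checks inside the block raise. The names run_blade and experiment are hypothetical, and the code assumes the blade binary is on PATH and prints a JSON reply with success and result fields on stdout.

```python
import json
import subprocess
from contextlib import contextmanager


def run_blade(args):
    """Run `blade <args...>` and return its decoded JSON reply (assumed format)."""
    done = subprocess.run(
        ["blade", *args], capture_output=True, text=True, check=False
    )
    return json.loads(done.stdout)


@contextmanager
def experiment(create_args, run=run_blade):
    """Create a fault, yield its ID, and always destroy it afterwards."""
    reply = run(["create", *create_args])
    if not reply.get("success"):
        raise RuntimeError(f"create failed: {reply}")
    experiment_id = reply["result"]
    try:
        yield experiment_id
    finally:
        # Roll back even when the body raised or was interrupted.
        run(["destroy", experiment_id])
```

A caller might write with experiment(["cpu", "fullload"]) as uid: followed by its health checks. The run parameter exists so that the CLI call can be replaced with a stub in tests.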

Comparison with Similar Tools

Chaos Mesh — CNCF project with a web dashboard; ChaosBlade offers broader target coverage including JVM and host-level faults from a single CLI
Litmus — Kubernetes-native chaos with ChaosHub experiment library; ChaosBlade provides a simpler CLI-first experience with less Kubernetes overhead
Gremlin — commercial SaaS chaos platform; ChaosBlade is fully open-source and self-hosted
Pumba — Docker-specific chaos tool; ChaosBlade supports Docker plus Kubernetes, hosts, and JVM targets
Toxiproxy — network fault proxy for testing; ChaosBlade injects faults at the kernel level without proxying traffic

FAQ

Q: Is ChaosBlade safe to run in production? A: Yes, with precautions. Every experiment has a destroy command. Use Kubernetes label selectors to limit scope, and start with non-critical services.

Q: Does ChaosBlade require root access? A: Most OS-level experiments (network, disk, CPU) require root or equivalent privileges. JVM experiments can run as the application user.

Q: Can I schedule recurring chaos experiments? A: The ChaosBlade Box web platform supports scheduled experiments. The CLI itself is stateless and can be triggered by cron or CI pipelines.

Q: What languages does ChaosBlade support for application-level faults? A: JVM-based languages (Java, Kotlin, Scala) are natively supported via the Java agent. C++ applications can be targeted via the cplus executor.
