Introduction
Robusta sits between Prometheus/AlertManager and your notification channels, enriching every alert with Kubernetes context. When a pod crashes, Robusta automatically attaches logs, resource graphs, and related events to the alert before sending it to Slack, Microsoft Teams, or PagerDuty.
What Robusta Does
- Intercepts Prometheus alerts and enriches them with pod logs, events, and graphs
- Groups related alerts to reduce notification noise
- Runs automated remediation playbooks for common failure patterns
- Provides AI-powered root cause analysis summaries for each incident
- Offers a web dashboard for browsing enriched alerts and cluster health
Architecture Overview
Robusta deploys as a Helm chart with two main components: a runner that executes Python-based playbooks in response to alerts or Kubernetes events, and a forwarder that watches the Kubernetes API and relays cluster changes to the runner. The runner receives AlertManager webhooks and the forwarded Kubernetes events, triggers matching playbooks, enriches alerts with data gathered from the cluster, and sends the results to the configured sinks.
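The flow described above (receive an alert, run the playbooks whose triggers match, deliver the enriched result to sinks) can be sketched in a few lines of plain Python. This is an illustrative model only, not Robusta's code or API: `Alert`, `Playbook`, `AlertPipeline`, and `attach_placeholder_logs` are hypothetical names, the separate components are collapsed into one in-process object, and the sink is an in-memory list rather than Slack or Teams.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alert:
    """Hypothetical, simplified alert as it might arrive from a webhook."""
    name: str
    labels: Dict[str, str]
    enrichments: List[str] = field(default_factory=list)


@dataclass
class Playbook:
    """Pairs a trigger (does this alert match?) with an action (enrich it)."""
    trigger: Callable[[Alert], bool]
    action: Callable[[Alert], None]


class AlertPipeline:
    """Toy stand-in for the alert path: match playbooks, enrich, deliver."""

    def __init__(self, playbooks: List[Playbook], sinks: List[Callable[[Alert], None]]):
        self.playbooks = playbooks
        self.sinks = sinks

    def handle(self, alert: Alert) -> Alert:
        # Run every playbook whose trigger matches the incoming alert.
        for playbook in self.playbooks:
            if playbook.trigger(alert):
                playbook.action(alert)
        # Deliver the (possibly enriched) alert to every configured sink.
        for sink in self.sinks:
            sink(alert)
        return alert


def attach_placeholder_logs(alert: Alert) -> None:
    # A real enricher would query the cluster; this only records a placeholder.
    pod = alert.labels.get("pod", "unknown")
    alert.enrichments.append(f"logs for pod {pod}")


delivered: List[Alert] = []  # in-memory sink standing in for a chat channel

pipeline = AlertPipeline(
    playbooks=[
        Playbook(
            trigger=lambda a: a.name == "KubePodCrashLooping",
            action=attach_placeholder_logs,
        )
    ],
    sinks=[delivered.append],
)

result = pipeline.handle(Alert(name="KubePodCrashLooping", labels={"pod": "api-7f9c"}))
print(result.enrichments)
```

Keeping triggers and actions as separate callables mirrors the idea that one enrichment can be reused across several alert types, while the sinks stay unaware of how the alert was enriched.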
Self-Hosting & Configuration
- Install via Helm using a values file generated by the Robusta web platform
- Connect to existing Prometheus and AlertManager instances
- Configure notification sinks: Slack, Microsoft Teams, PagerDuty, Opsgenie, or webhooks
- Write custom playbooks in Python for organization-specific automation
- Enable the AI analysis feature to get GPT-powered summaries of alert causes
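To make the idea of a custom playbook more concrete, here is a minimal sketch of the decision logic such a playbook might contain. It deliberately does not use Robusta's real playbook API: `PodEvent`, `Remediation`, and `remediate` are hypothetical names, the thresholds are arbitrary assumptions, and the function only returns proposed actions instead of touching a cluster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PodEvent:
    """Hypothetical, simplified view of a pod event passed to a playbook."""
    namespace: str
    pod: str
    reason: str  # e.g. "CrashLoopBackOff" or "OOMKilled"
    restart_count: int
    memory_limit_mib: Optional[int] = None


@dataclass(frozen=True)
class Remediation:
    """A proposed action; nothing in this sketch changes a real cluster."""
    kind: str
    target: str
    detail: str


def remediate(event: PodEvent, max_restarts: int = 5) -> List[Remediation]:
    """Map a pod failure to a list of proposed remediation steps."""
    target = f"{event.namespace}/{event.pod}"

    if event.reason == "OOMKilled" and event.memory_limit_mib is not None:
        # Assumed policy: propose a memory limit 50% above the current one.
        new_limit = event.memory_limit_mib * 3 // 2
        detail = f"{event.memory_limit_mib}Mi -> {new_limit}Mi"
        return [Remediation("raise_memory_limit", target, detail)]

    if event.reason == "CrashLoopBackOff":
        if event.restart_count >= max_restarts:
            # Too many restarts: stop automating and escalate to a human.
            detail = f"{event.restart_count} restarts, manual review suggested"
            return [Remediation("notify_owner", target, detail)]
        return [Remediation("delete_pod", target, "let the controller recreate it")]

    # Unknown failure reasons produce no automated action.
    return []


for step in remediate(PodEvent("prod", "api-1", "OOMKilled", 1, memory_limit_mib=512)):
    print(step.kind, step.target, step.detail)
```

Returning a list of proposed steps, rather than acting immediately, keeps the logic easy to unit test and leaves room for a dry-run mode before any automation is allowed to modify workloads.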
Key Features
- Alert enrichment adds logs, graphs, and events to every notification automatically
- Smart grouping correlates related alerts into single actionable incidents
- Built-in playbooks for common issues like CrashLoopBackOff, OOMKilled, and node pressure
- Custom Python playbooks for automated remediation specific to your workloads
- Timeline view shows the sequence of events leading to each incident
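The grouping idea in the list above can be illustrated with a small sketch. The source does not describe how Robusta correlates alerts, so the rule used here is purely an assumption for illustration: alerts that share a namespace and workload and arrive within a fixed time window are folded into one incident. `Alert`, `Incident`, `group_alerts`, and `window_seconds` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Alert:
    name: str
    namespace: str
    workload: str
    timestamp: float  # seconds since an arbitrary start


@dataclass
class Incident:
    key: Tuple[str, str]  # (namespace, workload)
    alerts: List[Alert] = field(default_factory=list)


def group_alerts(alerts: List[Alert], window_seconds: float = 300.0) -> List[Incident]:
    """Fold alerts for the same workload into one incident per time window."""
    incidents: List[Incident] = []
    open_incidents: Dict[Tuple[str, str], Incident] = {}

    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.namespace, alert.workload)
        current = open_incidents.get(key)
        # Open a new incident if none exists for this workload, or if the
        # previous alert for it is older than the window.
        if current is None or alert.timestamp - current.alerts[-1].timestamp > window_seconds:
            current = Incident(key=key)
            open_incidents[key] = current
            incidents.append(current)
        current.alerts.append(alert)

    return incidents


alerts = [
    Alert("KubePodCrashLooping", "prod", "api", 0.0),
    Alert("CPUThrottlingHigh", "prod", "api", 60.0),
    Alert("KubePodCrashLooping", "prod", "worker", 90.0),
    Alert("KubePodCrashLooping", "prod", "api", 1000.0),
]
incidents = group_alerts(alerts)
for incident in incidents:
    print(incident.key, [a.name for a in incident.alerts])
```

In this example the two early `api` alerts land in one incident, the `worker` alert forms its own, and the late `api` alert starts a new incident because it falls outside the five-minute window.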
Comparison with Similar Tools
- AlertManager: routes and deduplicates alerts; Robusta adds enrichment, AI analysis, and remediation on top
- PagerDuty/Opsgenie: incident management platforms; Robusta enriches alerts before they reach these tools
- Komodor: commercial Kubernetes troubleshooting; Robusta is open source with a free tier
- Datadog: commercial monitoring with built-in alerts; Robusta works with existing Prometheus setups
- K8sGPT: AI diagnostics for cluster issues; Robusta focuses on alert enrichment and automated response
FAQ
Q: Does Robusta replace Prometheus? A: No. Robusta complements Prometheus by enriching its alerts with additional Kubernetes context before routing them.
Q: Can I write custom automation playbooks? A: Yes. Playbooks are Python functions that receive alert or event data and can take any action: restart pods, scale deployments, or call external APIs.
Q: Does the AI analysis send data externally? A: Yes. The AI feature sends anonymized alert summaries to an external model to generate the root cause analysis. It can be disabled if data privacy is a concern.
Q: What notification channels are supported? A: Slack, Microsoft Teams, PagerDuty, Opsgenie, Discord, Google Chat, and generic webhooks.