Scripts · Jul 1, 2026 · 3 min read

Kuberhealthy — Synthetic Health Monitoring for Kubernetes

Kuberhealthy is a Kubernetes operator that runs synthetic health checks as pods to continuously validate cluster functionality, networking, DNS, storage, and application health from an end-user perspective.


Introduction

Kuberhealthy runs synthetic monitoring checks inside Kubernetes clusters to validate that the cluster and its components actually work from a workload perspective. Rather than just checking component status, Kuberhealthy launches real pods that exercise DNS resolution, create deployments, test network connectivity, and validate storage provisioning, surfacing failures before they affect users.

What Kuberhealthy Does

  • Runs health check pods on a schedule to validate cluster subsystems
  • Tests DNS resolution, deployment creation, pod scheduling, and network connectivity
  • Exposes check results via a Prometheus-compatible metrics endpoint
  • Provides a built-in status page showing check results across the cluster
  • Supports custom check images for application-specific health validation
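The status page can also be consumed programmatically. The sketch below assumes the status endpoint returns a JSON document with top-level "OK" and "Errors" fields and a "CheckDetails" map keyed by namespace/check name; those field names and the sample payload are assumptions for illustration, so verify them against the response from your own installation.

```python
import json

# Illustrative payload only; the field names are assumed, not taken from this page.
SAMPLE_STATUS = """
{
  "OK": false,
  "Errors": ["kuberhealthy/dns-status-internal: timed out"],
  "CheckDetails": {
    "kuberhealthy/daemonset": {
      "OK": true, "Errors": [], "Namespace": "kuberhealthy"
    },
    "kuberhealthy/dns-status-internal": {
      "OK": false, "Errors": ["timed out"], "Namespace": "kuberhealthy"
    }
  }
}
"""


def failing_checks(status_json):
    """Return {check name: [error messages]} for every check that is not OK."""
    status = json.loads(status_json)
    details = status.get("CheckDetails") or {}
    return {
        name: list(detail.get("Errors") or [])
        for name, detail in details.items()
        if not detail.get("OK", False)
    }


if __name__ == "__main__":
    for name, errors in failing_checks(SAMPLE_STATUS).items():
        print(f"{name}: {'; '.join(errors) or 'no error message'}")
```

In practice the payload would come from an HTTP GET against the Kuberhealthy service rather than from an embedded string.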

Architecture Overview

Kuberhealthy runs as a deployment that watches KuberhealthyCheck (khcheck) custom resources. Each check defines a pod spec (including the container image), a run interval, and a timeout. On each interval, the operator creates a checker pod that runs the check container. The check container performs its validation and reports success or failure back to the Kuberhealthy API before the timeout expires. Results are stored as KuberhealthyState (khstate) custom resources and exposed as Prometheus metrics.
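To make the shape of a check resource concrete, here is a minimal sketch that builds a KuberhealthyCheck manifest as a Python dictionary and prints it as JSON. The API group comcast.github.io/v1 and the spec fields runInterval, timeout, and podSpec reflect Kuberhealthy v2 releases as far as I know; the check name and image below are hypothetical, and the schema should be confirmed against the CRD installed in your cluster.

```python
import json


def build_khcheck(name, namespace, image, run_interval="5m", timeout="2m"):
    """Build a KuberhealthyCheck manifest (schema assumed from Kuberhealthy v2)."""
    return {
        "apiVersion": "comcast.github.io/v1",  # assumed API group/version
        "kind": "KuberhealthyCheck",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "runInterval": run_interval,  # how often the check runs
            "timeout": timeout,           # how long the check may take to report
            "podSpec": {
                "containers": [{"name": "main", "image": image}],
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, used only for illustration.
    manifest = build_khcheck(
        "example-check",
        "kuberhealthy",
        "registry.example.com/example-check:latest",
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be piped to `kubectl apply -f -` once the names and image are replaced with real values.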

Self-Hosting & Configuration

  • Deploy via Helm chart with default health checks included
  • Enable built-in checks for DNS, deployment, daemonset, and pod status
  • Create custom KuberhealthyCheck CRDs with your own check container images
  • Configure check intervals, timeouts, and alert thresholds per check
  • Scrape the /metrics endpoint with Prometheus for alerting integration
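As a rough illustration of what a scrape of /metrics yields, the sketch below parses Prometheus text exposition output and lists failing checks. It assumes a gauge named kuberhealthy_check with check, namespace, and error labels and a value of 1 for passing and 0 for failing; the metric name, labels, and sample lines are assumptions, so compare them with the actual output of your deployment.

```python
import re

# Illustrative scrape output; metric and label names are assumed.
SAMPLE_METRICS = '''\
# HELP kuberhealthy_check Shows the status of a Kuberhealthy check
# TYPE kuberhealthy_check gauge
kuberhealthy_check{check="kuberhealthy/daemonset",namespace="kuberhealthy",error=""} 1
kuberhealthy_check{check="kuberhealthy/dns-status-internal",namespace="kuberhealthy",error="timed out"} 0
'''

# One sample line: metric name, label set in braces, then the value.
SAMPLE_LINE = re.compile(r'^kuberhealthy_check\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
# One label pair: name="value", allowing backslash-escaped characters.
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def failed_checks(metrics_text):
    """Return (check, error) tuples for samples whose value is 0."""
    failures = []
    for line in metrics_text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if match is None:
            continue  # comments, blank lines, and other metrics
        labels = dict(LABEL_PAIR.findall(match.group("labels")))
        if float(match.group("value")) == 0.0:
            failures.append((labels.get("check", ""), labels.get("error", "")))
    return failures


if __name__ == "__main__":
    for check, error in failed_checks(SAMPLE_METRICS):
        print(f"FAILED {check}: {error}")
```

A Prometheus alerting rule would normally express the same condition as a query on the metric; the parser is only meant to show what the scraped data looks like.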

Key Features

  • Real pod-based checks validate cluster behavior from a workload perspective
  • Pre-built checks cover DNS, deployment lifecycle, pod restart, and network
  • Custom check framework lets you write checks in any language as a container
  • Prometheus metrics endpoint integrates with existing monitoring stacks
  • Namespace-scoped and cluster-scoped checks for multi-tenant monitoring

Comparison with Similar Tools

  • kube-state-metrics: reports the state of Kubernetes objects to Prometheus but does not perform active validation
  • Goldpinger: tests pod-to-pod network connectivity specifically, so its scope is narrower
  • Gatus: monitors external endpoints rather than running cluster-internal synthetic checks
  • Healthchecks: a cron job monitoring service, not Kubernetes-native synthetic testing

FAQ

Q: What built-in checks does Kuberhealthy include? A: Built-in checks include DNS resolution, deployment creation and deletion, daemonset scheduling, pod restart detection, and HTTP endpoint availability.

Q: Can I write custom health checks? A: Yes. A custom check is any container image that calls the Kuberhealthy API to report success or failure. You can write checks in Go, Python, Bash, or any language.
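A minimal sketch of such a custom check in Python follows. It assumes the checker pod receives the reporting URL in a KH_REPORTING_URL environment variable and a run identifier in KH_RUN_UUID, and that the API accepts a JSON body with "OK" and "Errors" fields plus a kh-run-uuid header; these names reflect my understanding of Kuberhealthy v2 and should be confirmed against the project's external check documentation. The run_check function is a hypothetical stand-in for real validation logic.

```python
import json
import os
import tempfile
import urllib.request


def build_report(errors):
    """Build the report body: OK is true only when there are no errors."""
    errors = [str(e) for e in errors]
    return {"OK": not errors, "Errors": errors}


def send_report(report, url, run_uuid, timeout=10.0):
    """POST the report to the reporting URL (header name is an assumption)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(report).encode("utf-8"),
        headers={"Content-Type": "application/json", "kh-run-uuid": run_uuid},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run_check():
    """Hypothetical validation: write a temp file and read it back."""
    errors = []
    try:
        with tempfile.TemporaryFile() as handle:
            handle.write(b"probe")
            handle.seek(0)
            if handle.read() != b"probe":
                errors.append("read-back mismatch")
    except OSError as exc:
        errors.append(f"temp file I/O failed: {exc}")
    return errors


def main():
    report = build_report(run_check())
    url = os.environ.get("KH_REPORTING_URL")      # assumed variable name
    run_uuid = os.environ.get("KH_RUN_UUID", "")  # assumed variable name
    if not url:
        # Outside a checker pod: print the report instead of sending it.
        print(json.dumps(report))
        return
    send_report(report, url, run_uuid)


if __name__ == "__main__":
    main()
```

Packaged into a container image, a script like this would be referenced from the check's pod spec. Reporting a failure with an error list, rather than exiting non-zero without reporting, keeps the reason visible in the check's stored result.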

Q: How does Kuberhealthy integrate with alerting? A: Kuberhealthy exposes check results as Prometheus metrics. Define Prometheus alerting rules on these metrics, and let Alertmanager route the resulting alerts when checks fail.

Q: Does running health checks consume significant cluster resources? A: Checks run as short-lived pods with configurable resource limits. The default checks use minimal resources and run infrequently, so the overhead is typically small; it grows with the number of checks and how often they run.
