Scripts · Jul 1, 2026 · 3 min read

Kuberhealthy — Synthetic Health Monitoring for Kubernetes

Kuberhealthy is a Kubernetes operator that runs synthetic health checks as pods to continuously validate cluster functionality, networking, DNS, storage, and application health from an end-user perspective.

Direct install command
npx -y tokrepo@latest install b2454203-7520-11f1-9bc6-00163e2b0d79 --target codex

Run this only after confirming the plan with a dry run.

Introduction

Kuberhealthy runs synthetic monitoring checks inside Kubernetes clusters to validate that the cluster and its components actually work from a workload perspective. Rather than just checking component status, Kuberhealthy launches real pods that exercise DNS resolution, create deployments, test network connectivity, and validate storage provisioning, surfacing failures before they affect users.

What Kuberhealthy Does

  • Runs health check pods on a schedule to validate cluster subsystems
  • Tests DNS resolution, deployment creation, pod scheduling, and network connectivity
  • Exposes check results via a Prometheus-compatible metrics endpoint
  • Provides a built-in status page showing check results across the cluster
  • Supports custom check images for application-specific health validation

Architecture Overview

Kuberhealthy runs as a Deployment that watches KuberhealthyCheck (khcheck) custom resources. Each check defines a pod spec with a container image, a run interval, and a timeout. On each interval, the operator creates a checker pod that runs the check container. The container performs its validation and reports success or failure back to the Kuberhealthy API. Results are stored as KuberhealthyState (khstate) custom resources and exposed as Prometheus metrics.
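
As a rough illustration of what a check resource carries, the sketch below builds a khcheck manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The API group `comcast.github.io/v1` and the field names `runInterval`, `timeout`, and `podSpec` are assumptions based on the Kuberhealthy v2 documentation and should be verified against the CRD installed in your cluster. The check name and image are hypothetical placeholders.

```python
import json


def build_khcheck(name, image, run_interval="5m", timeout="2m",
                  namespace="kuberhealthy"):
    """Return a KuberhealthyCheck manifest as a plain dict.

    Assumption: field names follow the Kuberhealthy v2 CRD; verify them
    against the version running in your cluster.
    """
    return {
        "apiVersion": "comcast.github.io/v1",  # assumed v2-era API group
        "kind": "KuberhealthyCheck",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "runInterval": run_interval,  # how often the check runs
            "timeout": timeout,           # how long one run may take
            "podSpec": {
                "containers": [
                    # The container that performs the validation.
                    {"name": "main", "image": image},
                ]
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical check name and image, for illustration only.
    manifest = build_khcheck(
        "example-check", "registry.example.com/example-check:latest"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `kubectl apply -f -`. Building the manifest in code is only a convenience here; a hand-written YAML file with the same fields would serve equally well.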

Self-Hosting & Configuration

  • Deploy via Helm chart with default health checks included
  • Enable built-in checks for DNS, deployment, daemonset, and pod status
  • Create custom KuberhealthyCheck resources with your own check container images
  • Configure check intervals, timeouts, and alert thresholds per check
  • Scrape the /metrics endpoint with Prometheus for alerting integration

Key Features

  • Real pod-based checks validate cluster behavior from a workload perspective
  • Pre-built checks cover DNS, deployment lifecycle, pod restart, and network
  • Custom check framework lets you write checks in any language as a container
  • Prometheus metrics endpoint integrates with existing monitoring stacks
  • Namespace-scoped and cluster-scoped checks for multi-tenant monitoring

Comparison with Similar Tools

  • Prometheus kube-state-metrics — reports Kubernetes object states but does not perform active validation
  • Goldpinger — specifically tests pod-to-pod network connectivity, narrower scope
  • Gatus — external endpoint monitoring, not cluster-internal synthetic checks
  • Healthchecks — cron job monitoring service, not Kubernetes-native synthetic testing

FAQ

Q: What built-in checks does Kuberhealthy include? A: Built-in checks include DNS resolution, deployment creation and deletion, daemonset scheduling, pod restart detection, and HTTP endpoint availability.

Q: Can I write custom health checks? A: Yes. A custom check is any container image that calls the Kuberhealthy API to report success or failure. You can write checks in Go, Python, Bash, or any language.
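
A minimal sketch of such a custom check in Python, using only the standard library, might look like the following. It assumes the reporting contract described in the Kuberhealthy v2 documentation: the checker pod receives the report URL in the `KH_REPORTING_URL` environment variable and a run identifier in `KH_RUN_UUID`, and it POSTs a JSON body with `OK` and `Errors` fields. Treat those names as assumptions to confirm for your version. The `run_check` body is a hypothetical stand-in for real validation logic.

```python
import json
import os
import urllib.request


def build_report(errors):
    """Build the report body: OK is true only when there are no errors."""
    errors = [str(e) for e in errors]
    return {"OK": len(errors) == 0, "Errors": errors}


def build_request(report, reporting_url, run_uuid=""):
    """Wrap the report in an HTTP POST request without sending it."""
    body = json.dumps(report).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if run_uuid:
        # Assumed header name for identifying the check run.
        headers["kh-run-uuid"] = run_uuid
    return urllib.request.Request(
        reporting_url, data=body, headers=headers, method="POST"
    )


def run_check():
    """Hypothetical validation: return a list of error strings."""
    errors = []
    if not os.path.isdir("/tmp"):
        errors.append("/tmp is missing")
    return errors


def main():
    url = os.environ.get("KH_REPORTING_URL")  # assumed to be injected
    if not url:
        print("KH_REPORTING_URL is not set; not running as a check pod")
        return
    request = build_request(
        build_report(run_check()), url, os.environ.get("KH_RUN_UUID", "")
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        print("report sent, HTTP status", response.status)


if __name__ == "__main__":
    main()
```

Keeping report construction separate from sending makes the check logic testable outside a cluster. Packaged into a container image, a script like this could be referenced from a KuberhealthyCheck resource.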

Q: How does Kuberhealthy integrate with alerting? A: Kuberhealthy exposes check results as Prometheus metrics. Define Prometheus alerting rules on these metrics and route the resulting alerts through Alertmanager to notify you when checks fail.
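
To make the metric-based approach concrete, here is a small sketch that scans Prometheus exposition text for failing checks. It assumes a per-check gauge named `kuberhealthy_check` with a `check` label, where 1 means passing and 0 means failing. Confirm the metric and label names against your own /metrics output. The sample text is invented for illustration and is not real Kuberhealthy output.

```python
import re

# Assumption: a per-check gauge named kuberhealthy_check (1 = pass, 0 = fail).
SAMPLE_RE = re.compile(
    r'^kuberhealthy_check\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)(?:\s+\d+)?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def failing_checks(metrics_text):
    """Return the `check` label of every kuberhealthy_check sample equal to 0."""
    failing = []
    for line in metrics_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comments, blank lines, and other metrics
        if float(match.group("value")) == 0:
            labels = dict(LABEL_RE.findall(match.group("labels")))
            failing.append(labels.get("check", "<unknown>"))
    return failing


# Invented sample in Prometheus text format, for illustration only.
SAMPLE = """\
# TYPE kuberhealthy_check gauge
kuberhealthy_check{check="kuberhealthy/dns-status-internal",namespace="kuberhealthy"} 1
kuberhealthy_check{check="kuberhealthy/deployment",namespace="kuberhealthy"} 0
kuberhealthy_check_duration_seconds{check="kuberhealthy/deployment"} 12.5
"""

if __name__ == "__main__":
    print(failing_checks(SAMPLE))
```

Under the same assumption, the equivalent Prometheus alerting expression would be along the lines of `kuberhealthy_check == 0`, usually combined with a `for:` duration so that a single failed run does not page anyone.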

Q: Does running health checks consume significant cluster resources? A: Checks run as short-lived pods with configurable resource requests and limits. The default checks need little CPU and memory and run at intervals of minutes rather than seconds, so the overhead is typically small relative to the cluster's capacity.
