Skills · May 12, 2026 · 2 min read

APO — LLM-Powered Observability Workflows

APO (AutoPilot Observability) combines OpenTelemetry and eBPF with agentic workflows to triage alerts and find root causes with verifiable charts.

Agent ready

Ready-to-run agent install

To install this asset, the agent chooses its runtime, checks the install plan, and then runs the matching command.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: Established
Entrypoint: Asset
Direct install command
npx -y tokrepo@latest install 2e546399-3f6d-5d30-a01f-d81be8165e9c --target codex

Run after dry-run confirms the install plan.

Intro

  • Best for: SRE/Platform teams prototyping AI-assisted incident triage with real telemetry evidence
  • Works with: OpenTelemetry data sources; eBPF-based tracing (per README); Docker for quick runs
  • Setup time: 20–60 minutes

Practical Notes

  • GitHub: 380 stars · 54 forks; pushed 2026-03-25 (verified via GitHub API).
  • Repo ships a multi-stage backend/Dockerfile that builds apo-backend and includes SQLite + static assets.
  • Kubernetes manifests exist in backend/deploy/apo-backend-deploy.yml, useful for cluster trials.

Main

How to keep “LLM observability” grounded:

  1. Start from one workflow (alert validity triage → RCA) and force the agent to cite the specific charts/metrics it used.
  2. Keep a deterministic query layer (OTel + logs/metrics/traces) and have the agent generate queries, not conclusions.
  3. Store the workflow runs as incident artifacts so you can replay and compare outcomes across deployments.
  4. Introduce guardrails: block actions that mutate production without explicit approval and a rollback plan.
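
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not APO's actual API: every name here (`Evidence`, `Finding`, `Action`, `validate_finding`, `is_action_allowed`, `run_artifact`) is hypothetical, and the checks stand in for whatever evidence and approval rules your team adopts. It rejects findings that cite no query or chart (step 1), fingerprints a run so it can be stored and compared later (step 3), and blocks production-mutating actions that lack an approver or a rollback plan (step 4).

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Evidence:
    """One query the agent generated, plus the chart it produced (hypothetical shape)."""
    query: str      # e.g. a metrics or trace query written by the agent
    chart_ref: str  # assumed identifier or link for the rendered chart


@dataclass
class Finding:
    claim: str
    evidence: list[Evidence] = field(default_factory=list)


@dataclass
class Action:
    description: str
    mutates_production: bool
    approved_by: str | None = None
    rollback_plan: str | None = None


def validate_finding(finding: Finding) -> list[str]:
    """Return reasons to reject a finding; an empty list means it cites evidence."""
    problems: list[str] = []
    if not finding.evidence:
        problems.append("no evidence cited")
    for ev in finding.evidence:
        if not ev.query.strip():
            problems.append("evidence has an empty query")
        if not ev.chart_ref.strip():
            problems.append("evidence has no chart reference")
    return problems


def is_action_allowed(action: Action) -> bool:
    """Read-only actions pass; mutating ones need an approver and a rollback plan."""
    if not action.mutates_production:
        return True
    return bool(action.approved_by) and bool(action.rollback_plan)


def run_artifact(alert_id: str, findings: list[Finding]) -> dict:
    """Build a storable record of one workflow run with a stable content digest."""
    payload = {"alert_id": alert_id, "findings": [asdict(f) for f in findings]}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return {**payload, "digest": hashlib.sha256(encoded).hexdigest()}
```

Two runs that produce identical findings for the same alert yield the same digest, which gives a cheap way to spot when a replay diverges after a deployment. Keeping the agent's output as queries plus chart references, rather than free-text conclusions, is what makes these checks mechanical.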

The goal is not to replace dashboards—it is to turn “what should I look at next?” into a repeatable runbook.

FAQ

Q: Does this replace OpenTelemetry?
A: No. It builds on OpenTelemetry and enriches it with eBPF data and workflow automation.

Q: Can I run it locally first?
A: Yes. The repo includes a Dockerfile for apo-backend, so you can trial it quickly.

Q: How do I reduce hallucinations?
A: Require evidence: charts, query outputs, and explicit links to the metrics and traces used.

Source & Thanks

Source: https://github.com/CloudDetail/apo
License: Apache-2.0
GitHub stars: 380 · forks: 54
