Practical Notes
- GitHub: 380 stars · 54 forks; pushed 2026-03-25 (verified via GitHub API).
- Repo ships a multi-stage backend/Dockerfile that builds apo-backend and includes SQLite + static assets.
- Kubernetes manifests exist in backend/deploy/apo-backend-deploy.yml, useful for cluster trials.
Main
How to keep “LLM observability” grounded:
- Start from one workflow (alert validity triage → RCA) and force the agent to cite the specific charts/metrics it used.
- Keep a deterministic query layer (OTel + logs/metrics/traces) and have the agent generate queries, not conclusions.
- Store the workflow runs as incident artifacts so you can replay and compare outcomes across deployments.
- Introduce guardrails: block actions that mutate production without explicit approval and a rollback plan.
The goal is not to replace dashboards—it is to turn “what should I look at next?” into a repeatable runbook.
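The guardrail and run-artifact ideas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not APO's actual API: the `Step` fields, the `READ_ONLY_KINDS` allowlist, and the `check_step` and `record_run` helpers are all hypothetical names introduced here.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import List, Optional

# Hypothetical allowlist: step kinds that only read telemetry and never mutate production.
READ_ONLY_KINDS = {"metrics", "logs", "traces"}


@dataclass
class Step:
    kind: str    # "metrics", "logs", "traces", or a mutating action such as "restart"
    query: str   # the query or action text proposed by the agent
    evidence: List[str] = field(default_factory=list)  # charts/metrics the agent cites
    approved_by: Optional[str] = None    # who approved a mutating step
    rollback_plan: Optional[str] = None  # how to undo a mutating step


def check_step(step: Step) -> None:
    """Raise ValueError if a mutating step lacks approval or a rollback plan."""
    if step.kind in READ_ONLY_KINDS:
        return
    if not (step.approved_by and step.rollback_plan):
        raise ValueError(
            f"mutating step {step.kind!r} needs explicit approval and a rollback plan"
        )


def record_run(workflow: str, steps: List[Step]) -> str:
    """Validate every step, then serialize the run as a JSON incident artifact."""
    for step in steps:
        check_step(step)
    run = {
        "workflow": workflow,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "steps": [asdict(s) for s in steps],
    }
    return json.dumps(run, indent=2, sort_keys=True)
```

Writing the returned JSON next to the incident ticket gives you something to replay later: the same queries can be re-run after a deployment and the outputs compared. A read-only step passes through unchanged, while a step such as `Step(kind="restart", query="restart checkout-service")` is rejected until both `approved_by` and `rollback_plan` are filled in.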
FAQ
Q: Does this replace OpenTelemetry?
A: No; it builds on OpenTelemetry and enriches it with eBPF data and workflow automation.
Q: Can I run it locally first?
A: Yes—the repo includes a Dockerfile for apo-backend so you can trial quickly.
Q: How do I reduce hallucinations?
A: Require evidence: charts, query outputs, and explicit links to metrics/traces used.
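One way to enforce the evidence requirement is to reject any agent finding that arrives without it. The sketch below assumes a finding is a plain dictionary; the field names `charts`, `query_outputs`, and `links` are hypothetical and simply mirror the three kinds of evidence named in the answer above.

```python
from typing import Dict, List

# Hypothetical field names, one per kind of evidence the agent must supply.
REQUIRED_EVIDENCE = ("charts", "query_outputs", "links")


def missing_evidence(finding: Dict) -> List[str]:
    """Return the required evidence fields that are absent or empty."""
    return [name for name in REQUIRED_EVIDENCE if not finding.get(name)]


def accept_finding(finding: Dict) -> bool:
    """Accept a finding only when every required evidence field is populated."""
    return not missing_evidence(finding)
```

A finding that carries only a conclusion is rejected, and the list returned by `missing_evidence` can be fed back to the agent as a prompt for what to supply next.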