# APO: LLM-Powered Observability Workflows

> APO (AutoPilot Observability) combines OpenTelemetry and eBPF with agentic workflows to triage alerts and find root causes with verifiable charts.

## Quick Use

```bash
git clone https://github.com/CloudDetail/apo.git
cd apo/backend
docker build -t apo-backend .
docker run --rm -p 8080:8080 apo-backend
```

## Intro

- **Best for:** SRE/Platform teams prototyping AI-assisted incident triage with real telemetry evidence
- **Works with:** OpenTelemetry data sources; eBPF-based tracing (per README); Docker for quick runs
- **Setup time:** 20–60 minutes

## Practical Notes

- GitHub: 380 stars · 54 forks; last pushed 2026-03-25 (verified via GitHub API).
- The repo ships a multi-stage `backend/Dockerfile` that builds `apo-backend` and includes SQLite and static assets.
- Kubernetes manifests live in `backend/deploy/apo-backend-deploy.yml`, which is useful for cluster trials.

## Main

How to keep “LLM observability” grounded:

1. Start from **one workflow** (alert validity triage → RCA) and force the agent to cite the specific charts/metrics it used.
2. Keep a **deterministic query layer** (OTel + logs/metrics/traces) and have the agent generate queries, not conclusions.
3. Store the workflow runs as incident artifacts so you can replay and compare outcomes across deployments.
4. Introduce guardrails: block actions that mutate production without explicit approval and a rollback plan.

The goal is not to replace dashboards; it is to turn “what should I look at next?” into a repeatable runbook.

### FAQ

**Q: Does this replace OpenTelemetry?**
A: No. It builds on OpenTelemetry and enriches it with eBPF data and workflow automation.

**Q: Can I run it locally first?**
A: Yes. The repo includes a Dockerfile for `apo-backend`, so you can trial it quickly.

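**Q: How do I reduce hallucinations?**
A: Require evidence: charts, query outputs, and explicit links to the metrics/traces used.

## Source & Thanks

> Source: https://github.com/CloudDetail/apo
> License: Apache-2.0
> GitHub stars: 380 · forks: 54

---

## Intro

APO (AutoPilot Observability) brings OpenTelemetry and eBPF data together on a single platform and uses agentic workflows for alert triage and root cause analysis, emphasizing “visualized evidence” to lower the risk of hallucination.

- **Best for:** SRE/Platform teams using real telemetry evidence for AI-assisted troubleshooting
- **Works with:** OpenTelemetry data sources; eBPF tracing (see README); Docker for quick runs
- **Setup time:** 20–60 minutes

## Practical Notes

- GitHub: 380 stars · 54 forks; last updated 2026-03-25 (verified via GitHub API).
- The repo provides a multi-stage `backend/Dockerfile` that builds `apo-backend` and bundles SQLite and static assets.
- `backend/deploy/apo-backend-deploy.yml` provides Kubernetes deployment manifests, suitable for cluster trial runs.

## Main

Ways to make “LLM observability” more reliable:

1. Start by landing **one workflow** (alert validity → root cause) and require every conclusion to cite specific chart/metric evidence.
2. Keep a **reproducible query layer** (OTel + metrics/logs/traces) and have the agent generate queries rather than output conclusions directly.
3. Save each workflow run as an attachment to the incident ticket, so it can be reviewed afterward and compared across environments.
4. Add guardrails: any action that changes production must be explicitly approved, with a rollback plan written in advance.

The goal is not to replace dashboards but to turn “what to look at next” into a repeatable runbook.

### FAQ

**Q: Will it replace OpenTelemetry?**
A: No. It is built on OpenTelemetry and strengthens the troubleshooting process with eBPF and workflow automation.

**Q: Can I get it running locally first?**
A: Yes. The repo provides a Dockerfile for `apo-backend`, which suits a quick trial.

**Q: How do I reduce hallucinations?**
A: Enforce an evidence chain: charts, query results, and the metric/trace snippets that were cited.

---

Source: https://tokrepo.com/en/workflows/apo-llm-powered-observability-workflows
Author: Script Depot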
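The evidence and guardrail rules from the Main section can be sketched as a small review gate. This is a minimal illustration, not APO's actual API: the `Evidence` and `Finding` types, their fields, and the `review` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    """One query the agent ran and where its output is stored (hypothetical shape)."""
    query: str       # the query text the agent generated
    result_ref: str  # ID or link of the stored chart / query output


@dataclass
class Finding:
    """A conclusion proposed by the agent, plus the evidence it cites (hypothetical shape)."""
    claim: str
    evidence: List[Evidence] = field(default_factory=list)
    mutates_production: bool = False
    approved_by: Optional[str] = None
    rollback_plan: Optional[str] = None


def review(finding: Finding) -> List[str]:
    """Return the reasons a finding is rejected; an empty list means it passes."""
    problems = []
    # Rule 1: every conclusion must cite at least one piece of evidence.
    if not finding.evidence:
        problems.append("no evidence cited")
    # Each cited item needs both the query and a reference to its output.
    for ev in finding.evidence:
        if not ev.query.strip() or not ev.result_ref.strip():
            problems.append(f"incomplete evidence: {ev!r}")
    # Rule 4: production changes need explicit approval and a rollback plan.
    if finding.mutates_production:
        if not finding.approved_by:
            problems.append("production change lacks explicit approval")
        if not finding.rollback_plan:
            problems.append("production change lacks a rollback plan")
    return problems


if __name__ == "__main__":
    # Illustrative values only; the query text and reference are made up.
    grounded = Finding(
        claim="latency rose after the latest deploy",
        evidence=[Evidence(query="p99 latency by service", result_ref="chart-1")],
    )
    print(review(grounded))
    print(review(Finding(claim="the database is the bottleneck")))
```

Returning a list of reasons rather than a boolean keeps the rejection itself reviewable, which fits the idea of storing workflow runs as incident artifacts.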