[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"pack-detail-deploy-monitor-observability-es":3,"seo:pack:deploy-monitor-observability:es":100},{"code":4,"message":5,"data":6},200,"Operation successful",{"pack":7},{"slug":8,"icon":9,"tone":10,"status":11,"status_label":12,"title":13,"description":14,"items":15,"install_cmd":99},"deploy-monitor-observability","📡","#0891B2","new","New · this week","Deploy + Monitor + Observability Stack","Ten picks for developers who ship code to production: deploy targets (Vercel \u002F Kamal \u002F Coolify), error tracking, OpenTelemetry, metrics, logs, dashboards, uptime, and alerts, chained in a deliberate order so you actually catch the next outage.",[16,28,38,46,53,60,67,75,82,92],{"id":17,"uuid":18,"slug":19,"title":20,"description":21,"author_name":22,"view_count":23,"vote_count":24,"lang_type":25,"type":26,"type_label":27},3033,"2d5d7b20-25a2-4f99-bb2e-827672d613dd","vercel-cli-preview-deployments-from-terminal","Vercel CLI — Preview Deployments from Terminal","Vercel CLI runs dev servers, pulls project env, and creates preview or production deployments from the terminal. 
Useful for agent-built web changes.","Vercel",84,0,"en","script","Script",{"id":29,"uuid":30,"slug":31,"title":32,"description":33,"author_name":34,"view_count":35,"vote_count":24,"lang_type":25,"type":36,"type_label":37},1443,"5211d45c-3908-11f1-9bc6-00163e2b0d79","kamal-zero-downtime-docker-deploys-any-server-5211d45c","Kamal — Zero-Downtime Docker Deploys to Any Server","Kamal is Basecamp's deploy tool that ships Docker containers to bare metal or cloud VMs with a single command, giving you Heroku-like workflows on servers you actually own.","Script Depot",122,"skill","Skill",{"id":39,"uuid":40,"slug":41,"title":42,"description":43,"author_name":44,"view_count":45,"vote_count":24,"lang_type":25,"type":36,"type_label":37},464,"202dfab1-6823-4fb2-a585-8af913d55af3","coolify-self-hosted-vercel-netlify-alternative-202dfab1","Coolify — Self-Hosted Vercel & Netlify Alternative","Deploy apps, databases, and services on your own server with one click. No vendor lock-in. 52K+ GitHub stars.","AI Open Source",152,{"id":47,"uuid":48,"slug":49,"title":50,"description":51,"author_name":44,"view_count":52,"vote_count":24,"lang_type":25,"type":36,"type_label":37},945,"ece57add-34d8-11f1-9bc6-00163e2b0d79","sentry-open-source-error-tracking-performance-monitoring-ece57add","Sentry — Open Source Error Tracking & Performance Monitoring","Sentry is the developer-first error tracking and performance monitoring platform. 
Capture exceptions, trace performance issues, and debug production errors across all languages.",176,{"id":54,"uuid":55,"slug":56,"title":57,"description":58,"author_name":44,"view_count":59,"vote_count":24,"lang_type":25,"type":36,"type_label":37},1472,"1e161adc-3929-11f1-9bc6-00163e2b0d79","opentelemetry-collector-vendor-neutral-telemetry-pipeline-1e161adc","OpenTelemetry Collector — Vendor-Neutral Telemetry Pipeline","The OpenTelemetry Collector is the CNCF-graduated pipeline for receiving, processing, and exporting metrics, logs, and traces across any observability backend, replacing per-vendor agents with one portable binary.",130,{"id":61,"uuid":62,"slug":63,"title":64,"description":65,"author_name":44,"view_count":66,"vote_count":24,"lang_type":25,"type":36,"type_label":37},916,"ed3a8de4-34ae-11f1-9bc6-00163e2b0d79","prometheus-open-source-monitoring-alerting-toolkit-ed3a8de4","Prometheus — Open Source Monitoring & Alerting Toolkit","Prometheus is the CNCF-graduated monitoring system and time series database. Pull-based metrics collection, powerful PromQL queries, and built-in alerting for cloud-native infrastructure.",135,{"id":68,"uuid":69,"slug":70,"title":71,"description":72,"author_name":73,"view_count":74,"vote_count":24,"lang_type":25,"type":36,"type_label":37},958,"92fa7c1f-352f-11f1-9bc6-00163e2b0d79","grafana-loki-prometheus-inspired-log-aggregation-system-92fa7c1f","Grafana Loki — Prometheus-Inspired Log Aggregation System","Loki is a horizontally scalable, multi-tenant log aggregation system by Grafana Labs. 
Unlike other log systems, Loki indexes metadata about logs, not log content itself.","Grafana Labs",210,{"id":76,"uuid":77,"slug":78,"title":79,"description":80,"author_name":73,"view_count":81,"vote_count":24,"lang_type":25,"type":36,"type_label":37},915,"ed1a524f-34ae-11f1-9bc6-00163e2b0d79","grafana-open-source-data-visualization-observability-ed1a524f","Grafana — Open Source Data Visualization & Observability","Grafana is the leading open-source platform for monitoring and observability. Visualize metrics, logs, and traces from Prometheus, Loki, Elasticsearch, and 100+ data sources.",193,{"id":83,"uuid":84,"slug":85,"title":86,"description":87,"author_name":88,"view_count":89,"vote_count":24,"lang_type":25,"type":90,"type_label":91},465,"88e260be-dfd0-46b6-883f-21141a8c2f23","uptime-kuma-self-hosted-uptime-monitoring-88e260be","Uptime Kuma — Self-Hosted Uptime Monitoring","Monitor HTTP, TCP, DNS, Docker services with notifications to 90+ channels. Beautiful dashboard. 84K+ GitHub stars.","MCP Hub",171,"mcp","MCP",{"id":93,"uuid":94,"slug":95,"title":96,"description":97,"author_name":34,"view_count":98,"vote_count":24,"lang_type":25,"type":36,"type_label":37},2026,"51f92d7e-3f31-11f1-9bc6-00163e2b0d79","prometheus-alertmanager-alert-routing-notification-hub-51f92d7e","Prometheus Alertmanager — Alert Routing and Notification Hub","Alertmanager handles alerts sent by Prometheus, deduplicating, grouping, and routing them to the right notification channel such as email, Slack, PagerDuty, or webhooks.",133,"tokrepo install pack\u002Fdeploy-monitor-observability",{"pageType":101,"pageKey":8,"locale":25,"title":102,"metaDescription":103,"h1":104,"tldr":105,"bodyMarkdown":106,"faq":107,"schema":123,"internalLinks":128,"citations":141,"wordCount":154,"generatedAt":155},"pack","Deploy + Monitor + Observability Stack — 10 Picks for Shipping to Prod","Vercel CLI, Kamal, Coolify, Sentry, OpenTelemetry Collector, Prometheus, Loki, Grafana, Uptime Kuma, Alertmanager — a 
deliberate stack that wires deploy → errors → traces → metrics → logs → uptime → alerts → dashboards. Open-source first. Install via TokRepo.","Deploy + Monitor + Observability Stack","Ten picks that take you from `git push` to a pager that actually fires when prod breaks. Three deploy targets (PaaS, container, self-hosted), Sentry for errors, OpenTelemetry for traces, Prometheus + Loki for metrics and logs, Grafana for the wall display, Uptime Kuma for the heartbeat, and Alertmanager for the page. Open-source-first, with hosted equivalents mentioned where they earn their price.","## What's in this pack\n\nThis is the stack a working backend engineer would assemble the *week before* their app gets real users — not the heroic post-outage scramble. Every pick here is **open-source-first**, **runs on a $20 VPS or smaller**, and **plugs into the next tool in the chain**. The order matters: each layer feeds the next.\n\n| # | Pick | Layer | What it does |\n|---|---|---|---|\n| 1 | Vercel CLI | deploy (PaaS) | preview URL on every `git push`, zero config for Next\u002FNuxt\u002FAstro |\n| 2 | Kamal | deploy (container) | zero-downtime Docker deploys to any bare VPS — Basecamp's tool |\n| 3 | Coolify | deploy (self-hosted PaaS) | open-source Vercel\u002FHeroku replacement for your own server |\n| 4 | Sentry | errors + APM | exception capture, release health, performance traces |\n| 5 | OpenTelemetry Collector | telemetry pipeline | vendor-neutral fan-in for traces, metrics, logs |\n| 6 | Prometheus | metrics | pull-based time-series DB, the industry default |\n| 7 | Grafana Loki | logs | log aggregation that thinks like Prometheus — cheap, indexed by label |\n| 8 | Grafana | dashboards | the wall display every other tool plugs into |\n| 9 | Uptime Kuma | uptime + status page | self-hosted heartbeat that pages you when the site dies |\n| 10 | Prometheus Alertmanager | alert routing | dedupe, group, route alerts to PagerDuty \u002F Slack \u002F email |\n\n## Install in this order (deploy → 
errors → traces → metrics → logs → uptime → alerts → dashboards)\n\nThe order is deliberate. **Don't install dashboards first.** Empty dashboards teach you nothing. Wire the data sources first; the dashboard is the last 10% of the work.\n\n1. **Pick one deploy target.** Vercel CLI if you're shipping a JS framework and want preview URLs on every PR. Kamal if you've outgrown Heroku-style pricing and want to own the box. Coolify if you want the Vercel UX on your own hardware. Pick one. Skip the other two.\n2. **Sentry next.** Errors are the single highest-signal telemetry you'll add. Five lines of SDK init and you start catching exceptions you didn't know existed. Set up release tracking from day one so you can answer \"did this start with the last deploy?\"\n3. **OpenTelemetry Collector.** Don't lock yourself to one vendor's SDK. The Collector is a single Go binary that receives OTLP from your app and fans out to Sentry, Prometheus, Loki, or anything else. Configure it once, swap backends without touching app code.\n4. **Prometheus for metrics.** Scrape `\u002Fmetrics` from your app, your Node Exporter, your database exporters. The four golden signals — latency, traffic, errors, saturation — go here.\n5. **Loki for logs.** If you already use Prometheus, Loki is the obvious log store: same label model, a PromQL-flavored query language, runs on the same VM. Don't index every JSON field; index by service, env, level — let `LogQL` filter the rest.\n6. **Uptime Kuma for the heartbeat.** External-perspective ping. Catches the outages your internal stack can't see (DNS, TLS cert, CDN). Public status page included.\n7. **Alertmanager wired to Prometheus.** Alerts should fire on symptoms (p95 latency > 2s, error rate > 1%), not causes (CPU > 80%). Route P1 to pager, P2 to Slack, P3 to a daily digest.\n8. 
**Grafana last.** Now that data is flowing, build *three* dashboards: one for the on-call engineer (latency, error rate, recent deploys), one for the product owner (signups, conversions, cost per user), one for the exec (uptime %, MAU, week-over-week). Generic dashboards get ignored.\n\n## Tradeoffs you'll hit\n\n- **Vercel vs Kamal vs Coolify** — Vercel = zero-ops, scales to zero, gets expensive at scale and you don't own the stack. Kamal = own the box, Docker is the only abstraction, cheap and predictable. Coolify = the middle ground; self-hosted UI on top of Docker. Many teams ship the MVP on Vercel and migrate to Kamal\u002FCoolify when the bill approaches $500\u002Fmo.\n- **Sentry SaaS vs self-hosted** — Self-hosted Sentry needs roughly half a dozen backing services (Kafka, Postgres, Redis, and ClickHouse among them). For under 100k events\u002Fmonth, the SaaS free tier or an entry-level plan is genuinely cheaper than your time. Self-host only when you've outgrown those and have ops bandwidth.\n- **Prometheus + Loki + Grafana vs Datadog** — Datadog is the polished hosted incumbent. The open stack costs ~$20\u002Fmo in VPS instead of the $300+\u002Fmo a comparable Datadog setup often runs. Tradeoff: you babysit the stack. Below ~10 services, open-source wins on cost and lock-in; above ~50, Datadog's ergonomics start to matter.\n- **Push vs pull metrics** — Prometheus is pull (it scrapes you). If you run serverless or short-lived jobs, pull doesn't work — use a Pushgateway, or switch to OpenTelemetry push to a Collector. Don't fight the model.\n\n## Common pitfalls\n\n- **Alerting on causes, not symptoms.** \"CPU > 80%\" pages you at 3am for a workload that's fine. \"User-facing p95 > 2s\" pages you only when it matters. Tune for symptoms; investigate causes after waking up.\n- **No release annotation in Grafana.** Many incidents start \"right after the deploy.\" Wire your deploy script to POST a Grafana annotation on every release. 
The marker on the timeline can save 20 minutes per incident.\n- **Indexing every log field.** Loki's whole point is that it doesn't. If you add 50 labels per log line, cardinality explodes and the cheap log store becomes expensive. Index by service, env, level — grep the rest.\n- **One alert channel for everything.** P1 (site down) → phone. P2 (degraded) → Slack with @channel. P3 (anomaly) → daily digest. Mix them and either you ignore the pager or you ignore the digest. Both fail.\n- **No external uptime check.** Your internal Prometheus thinks the service is up. Cloudflare or your CDN is dropping 30% of requests in `eu-west`. Uptime Kuma from a different network catches this. Five minutes to set up.",[108,111,114,117,120],{"q":109,"a":110},"Do I really need all ten of these? It looks like a lot.","You need one from each *layer*, not all ten. The pack lists alternatives within layers (three deploy targets, two metric paths via Prometheus or OTel) — pick the one that fits your scale. The minimum viable stack for a one-person indie project is: Vercel CLI + Sentry + Uptime Kuma. Add Prometheus + Grafana + Alertmanager when you have a second engineer. Add Loki + OpenTelemetry Collector when you're past 10 services. Don't install ahead of need.",{"q":112,"a":113},"What's the realistic monthly cost for this whole stack?","For a small team: Vercel free or $20\u002Fmo, Sentry free tier (5k errors\u002Fmo) or $26\u002Fmo, then a single $5-20 VPS to host Prometheus + Loki + Grafana + Uptime Kuma + Alertmanager together (they're all light on RAM). Total: $25-60\u002Fmo for production observability that catches real outages. Compare to Datadog at $15-31 per host per month, often $300+\u002Fmo for the same coverage.",{"q":115,"a":116},"How does this overlap with the LLM Observability pack?","LLM Observability (Langfuse, Phoenix, AgentOps) is the *application-semantic* layer — prompt traces, token costs, eval scores. 
This Deploy + Monitor + Observability pack is the *infrastructure* layer — is the container alive, is the HTTP p95 acceptable, did the deploy spike the error rate. You want both. The OpenTelemetry Collector in this pack can ingest LLM traces from Langfuse\u002FPhoenix and forward them alongside infra metrics, so on-call sees both on one Grafana dashboard.",{"q":118,"a":119},"Why Kamal over Docker Swarm or Nomad?","Kamal is opinionated to the point of being boring, which is what you want for deploys. It only does zero-downtime container rollouts and proxy-based routing (Traefik in Kamal 1, kamal-proxy in Kamal 2) — no scheduler, no service mesh, no YAML cathedral. For 1-10 servers it's the simplest thing that works. Swarm is in maintenance mode; Nomad is great but the operational footprint is larger than a small team needs. Reach for k8s only when you have someone whose full-time job is k8s.",{"q":121,"a":122},"Can I use this stack with a serverless backend (AWS Lambda, Cloudflare Workers)?","Yes, but the scrape model breaks. For serverless, use OpenTelemetry SDKs that *push* traces and metrics to the OpenTelemetry Collector via OTLP. The Collector then writes to Prometheus (via remote_write) and Loki, and everything else in the pack works unchanged. 
Uptime Kuma still pings the public URL, Sentry's SDK works in Lambda\u002FWorkers runtimes, and Grafana dashboards don't care where the data came from.",{"@context":124,"@type":125,"name":104,"description":126,"numberOfItems":127,"inLanguage":25},"https:\u002F\u002Fschema.org","ItemList","Ten open-source-first picks that take a dev from git push to a working alert pipeline: deploy, traces, logs, metrics, uptime, alerts, dashboards.",10,[129,133,137],{"url":130,"anchor":131,"reason":132},"\u002Fen\u002Fpacks\u002Fllm-observability","LLM Observability pack","Application-semantic counterpart for teams shipping LLM features alongside this infra stack",{"url":134,"anchor":135,"reason":136},"\u002Fen\u002Fai-tools-for\u002Fdevops","DevOps tools curated for AI agents","Broader catalog of agent-friendly DevOps assets — Kamal, Coolify, k8s tooling",{"url":138,"anchor":139,"reason":140},"\u002Fen\u002Ftopics","Browse other topic packs","More opinionated stacks: backend AI toolkit, frontend AI toolkit, AI second brain",[142,146,150],{"claim":143,"source_name":144,"source_url":145},"Kamal performs zero-downtime Docker deploys to any server","Kamal — Deploy Docker apps anywhere","https:\u002F\u002Fkamal-deploy.org\u002F",{"claim":147,"source_name":148,"source_url":149},"OpenTelemetry Collector is a vendor-neutral implementation for receiving, processing and exporting telemetry data","OpenTelemetry Collector docs","https:\u002F\u002Fopentelemetry.io\u002Fdocs\u002Fcollector\u002F",{"claim":151,"source_name":152,"source_url":153},"Prometheus Alertmanager handles deduplication, grouping and routing of alerts","Prometheus Alertmanager docs","https:\u002F\u002Fprometheus.io\u002Fdocs\u002Falerting\u002Flatest\u002Falertmanager\u002F",905,"2026-05-22T10:00:00Z"]