Configs · Apr 10, 2026 · 3 min read

Prometheus — Open Source Monitoring & Alerting Toolkit

Prometheus is the CNCF-graduated monitoring system and time series database. Pull-based metrics collection, powerful PromQL queries, and built-in alerting for cloud-native infrastructure.

TL;DR
Prometheus scrapes metrics from targets, stores them as time series, and provides PromQL for querying and alerting on infrastructure health.
§01

What it is

Prometheus is an open-source monitoring system and time series database, originally built at SoundCloud and now a CNCF-graduated project. It uses a pull-based model to scrape metrics from instrumented targets at configured intervals, stores them locally, and provides PromQL -- a powerful query language for aggregation, filtering, and alerting.
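
To make the pull model concrete, here is a minimal sketch of what an instrumented target serves when Prometheus scrapes it, written in Python with only the standard library. The helper `render_counter`, the metric name `demo_requests_total`, the in-memory `SAMPLES` dict, and port 8000 are all assumptions for illustration; a real service would normally use an official client library instead of formatting the text by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the counter value.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical in-memory state; a real service would update it while handling work.
SAMPLES = {(("method", "GET"), ("code", "200")): 42}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_counter(
            "demo_requests_total", "Requests handled.", SAMPLES
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics for Prometheus to scrape."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and adding `localhost:8000` as a scrape target would let Prometheus pull this counter on every scrape interval. The target never pushes anything; it only answers GET requests on `/metrics`.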

It is designed for DevOps engineers and SREs who need reliable metrics collection, alerting, and dashboarding for containerized and cloud-native workloads.

§02

How it saves time or tokens

Prometheus auto-discovers scrape targets in Kubernetes using service discovery, eliminating manual target configuration as services scale up or down. PromQL lets you write complex queries -- rate of HTTP errors over 5 minutes, 99th percentile latency per endpoint -- in a single expression. Alertmanager, a separate component that runs alongside Prometheus, routes alerts to Slack, PagerDuty, or email based on configurable rules, replacing custom alerting scripts.

§03

How to use

  1. Create a minimal prometheus.yml in the current directory.
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  2. Start Prometheus with Docker, mounting that file into the container.
docker run -d --name prometheus -p 9090:9090 \
  -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus:latest
  3. Open http://localhost:9090 and query metrics with PromQL.
rate(prometheus_http_requests_total[5m])
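
To show what a `rate(...)` query is measuring, the following Python sketch computes a simplified per-second rate from raw counter samples. It is an approximation under stated assumptions, not Prometheus's implementation: the real `rate()` also extrapolates to the edges of the range window, which this sketch omits. The function name `simple_rate` is hypothetical.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter.

    `samples` is a time-ordered list of (timestamp_seconds, value) pairs
    from one range window. A drop in value is treated as a counter reset,
    so the value seen after the reset counts entirely as new increase.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two points
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        increase += value if value < prev else value - prev
        prev = value
    return increase / elapsed
```

With samples taken every 15 seconds, `simple_rate([(0, 100), (15, 130), (30, 160)])` gives 2.0 requests per second. Because resets are handled, a process restart that sets the counter back to zero does not produce a negative rate.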
§04

Example

A PromQL query for alerting on high error rates:

sum(rate(http_requests_total{status=~"5.."}[5m]))
  /
sum(rate(http_requests_total[5m]))
  > 0.05

This fires when more than 5 percent of HTTP requests return 5xx status codes over a 5-minute window.
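
The same ratio can be read programmatically through the Prometheus HTTP API's instant-query endpoint, `/api/v1/query`. The Python sketch below assumes a server at `http://localhost:9090` and reuses the metric and label names from the query above; the helper names (`build_query_url`, `parse_scalar_result`, `error_ratio`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# The ratio from the example, without the "> 0.05" threshold.
ERROR_RATIO = (
    'sum(rate(http_requests_total{status=~"5.."}[5m]))'
    " / sum(rate(http_requests_total[5m]))"
)


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": expr})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def parse_scalar_result(payload):
    """Return the first sample value of an instant-vector response, or None."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    result = payload["data"]["result"]
    if not result:
        return None  # no series matched, e.g. no traffic in the window
    _timestamp, value = result[0]["value"]
    return float(value)  # the API encodes sample values as strings


def error_ratio(base_url="http://localhost:9090"):
    """Query a running Prometheus server for the current 5xx ratio."""
    url = build_query_url(base_url, ERROR_RATIO)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_scalar_result(json.load(resp))
```

Leaving the threshold out of the query returns the ratio itself, so the caller can compare it against 0.05 or any other limit. An empty result is reported as `None` rather than zero, because "no matching series" and "zero errors" are different situations.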

§05

Related on TokRepo

  • Monitoring tools -- Compare Prometheus with other observability solutions.
  • DevOps tools -- Infrastructure automation that pairs with monitoring.
§06

Common pitfalls

  • Prometheus stores data on local disk by default. For long-term retention, pair it with a long-term storage system such as Thanos or Cortex.
  • High-cardinality labels (user IDs, request IDs) cause memory usage to explode. Keep label cardinality bounded.
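  • The pull model requires network access from Prometheus to all targets, so firewalled environments need a route from Prometheus to each scrape endpoint. Pushgateway is meant for short-lived batch jobs that exit before they can be scraped, not as a general way around network restrictions.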
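
The cardinality pitfall comes down to multiplication: every distinct combination of label values is a separate time series. The Python sketch below estimates a worst-case series count for one metric. The label names and counts are hypothetical numbers chosen for illustration, and real workloads rarely hit every combination, so treat the result as an upper bound.

```python
from math import prod


def worst_case_series(label_value_counts):
    """Upper bound on series for one metric.

    `label_value_counts` maps each label name to its number of distinct
    values; the bound is the product of those counts.
    """
    return prod(label_value_counts.values())


bounded = {"method": 5, "status": 10, "endpoint": 20}
unbounded = {**bounded, "user_id": 100_000}

print(worst_case_series(bounded))    # 1000
print(worst_case_series(unbounded))  # 100000000
```

Adding a single per-user label turns a thousand potential series into a hundred million, which is why identifiers like user IDs belong in logs or traces rather than in metric labels.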

Frequently Asked Questions

How does Prometheus differ from Grafana?

Prometheus collects and stores metrics and evaluates alerting rules. Grafana is a visualization layer that queries Prometheus (and other data sources) to render dashboards. They are complementary tools typically deployed together.

Does Prometheus work with Kubernetes?

Yes. Prometheus has built-in Kubernetes service discovery, which finds pods, services, and endpoints through the Kubernetes API. Annotations such as prometheus.io/scrape are a common convention applied through relabeling rules, not built-in behavior. The kube-prometheus-stack Helm chart bundles Prometheus, Alertmanager, Grafana, and pre-built dashboards.

What is PromQL?

PromQL (Prometheus Query Language) is a functional query language for selecting, aggregating, and transforming time series data. It supports operations like rate, histogram_quantile, sum by label, and mathematical functions.

How does alerting work in Prometheus?

You define alerting rules in YAML files that specify PromQL conditions and durations. When a condition is true for the specified duration, Prometheus fires the alert to Alertmanager, which handles deduplication, grouping, and routing to notification channels.
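
The "true for the specified duration" behavior can be sketched as a small state machine. The Python function below replays a series of rule evaluations and reports the alert state after each one, using the state names Prometheus shows in its UI (inactive, pending, firing). It is a simplified model under stated assumptions: one alert instance, evenly spaced evaluations, and a hypothetical function name `alert_states`.

```python
def alert_states(condition_history, eval_interval, for_duration):
    """Replay rule evaluations and return the state after each one.

    `condition_history` holds one boolean per evaluation (True when the
    PromQL expression returned a result). Durations are in seconds.
    """
    states = []
    active_since = None  # evaluation time when the condition first became true
    for i, is_true in enumerate(condition_history):
        now = i * eval_interval
        if not is_true:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held = now - active_since
        states.append("firing" if held >= for_duration else "pending")
    return states


history = [True, True, True, True, False, True]
print(alert_states(history, eval_interval=60, for_duration=120))
# ['pending', 'pending', 'firing', 'firing', 'inactive', 'pending']
```

The pending state is what filters out brief spikes: a condition that clears before the duration elapses never reaches Alertmanager, and a single false evaluation restarts the wait.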

Can Prometheus handle long-term storage?

Prometheus is optimized for short-to-medium retention (days to weeks). For long-term storage, integrate with Thanos, Cortex, or VictoriaMetrics, which provide horizontal scaling and object-store-backed retention.
