Cortex — Horizontally Scalable Long-Term Storage for Prometheus
Cortex is a CNCF project that provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus metrics, letting you run Prometheus-as-a-Service at scale.
What it is
Cortex is a CNCF incubating project that extends Prometheus with horizontally scalable, highly available, multi-tenant long-term storage. It accepts Prometheus metrics via remote write, stores them durably, and serves PromQL queries across months or years of data. Cortex lets organizations run Prometheus-as-a-Service so that metric history no longer depends on the local disk of any single Prometheus instance.
It targets platform teams and SRE organizations that run Prometheus at scale and need durable metric storage, multi-tenancy for different teams, and query federation across multiple Prometheus instances.
How it saves time or tokens
Standalone Prometheus keeps metrics on local disk, so history is limited to that instance's retention window and is lost if the disk fails. Cortex adds durable storage on object store backends (S3, GCS, Azure Blob), so metric history persists for as long as your retention settings allow. Multi-tenancy means one Cortex cluster serves multiple teams without running a separate long-term storage stack for each. Horizontal scaling handles growing ingestion rates without re-architecting.
How to use
- Deploy Cortex components (distributor, ingester, querier, compactor) using Helm charts or Docker Compose.
- Configure your existing Prometheus instances to remote-write to Cortex's distributor endpoint.
- Point Grafana or other visualization tools at Cortex's query endpoint for long-term metric queries.
Example
# prometheus.yml - configure remote write to Cortex
remote_write:
  - url: http://cortex-distributor:9009/api/v1/push
    headers:
      X-Scope-OrgID: team-backend

# Query long-term metrics via Cortex
# Grafana datasource: http://cortex-querier:9009/prometheus
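Any client that speaks the Prometheus HTTP API can query the same endpoint, not only Grafana. The following is a minimal Python sketch, assuming the querier address from the example above and the standard Prometheus instant-query path (`/api/v1/query`) under the `/prometheus` prefix. The function names `build_query_request` and `run_query` are hypothetical helpers, not part of any Cortex client library.

```python
import json
import urllib.parse
import urllib.request


def build_query_request(base_url, tenant, promql):
    """Return (url, headers) for an instant PromQL query scoped to one tenant."""
    if not tenant:
        # Fail early instead of sending a request without a tenant ID.
        raise ValueError("tenant is required when multi-tenancy is enabled")
    query = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{query}"
    return url, {"X-Scope-OrgID": tenant}


def run_query(base_url, tenant, promql, timeout=10.0):
    """Send the query and return the decoded JSON response body."""
    url, headers = build_query_request(base_url, tenant, promql)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a reachable cluster, `run_query("http://cortex-querier:9009/prometheus", "team-backend", "up")` would return the usual Prometheus JSON envelope. Keeping the URL and header construction in a separate pure function makes the tenant handling easy to unit test without a network.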
Common pitfalls
- Cortex has many microservice components. Start with single-binary mode for testing, then decompose into microservices for production scaling.
- Object storage costs accumulate with long retention periods. Cortex's compactor merges and deduplicates blocks but does not downsample them, so rely on compaction together with retention limits to keep multi-year storage costs in check.
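- Multi-tenancy requires the X-Scope-OrgID header on every write and query. With authentication enabled, requests without the header are rejected; with auth_enabled set to false, Cortex ignores the header and stores all data under a single tenant named fake.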
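To get a feel for the storage-cost pitfall, a back-of-the-envelope estimate helps. The sketch below assumes roughly 2 bytes per compressed sample, based on the 1 to 2 bytes per sample that the Prometheus storage documentation cites as an average. It ignores index and metadata overhead, so treat the result as a lower bound. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def estimate_block_storage_bytes(samples_per_second, retention_days,
                                 bytes_per_sample=2.0):
    """Rough size of compressed sample data kept for the whole retention period.

    bytes_per_sample is an assumption; measure your own workload before
    using this for budgeting.
    """
    total_samples = samples_per_second * SECONDS_PER_DAY * retention_days
    return total_samples * bytes_per_sample
```

For example, 100,000 samples per second retained for 365 days comes to about 6.3 TB (6,307,200,000,000 bytes) under this assumption, before any replication by the object store.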
Frequently Asked Questions
How does Cortex differ from Thanos?
Both add long-term storage and high availability to Prometheus. Cortex ingests metrics via remote write and stores them centrally. Thanos typically uses a sidecar that runs next to each existing Prometheus instance and uploads its blocks to object storage. Cortex is better suited to centralized Prometheus-as-a-Service; Thanos is better suited to federating existing Prometheus deployments.
Which storage backends does Cortex support?
Cortex supports Amazon S3, Google Cloud Storage, Azure Blob Storage, and any S3-compatible object store (like MinIO) for long-term block storage. It uses a key-value store (DynamoDB, Consul, etcd, or memberlist) for hash ring coordination.
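The hash ring mentioned above decides which ingesters own a given series. The following is a deliberately simplified Python sketch of consistent hashing with replication. It is not Cortex's actual ring implementation, which also tracks instance health and availability zones; the class name, token count, and hash function are all illustrative assumptions.

```python
import bisect
import hashlib


def _hash32(value):
    """Map a string to a 32-bit integer token."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, ingesters, tokens_per_ingester=64):
        # Each ingester registers several tokens so load spreads evenly.
        self._ring = sorted(
            (_hash32(f"{name}-{i}"), name)
            for name in ingesters
            for i in range(tokens_per_ingester)
        )
        self._tokens = [token for token, _ in self._ring]

    def owners(self, tenant, series, replication_factor=3):
        """Walk clockwise from the series hash, collecting distinct ingesters."""
        key = _hash32(f"{tenant}/{series}")
        start = bisect.bisect_left(self._tokens, key)
        owners = []
        for offset in range(len(self._ring)):
            _, name = self._ring[(start + offset) % len(self._ring)]
            if name not in owners:
                owners.append(name)
                if len(owners) == replication_factor:
                    break
        return owners
```

Because the tenant ID is part of the hashed key, the same series name from two tenants can land on different ingesters, and adding or removing an ingester only moves the series adjacent to its tokens.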
Is Cortex production-ready?
Yes. Cortex is a CNCF incubating project used in production by multiple organizations. It has also served as the basis of commercial offerings: Grafana Cloud's metrics backend was built on Cortex before Grafana Labs forked it into Mimir.
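How does multi-tenancy work in Cortex?
Each request includes an X-Scope-OrgID header identifying the tenant. Cortex isolates data and queries per tenant, and rate limits and retention can be configured per tenant. Cortex trusts the header as given and does not authenticate callers itself, so access control belongs in a reverse proxy or gateway placed in front of it. This enables a single cluster to serve multiple independent teams.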
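The idea of cluster-wide defaults with per-tenant overrides can be sketched in a few lines of Python. This is an illustration of the lookup pattern only, not Cortex code; the `Limits` fields, the numeric values, and the tenant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    ingestion_rate: int  # samples per second (hypothetical value)
    max_series: int      # active series per tenant (hypothetical value)


DEFAULT_LIMITS = Limits(ingestion_rate=25_000, max_series=1_000_000)

# Tenants listed here get their own limits; everyone else gets the defaults.
OVERRIDES = {
    "team-backend": Limits(ingestion_rate=100_000, max_series=5_000_000),
}


def limits_for(headers):
    """Resolve the limits that apply to the tenant named in the request headers."""
    tenant = headers.get("X-Scope-OrgID")
    if not tenant:
        # Mirrors the behavior of rejecting requests that carry no tenant ID.
        raise PermissionError("no org id")
    return OVERRIDES.get(tenant, DEFAULT_LIMITS)
```

A request carrying `X-Scope-OrgID: team-backend` resolves to the override entry, while any other tenant falls back to `DEFAULT_LIMITS`.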
Can I use Cortex with my existing Prometheus setup?
Yes. Add a remote_write configuration to your existing Prometheus instances pointing at Cortex. Prometheus continues scraping targets as before; Cortex stores the data for long-term retention and cross-instance queries. The migration is additive, not disruptive.