Cette page est affichée en anglais. Une traduction française est en cours.
SkillsApr 15, 2026·3 min de lecture

Citus — Distributed PostgreSQL for Sharding and HTAP

A Postgres extension that turns your database into a distributed cluster with sharding, columnar storage and parallel query — keeping full SQL, ACID, JSONB, PostGIS and the Postgres ecosystem intact.

Prêt pour agents

Installation agent prête

Cet actif peut être installé après choix du runtime, vérification du plan et exécution de la commande adaptée.

Native · 98/100Policy : autoriser
Surface agent
Tout agent MCP/CLI
Type
Skill
Installation
Single
Confiance
Confiance : Established
Point d'entrée
Citus Guide
Commande d'installation directe
npx -y tokrepo@latest install cb669573-3920-11f1-9bc6-00163e2b0d79 --target codex

À exécuter après confirmation du plan en dry-run.

TL;DR
Citus turns PostgreSQL into a distributed cluster with sharding, columnar storage, and parallel queries while keeping full SQL.
§01

What it is

Citus is a PostgreSQL extension that transforms a single-node Postgres database into a distributed cluster. It adds horizontal sharding, columnar storage, and parallel query execution while preserving full SQL compatibility, ACID transactions, JSONB, PostGIS, and the entire Postgres extension ecosystem.

Citus is designed for multi-tenant SaaS applications and real-time analytics workloads (HTAP). It distributes tables across worker nodes by a chosen distribution column (typically tenant_id), allowing queries to be parallelized across shards. Now part of Microsoft Azure (Azure Cosmos DB for PostgreSQL), Citus remains fully open-source.

§02

How it saves time or tokens

Citus eliminates the need to migrate away from PostgreSQL when your data outgrows a single node. Instead of rewriting your application for a different database, you add the Citus extension and distribute your tables. Existing SQL queries, indexes, and Postgres features continue to work. Columnar storage compresses analytical data by up to 10x, reducing storage costs for time-series and log data.

§03

How to use

  1. Spin up a Citus cluster: docker compose -p citus up -d --scale worker=2 using the official Docker setup.
  2. Connect to the coordinator node: psql -U postgres.
  3. Create tables and distribute them: SELECT create_distributed_table('events', 'tenant_id');.
  4. Query normally -- Citus parallelizes execution across workers automatically.
§04

Example

-- Create and distribute a table
CREATE TABLE events (
  tenant_id BIGINT,
  id BIGSERIAL,
  payload JSONB,
  ts TIMESTAMPTZ DEFAULT now()
);
SELECT create_distributed_table('events', 'tenant_id');

-- Columnar storage for analytics
CREATE TABLE logs (ts TIMESTAMPTZ, level TEXT, message TEXT)
  USING columnar;

-- Query across shards
SELECT tenant_id, COUNT(*), AVG(payload->>'duration')
FROM events
WHERE ts > now() - INTERVAL '1 day'
GROUP BY tenant_id;
§05

Related on TokRepo

§06

Common pitfalls

  • Choosing the wrong distribution column leads to data skew and hot shards. Pick a column with high cardinality and even distribution (tenant_id is the classic choice).
  • Cross-shard joins are expensive. Design your schema so that co-located tables share the same distribution column to keep joins local.
  • Not all Postgres features work identically in distributed mode. Check Citus documentation for limitations around CTEs, window functions, and certain DDL operations.

Questions fréquentes

What is Citus used for?+

Citus is used for scaling PostgreSQL horizontally. Primary use cases are multi-tenant SaaS applications (isolate tenant data across shards) and real-time analytics (parallel aggregation across distributed data). It handles both OLTP and OLAP workloads.

Is Citus free and open source?+

Yes. Citus is open-source under the AGPL license. It is also available as a managed service through Azure Cosmos DB for PostgreSQL. The open-source version includes all sharding, columnar storage, and parallel query features.

How does Citus compare to native PostgreSQL partitioning?+

PostgreSQL partitioning splits data within a single node. Citus distributes data across multiple nodes with a coordinator that routes queries. Partitioning helps with pruning; Citus helps with horizontal scaling beyond what one machine can handle.

Can I use PostGIS with Citus?+

Yes. Citus is a Postgres extension, so it works alongside other extensions including PostGIS, pg_trgm, and hstore. Geospatial queries on distributed tables work as expected when the distribution column is included in the query.

What is columnar storage in Citus?+

Columnar storage stores data by column rather than by row, which improves compression ratios and scan performance for analytical queries. In Citus, create a table with `USING columnar` to enable it. Best for append-only data like logs and time-series.

Sources citées (3)

Fil de discussion

Connectez-vous pour rejoindre la discussion.
Aucun commentaire pour l'instant. Soyez le premier à partager votre avis.

Actifs similaires