What is Cube — Open Source Semantic Layer for Data Apps?

Cube is a headless semantic layer that turns your warehouse into a reusable API for BI, embedded analytics, and AI — defining metrics once and serving them via SQL, REST, GraphQL, and MDX.

Is Cube — Open Source Semantic Layer for Data Apps free to use?

Yes. Cube — Open Source Semantic Layer for Data Apps is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Cube — Open Source Semantic Layer for Data Apps?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Cube — Open Source Semantic Layer for Data Apps

Introduction

Cube (formerly Cube.js) is an open source semantic layer for building data applications. You define measures, dimensions, and joins once in code (YAML or JavaScript, with Python available for dynamic models and configuration). Cube then generates SQL against your warehouse, caches the results, and serves them to BI tools, apps, notebooks, and LLM agents via SQL, REST, GraphQL, or MDX. The project has over 19,000 GitHub stars and is used by thousands of teams to keep metric definitions consistent.

What Cube Does

Models data with reusable cubes, views, and pre-aggregations defined in code and version-controlled.
Translates client queries into optimized warehouse SQL with automatic pre-aggregation routing.
Serves the same model through SQL (Postgres-compatible), REST, GraphQL, and MDX endpoints.
Handles multi-tenant row-level security with context-aware filters.
Provides a playground, developer API, and TypeScript client for embedded analytics.
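
As a rough illustration of what a client request to the REST endpoint can look like, the sketch below builds a URL for the /cubejs-api/v1/load route with a JSON-encoded query. The host, port, and the cube and member names (orders.count, orders.status) are assumptions made up for this example, and a real request would also carry a JWT in the Authorization header.

```python
import json
from urllib.parse import urlencode

# Assumed local deployment URL; replace with your own host and port.
BASE_URL = "http://localhost:4000/cubejs-api/v1"


def build_load_url(measures, dimensions, base_url=BASE_URL):
    """Return a GET URL for the REST load endpoint with a JSON-encoded query."""
    # The query names members as "<cube>.<member>"; these names are hypothetical.
    query = {"measures": measures, "dimensions": dimensions}
    return f"{base_url}/load?{urlencode({'query': json.dumps(query)})}"


url = build_load_url(["orders.count"], ["orders.status"])
print(url)
```

The same measure and dimension names would be reused by every other client, which is the point of defining them once in the model.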

Architecture Overview

Cube is a Node.js + Rust stack. The schema compiler turns your data model into an intermediate representation. The query orchestrator matches requests to cubes, rewrites them into warehouse SQL, and either reads from Cube Store (a distributed Arrow-based cache) or queries the source warehouse directly. A SQL API layer built on DataFusion speaks the Postgres wire protocol, so Tableau, Looker, and Power BI can query Cube as if it were a database. Authentication is JWT-based with per-user filter injection. Supported warehouses include Snowflake, BigQuery, Redshift, Databricks, ClickHouse, Postgres, MySQL, Trino, DuckDB, Pinot, and more.
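
The routing decision described above can be pictured with a toy model. This is a deliberately simplified sketch, not Cube's actual matching logic: it only checks whether a rollup covers the requested measures and dimensions, while the real orchestrator considers more than that (time granularity, for example). The Rollup class, pick_rollup function, and member names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollup:
    """A hypothetical, simplified description of one pre-aggregation."""
    name: str
    measures: frozenset
    dimensions: frozenset


def pick_rollup(query_measures, query_dimensions, rollups):
    """Return the first rollup that covers the query, or None to fall back to the warehouse."""
    for rollup in rollups:
        covers_measures = set(query_measures) <= rollup.measures
        covers_dimensions = set(query_dimensions) <= rollup.dimensions
        if covers_measures and covers_dimensions:
            return rollup
    return None


rollups = [
    Rollup("orders_by_status", frozenset({"orders.count"}), frozenset({"orders.status"})),
]

# Covered by the rollup, so it can be served from the cache layer.
hit = pick_rollup(["orders.count"], ["orders.status"], rollups)
# Asks for a dimension the rollup lacks, so it goes to the warehouse.
miss = pick_rollup(["orders.count"], ["orders.city"], rollups)

print(hit.name if hit else "warehouse")
print(miss.name if miss else "warehouse")
```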

Self-Hosting & Configuration

Run with cubejs-cli + Node, or deploy the official Docker image (cubejs/cube) on Kubernetes.
For production, separate cube-api and cube-refresh-worker processes and back them with Cube Store; older releases also relied on Redis for the query cache and queue.
Define data models in YAML or JavaScript; Python, together with Jinja templating, is available for dynamic models and configuration.
Enable pre-aggregations on hot metrics — Cube materializes them on a schedule into Cube Store (Arrow-backed, S3-compatible).
Protect the API with signed JWTs and securityContext so every query is scoped to a tenant.

Key Features

Define metrics once, consume everywhere — the canonical semantic layer for modern data stacks.
Pre-aggregations can bring p95 latency under a second, even on multi-billion-row warehouses, when queries match a materialized rollup.
Postgres-compatible SQL API lets Tableau, Looker, Power BI, Metabase, and Superset connect through their existing Postgres connectors.
First-class support for LLM text-to-metric via the SQL API and Cube Cloud's AI Assistant.
Self-hosted Apache-2.0 core with identical APIs to the managed Cube Cloud.
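
For the SQL API, a query typically selects dimensions as plain columns and wraps measures in MEASURE(). The helper below is a small sketch that assembles such a statement as a string; the function name and the orders, status, and count identifiers are hypothetical, and the string would be sent over any Postgres client connected to your deployment.

```python
def build_sql_api_query(cube, dimensions, measures):
    """Build a SQL API statement: dimensions as columns, measures via MEASURE()."""
    # Identifiers are interpolated directly, so only pass trusted names here.
    select_list = list(dimensions) + [f"MEASURE({m})" for m in measures]
    sql = f"SELECT {', '.join(select_list)} FROM {cube}"
    if dimensions:
        # Group by the positional index of each dimension column.
        positions = ", ".join(str(i) for i in range(1, len(dimensions) + 1))
        sql += f" GROUP BY {positions}"
    return sql


sql = build_sql_api_query("orders", ["status"], ["count"])
print(sql)  # SELECT status, MEASURE(count) FROM orders GROUP BY 1
```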

Comparison with Similar Tools

dbt Semantic Layer / MetricFlow — defines metrics alongside dbt models; query interface is less flexible than Cube.
LookML (Looker) — proprietary and tied to Looker UI; Cube is open and API-first.
MetricFlow (stand-alone) — now part of dbt; similar goals, fewer integrations.
Malloy (Google) — experimental modeling language; not a production semantic layer yet.
AtScale is a commercial semantic layer; Cube's open source core covers much of the same ground without a license fee.

FAQ

Q: Do I need Cube Cloud? A: No — the open source version is production-grade. Cube Cloud adds managed hosting, SSO, and an AI assistant.

Q: How does Cube compare to just writing views in my warehouse? A: Views lack pre-aggregation routing, caching, multi-protocol APIs, and tenant-scoped security — all of which Cube provides.

Q: Can Cube cache queries? A: Yes, via Cube Store pre-aggregations and an in-memory query cache with TTL and on-demand refresh.
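
The idea of a query cache with a TTL and on-demand refresh can be sketched in a few lines. This is a generic illustration of the concept, not Cube's implementation; the TTLCache class and its method names are hypothetical.

```python
import time


class TTLCache:
    """A toy result cache: entries expire after ttl_seconds or on explicit invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}

    def get_or_load(self, key, loader):
        """Return a fresh cached value, or call loader() and store its result."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = loader()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key):
        """On-demand refresh: drop the entry so the next read reloads it."""
        self._entries.pop(key, None)


cache = TTLCache(ttl_seconds=60)
rows = cache.get_or_load("orders_by_status", lambda: [("shipped", 10)])
print(rows)
```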

Q: Can I query Cube from an LLM agent? A: Yes — use the SQL API or the GraphQL API; Cube also ships an MCP server for agent tool use.

