# Apache Calcite: Dynamic SQL Query Planning and Optimization Framework

> Modular SQL query planning framework used as the query optimizer inside Apache Hive, Druid, Flink, and dozens of other data systems.

## Quick Use

```xml
<dependency>
  <groupId>org.apache.calcite</groupId>
  <artifactId>calcite-core</artifactId>
  <version>1.37.0</version>
</dependency>
```

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
// Connect via JDBC; model.json maps a CSV file to the EMPS table
Connection conn = DriverManager.getConnection("jdbc:calcite:model=model.json");
ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM EMPS WHERE age > 30");
```

## Introduction

Apache Calcite is a foundational framework for building databases and data management systems. Rather than storing data itself, it provides a SQL parser, validator, query optimizer, and JDBC driver that other systems plug into. Projects like Apache Hive, Druid, Flink, and Phoenix all rely on Calcite for SQL processing.

## What Apache Calcite Does

- Parses and validates SQL statements against user-defined schemas
- Optimizes query plans using cost-based and rule-based transformations
- Provides a JDBC driver that turns any data source into a SQL-queryable endpoint
- Supports federated queries across multiple heterogeneous data sources
- Offers adapters for CSV files, JSON, JDBC databases, Elasticsearch, and more

## Architecture Overview

Calcite processes queries in stages. The SQL parser produces a syntax tree, the validator checks types and resolves names against a schema, and the validated tree is converted into a relational algebra tree. The optimizer (called the planner) then transforms that tree using pluggable rules. Two planner engines are available: `HepPlanner` for heuristic (rule-based) optimization and `VolcanoPlanner` for Volcano-style (cost-based) optimization. Adapters translate optimized plans into operations on the underlying data source, whether that is an in-memory collection, a file, or a remote database.

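
The rule-driven stage described above can be illustrated with a small, self-contained sketch. It deliberately does not use Calcite's API: `PlanSketch`, `Node`, `dropTrueFilter`, and `mergeFilters` are hypothetical names invented for this example, and the tree is far simpler than a real relational algebra tree. It only shows the general idea of a heuristic planner, which is to apply rewrite rules to each node until none of them fires.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model only: every name here is hypothetical and unrelated to Calcite's RelNode API.
final class PlanSketch {

    /** A relational operator with at most one input (null for a scan). */
    static final class Node {
        final String kind;   // "Scan", "Filter", or "Project"
        final String detail; // table name, predicate, or column list
        final Node input;

        Node(String kind, String detail, Node input) {
            this.kind = kind;
            this.detail = detail;
            this.input = input;
        }

        @Override
        public String toString() {
            return kind + "(" + detail + ")" + (input == null ? "" : " <- " + input);
        }
    }

    /** Rule: a Filter whose predicate is the literal TRUE is replaced by its input. */
    static Node dropTrueFilter(Node n) {
        boolean match = n.kind.equals("Filter") && n.detail.equals("TRUE") && n.input != null;
        return match ? n.input : n;
    }

    /** Rule: a Filter directly over another Filter becomes one Filter with an AND-ed predicate. */
    static Node mergeFilters(Node n) {
        if (n.kind.equals("Filter") && n.input != null && n.input.kind.equals("Filter")) {
            String merged = "(" + n.detail + ") AND (" + n.input.detail + ")";
            return new Node("Filter", merged, n.input.input);
        }
        return n;
    }

    /** Optimizes the input first, then applies the rules to this node until none fires. */
    static Node optimize(Node n, List<UnaryOperator<Node>> rules) {
        if (n == null) {
            return null;
        }
        Node current = new Node(n.kind, n.detail, optimize(n.input, rules));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (UnaryOperator<Node> rule : rules) {
                Node next = rule.apply(current);
                if (next != current) { // rules return the same instance when they do not match
                    current = next;
                    changed = true;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Node plan = new Node("Project", "name",
            new Node("Filter", "age > 30",
                new Node("Filter", "TRUE",
                    new Node("Filter", "deptno = 10",
                        new Node("Scan", "EMPS", null)))));
        List<UnaryOperator<Node>> rules =
            List.of(PlanSketch::dropTrueFilter, PlanSketch::mergeFilters);
        System.out.println(optimize(plan, rules));
        // prints: Project(name) <- Filter((age > 30) AND (deptno = 10)) <- Scan(EMPS)
    }
}
```

In this sketch the order of the rules matters, because the first matching rule wins at each step. A cost-based planner takes a different approach: it keeps alternative plans side by side and chooses among them using cost estimates.
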
## Self-Hosting & Configuration

- Add `calcite-core` as a Maven or Gradle dependency in your Java project
- Define a `model.json` file describing schemas and their adapter types
- Implement the `Schema` and `Table` interfaces to expose custom data sources
- Register optimization rules with the planner for domain-specific transformations
- Use the JDBC driver (`jdbc:calcite:`) for SQL access from any Java application

## Key Features

- Pluggable adapter architecture lets you query any data source through standard SQL
- Cost-based optimizer with extensible statistics and cost model for smart plan selection
- Materialized view rewriting automatically routes queries to precomputed results
- Streaming SQL extensions support continuous queries over event streams
- Lattice and star-schema optimizations accelerate OLAP-style aggregate queries

## Comparison with Similar Tools

- **Apache DataFusion**: Rust-based query engine; embeddable like Calcite but focused on single-process execution rather than framework reuse
- **Substrait**: Cross-language query plan specification; Calcite can produce Substrait plans but also includes its own optimizer and execution
- **Presto/Trino**: Distributed SQL engines that embed their own optimizers; Calcite is a library others embed rather than a standalone engine
- **DuckDB**: Embedded analytical database with its own parser and optimizer; Calcite is a framework for building such systems
- **Apache Drill**: SQL query engine for multiple data sources; built on top of Calcite for parsing and optimization

## FAQ

**Q: Is Calcite a database?**
A: No. Calcite is a framework that provides SQL parsing, optimization, and JDBC connectivity. It does not store data. Systems like Hive, Druid, and Flink use Calcite as their SQL processing layer.

**Q: Which projects use Calcite?**
A: Apache Hive, Druid, Flink, Phoenix, Beam, Kylin, and Storm all use Calcite for query parsing and optimization. Many commercial data products also embed it.

**Q: Can I use Calcite to query CSV or JSON files?**
A: Yes. Calcite ships a file adapter (the `calcite-file` module, separate from `calcite-core`) that reads CSV and JSON files. Define a `model.json` pointing to your files and query them with standard SQL via the JDBC driver.

**Q: How do I add custom optimization rules?**
A: Extend the `RelOptRule` abstract class (newer code typically extends its subclass `RelRule`), define the operand pattern for the relational tree nodes you want to transform, and register the rule with the planner. Calcite applies matching rules during optimization.

## Sources

- https://github.com/apache/calcite
- https://calcite.apache.org/docs/

---

Source: https://tokrepo.com/en/workflows/fd0fa862-3d3a-11f1-9bc6-00163e2b0d79
Author: Script Depot