Configs · April 16, 2026 · 1 minute read

Apache Iceberg — Open Table Format for Huge Analytical Datasets

High-performance, engine-agnostic table format that brings ACID transactions, schema evolution, and time travel to Parquet data lakes.

Introduction

Apache Iceberg is a high-performance table format designed for huge analytical datasets. It delivers ACID transactions, hidden partitioning, schema evolution, and time travel on top of Parquet/ORC/Avro files — without locking you into any specific query engine. Spark, Trino, Flink, Presto, Snowflake, BigQuery, DuckDB, Dremio, Athena, and others all read Iceberg natively.

What Iceberg Does

  • Stores massive tables as immutable data files tracked by metadata snapshots.
  • Enables atomic commits, concurrent writes, and isolation without locking the whole table.
  • Supports schema evolution (add/drop/rename columns) and partition evolution.
  • Provides time travel and rollback via snapshot-level reads.
  • Powers vendor-neutral lakehouses: the same table works across many engines.

Architecture Overview

An Iceberg table is a three-layer metadata tree: a table-metadata JSON file that records the current schema and the table's snapshots, one manifest list per snapshot, and manifests that enumerate data files along with per-column statistics. Engines use these stats to prune partitions and files before reading any data. A catalog (REST, Nessie, Glue, Hive, Unity, JDBC, or Polaris) brokers atomic swaps of the metadata pointer, which is what makes commits transactional.
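The pruning idea can be sketched engine-free. The dataclasses and `prune` helper below are illustrative stand-ins for Iceberg's real metadata structures, not its actual APIs; the point is that only manifest entries (stats), never data files, are read to decide what to scan:

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    path: str
    # per-column min/max stats, as recorded in a manifest entry
    min_event_day: str
    max_event_day: str

@dataclass
class Manifest:
    data_files: list

@dataclass
class Snapshot:
    manifests: list  # reached via the snapshot's manifest list

def prune(snapshot: Snapshot, day: str) -> list:
    """Return only the data files whose stats could match the predicate."""
    hits = []
    for manifest in snapshot.manifests:
        for f in manifest.data_files:
            if f.min_event_day <= day <= f.max_event_day:
                hits.append(f.path)
    return hits

snap = Snapshot(manifests=[
    Manifest(data_files=[
        DataFile("s3://bucket/a.parquet", "2026-01-01", "2026-01-31"),
        DataFile("s3://bucket/b.parquet", "2026-02-01", "2026-02-28"),
    ])
])

print(prune(snap, "2026-02-14"))  # only b.parquet survives pruning
```

A query for a single day touches one file here; the other is skipped purely from metadata.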

Self-Hosting & Configuration

  • Pick a catalog: REST (iceberg-rest-fixture), Nessie (Git-like), Hive Metastore, Glue, Unity, or JDBC.
  • Store table data in S3/GCS/Azure/MinIO/HDFS with lifecycle policies that align with Iceberg retention.
  • Configure properties like write.format.default, write.target-file-size-bytes, and write.distribution-mode.
  • Use rewrite_data_files and expire_snapshots actions to compact and reclaim storage.
  • Integrate with Spark, Flink, Trino, Dremio, StarRocks, ClickHouse, and DuckDB via first-class connectors.
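The retention logic behind snapshot expiry can be sketched without an engine. This `expire_snapshots` function is a hypothetical illustration of the policy (keep the newest N snapshots plus anything inside the retention window), not Iceberg's actual procedure signature:

```python
from datetime import datetime, timedelta

def expire_snapshots(snapshots, now, older_than=timedelta(days=7), retain_last=3):
    """Return the ids of snapshots eligible for expiry: everything that is
    neither among the newest `retain_last` nor newer than the cutoff."""
    ordered = sorted(snapshots, key=lambda s: s["ts"], reverse=True)
    expired = []
    for i, snap in enumerate(ordered):
        recent = now - snap["ts"] < older_than
        if i >= retain_last and not recent:
            expired.append(snap["id"])
    return expired

now = datetime(2026, 4, 16)
snaps = [
    {"id": 1, "ts": now - timedelta(days=30)},
    {"id": 2, "ts": now - timedelta(days=10)},
    {"id": 3, "ts": now - timedelta(days=2)},
    {"id": 4, "ts": now - timedelta(days=1)},
]
print(expire_snapshots(snaps, now))  # [1]: old and outside the retained window
```

Once a snapshot is expired, any data files referenced only by that snapshot can be deleted, which is why lifecycle policies on the object store must not race ahead of Iceberg's own retention.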

Key Features

  • Engine-agnostic: the same table is consumed by batch, streaming, and interactive engines.
  • Hidden partitioning — users query by event_time while Iceberg maps to partition columns.
  • Schema evolution that is metadata-only: no file rewrites.
  • Row-level updates via copy-on-write or merge-on-read delete files (deletion vectors in format v3).
  • Branches and tags (Git-style semantics) for experimentation, backfills, and GDPR deletes.
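Hidden partitioning is easiest to see through a partition transform. The sketch below mimics Iceberg's `days` transform in plain Python (the `days_transform` helper and the in-memory `partitions` dict are illustrative, not Iceberg internals):

```python
from datetime import datetime, date

def days_transform(event_time: datetime) -> date:
    """Like Iceberg's `days` partition transform: the partition value is
    derived from the source column, so users never write partition predicates."""
    return event_time.date()

rows = [
    {"event_time": datetime(2026, 4, 16, 9, 30), "user": "a"},
    {"event_time": datetime(2026, 4, 16, 17, 5), "user": "b"},
    {"event_time": datetime(2026, 4, 17, 0, 1), "user": "c"},
]

# Writers group rows by the derived value; readers filter on event_time
# and the engine maps that filter onto the hidden partition column.
partitions = {}
for row in rows:
    partitions.setdefault(days_transform(row["event_time"]), []).append(row["user"])

print(partitions)  # two partitions: 2026-04-16 and 2026-04-17
```

Because the transform is recorded in table metadata, it can later evolve (say, from daily to hourly) without rewriting existing data.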

Comparison with Similar Tools

  • Delta Lake — ACID lakehouse format; tied tightly to Databricks ecosystem, now with UniForm interop.
  • Apache Hudi — Focused on upserts and incremental pulls; Iceberg targets broader analytics.
  • Hive tables — Directory-based; Iceberg replaces partition discovery with metadata-tracked files.
  • Parquet alone — Great columnar storage, no table semantics; Iceberg layers ACID on top.
  • Snowflake / BigQuery native — Closed; Iceberg keeps your data portable across engines.

FAQ

Q: Which engines can read Iceberg? A: Spark, Trino, Flink, Presto, Athena, Snowflake, BigQuery, DuckDB, Dremio, Starrocks, and many more.

Q: What catalog should I use? A: REST is emerging as the de-facto standard; Glue/Unity are common in cloud; Nessie adds Git-like branching.

Q: How do deletes work? A: Copy-on-write rewrites affected files, or merge-on-read writes delete files that readers merge in at query time.
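The merge-on-read path can be sketched with positional deletes. This `read_with_deletes` helper is a simplified illustration of the idea (a delete file records row positions; readers skip them at scan time), not Iceberg's reader API:

```python
def read_with_deletes(data_file_rows, position_deletes):
    """Merge-on-read: skip row positions listed in a delete file,
    without rewriting the underlying data file."""
    return [row for pos, row in enumerate(data_file_rows)
            if pos not in position_deletes]

rows = ["alice", "bob", "carol", "dave"]
deletes = {1, 3}  # positions recorded by a positional delete file
print(read_with_deletes(rows, deletes))  # ['alice', 'carol']
```

Copy-on-write would instead rewrite the data file with rows 1 and 3 removed, trading slower writes for faster reads.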

Q: Can Iceberg handle streaming? A: Yes — Flink and Spark Structured Streaming can upsert into Iceberg tables with exactly-once commits.
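Exactly-once here rests on idempotent commits: the writer stamps each commit with its checkpoint id, so a replay after failure is a no-op. A minimal sketch of that idea, assuming a toy `commit` function and in-memory table state (not Flink's or Iceberg's actual interfaces):

```python
def commit(table_state, snapshot_id, checkpoint_id):
    """Idempotent commit sketch: replaying the same checkpoint id
    after a failure leaves the table unchanged."""
    if table_state["last_checkpoint"] == checkpoint_id:
        return table_state  # duplicate commit, ignored
    return {"snapshot": snapshot_id, "last_checkpoint": checkpoint_id}

state = {"snapshot": 1, "last_checkpoint": "ckpt-41"}
state = commit(state, 2, "ckpt-42")
state = commit(state, 3, "ckpt-42")  # replay after failure: no-op
print(state)  # {'snapshot': 2, 'last_checkpoint': 'ckpt-42'}
```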
