# Apache Gravitino: Unified Metadata Lake for Data and AI

> Apache Gravitino is a metadata lake that provides a single catalog interface to manage schemas, tables, models, and topics across multiple data sources, query engines, and AI platforms.

## Quick Use

```bash
# Download and start
curl -LO https://github.com/apache/gravitino/releases/latest/download/gravitino.tar.gz
tar xzf gravitino.tar.gz && cd gravitino
./bin/gravitino-server.sh start
# Access the web UI at http://localhost:8090
```

## Introduction

Apache Gravitino is a metadata management platform that unifies catalog operations across heterogeneous data sources and AI systems. Instead of managing separate metadata stores for each engine, Gravitino provides a single entry point for schema, table, model, and topic management.

## What Apache Gravitino Does

- Provides a unified metadata catalog spanning relational databases, data lakes, and messaging systems
- Manages metadata for Hive, Iceberg, JDBC catalogs, Kafka topics, and ML model registries
- Enables cross-engine metadata sharing between Spark, Trino, Flink, and other query engines
- Supports multi-tenant metalakes with role-based access control
- Offers REST, Java, and Python APIs plus a web management UI

## Architecture Overview

Gravitino introduces the concept of a metalake, a top-level namespace that groups catalogs from different data sources. Each catalog connects to a backend system (Hive Metastore, JDBC database, Iceberg REST catalog, Kafka cluster) via provider plugins. The Gravitino server exposes a REST API that translates unified metadata operations into backend-specific calls. An event listener framework enables audit logging and downstream notifications when metadata changes.
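
The metalake-then-catalog hierarchy described above can be sketched as two REST requests. This is a minimal illustration, not the project's official client. The endpoint paths (`/api/metalakes`, `/api/metalakes/{metalake}/catalogs`), the `Accept` media type, the payload field names, and the `metastore.uris` property are assumptions that should be checked against the REST documentation for your Gravitino version. The helper names are hypothetical. The code only builds the requests and does not contact a server.

```python
import json
from urllib.request import Request

# Assumption: the server listens on the port shown in the quick-start section.
BASE_URL = "http://localhost:8090"
# Assumption: versioned media type expected by the Gravitino REST API.
ACCEPT = "application/vnd.gravitino.v1+json"


def build_post(path, payload):
    """Build (but do not send) a JSON POST request for the given API path."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{BASE_URL}{path}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Accept": ACCEPT},
    )


def create_metalake_request(name, comment=""):
    """Request that would create a metalake, the top-level namespace."""
    payload = {"name": name, "comment": comment, "properties": {}}
    return build_post("/api/metalakes", payload)


def create_catalog_request(metalake, name, provider, properties,
                           catalog_type="RELATIONAL"):
    """Request that would register a catalog inside an existing metalake.

    `provider` selects the backend plugin (for example "hive"), and
    `properties` carries the backend connection details.
    """
    payload = {
        "name": name,
        "type": catalog_type,
        "provider": provider,
        "comment": "",
        "properties": properties,
    }
    return build_post(f"/api/metalakes/{metalake}/catalogs", payload)


if __name__ == "__main__":
    # Hypothetical names and an assumed Hive Metastore address.
    requests_to_send = [
        create_metalake_request("demo_metalake", "example tenant"),
        create_catalog_request(
            "demo_metalake",
            "hive_catalog",
            provider="hive",
            properties={"metastore.uris": "thrift://localhost:9083"},
        ),
    ]
    for req in requests_to_send:
        print(req.get_method(), req.full_url)
        print(json.dumps(json.loads(req.data), indent=2))
```

Against a running server, each request could be sent with `urllib.request.urlopen(req)`. Keeping request construction separate from sending makes the payloads easy to inspect before anything touches a real backend.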
## Self-Hosting & Configuration

- Download the release tarball or build from source with Gradle
- Configure gravitino-server.conf with the server port and backend storage settings
- Register catalogs via the REST API or web UI, specifying the provider and connection details
- Set up a relational backend (MySQL or PostgreSQL) for production metadata persistence
- Deploy behind a reverse proxy with TLS for production environments

## Key Features

- Unified catalog interface for Hive, Iceberg, JDBC, Kafka, and model registries
- Metalake concept provides multi-tenant isolation for different teams or projects
- Cross-engine metadata sharing eliminates catalog duplication between Spark, Trino, and Flink
- Tag-based metadata classification and governance across all managed assets
- Event listener framework for audit trails and automated metadata workflows

## Comparison with Similar Tools

- **Hive Metastore**: Hive-centric catalog; Gravitino unifies Hive with Iceberg, JDBC, Kafka, and more
- **Unity Catalog**: Databricks-originated; Gravitino is vendor-neutral and Apache-governed
- **Apache Polaris**: Iceberg-focused catalog; Gravitino covers a broader range of data and AI assets
- **DataHub**: metadata discovery and lineage; Gravitino is an operational catalog for query engines
- **OpenMetadata**: metadata platform; Gravitino serves as an active catalog that engines query directly

## FAQ

**Q: What is a metalake?**
A: A metalake is the top-level organizational unit in Gravitino. It groups multiple catalogs (Hive, Iceberg, JDBC, Kafka) under a single namespace for unified management.

**Q: Which query engines can use Gravitino?**
A: Gravitino provides connectors for Apache Spark, Trino, and Apache Flink. Applications can also use the REST or Java/Python client APIs directly.

**Q: Does Gravitino replace Hive Metastore?**
A: Gravitino can sit in front of Hive Metastore and other catalogs, providing a unified interface.
It does not replace the backends but adds a unification layer.

**Q: Is Gravitino production-ready?**
A: Apache Gravitino is an incubating project under the Apache Software Foundation with active development and growing production adoption.

## Sources

- https://github.com/apache/gravitino
- https://gravitino.apache.org/docs/

---

Source: https://tokrepo.com/en/workflows/asset-4b259937

Author: AI Open Source