# Apache Superset: Open Source Data Visualization & Exploration

> Apache Superset is a modern data exploration and visualization platform. Connect any SQL database and build interactive dashboards with 40+ chart types, no coding required.

## Quick Use

```bash
docker run -d --name superset -p 8088:8088 -e SUPERSET_SECRET_KEY=your-secret-key apache/superset:latest
```

Then initialize. Run the metadata database migrations first, because the admin user can only be created once the tables exist:

```bash
docker exec -it superset superset db upgrade
docker exec -it superset superset fab create-admin --username admin --firstname Admin --lastname User --email admin@example.com --password admin
docker exec -it superset superset init
```

Open `http://localhost:8088`, log in, and connect your database.

## Intro

**Apache Superset** is a modern, enterprise-ready data exploration and visualization platform. It provides an intuitive interface for exploring data, building interactive dashboards with 40+ chart types, and sharing insights, all through a web browser without writing code. Power users can write SQL for advanced analysis.

With 72.3K+ GitHub stars and an Apache-2.0 license, Superset is the most popular open-source BI platform under the Apache Software Foundation umbrella, used by Airbnb, Dropbox, Lyft, Netflix, and thousands of organizations worldwide.

## What Superset Does

- **40+ Chart Types**: Bar, line, area, scatter, pie, heatmap, map, treemap, sunburst, pivot table, and more
- **SQL Editor**: Full SQL IDE with autocomplete, saved queries, and result visualization
- **Dashboard Builder**: Drag-and-drop dashboard creation with cross-filtering and drill-down
- **Data Exploration**: No-code exploration with visual query builder for slicing and dicing data
- **30+ Database Connectors**: PostgreSQL, MySQL, ClickHouse, BigQuery, Snowflake, Redshift, Trino, Druid, etc.
- **Access Control**: Role-based access with row-level security and dataset permissions
- **Caching**: Query result caching with configurable TTL for dashboard performance
- **Alerts & Reports**: Scheduled reports and threshold-based alerts via email/Slack
- **Embedding**: Embed dashboards and charts into external applications

## Architecture

```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Browser     │────▶│  Superset    │────▶│  Your Data   │
│  Dashboards  │     │  Server      │     │  PostgreSQL  │
└──────────────┘     │  (Python/    │     │  ClickHouse  │
                     │   Flask)     │     │  BigQuery    │
                     └──────┬───────┘     │  Snowflake   │
                            │             │  30+ more    │
                ┌───────────┼───────────┐ └──────────────┘
                │           │           │
           ┌────┴────┐ ┌────┴────┐ ┌────┴────┐
           │  Redis  │ │ Celery  │ │Metadata │
           │ (Cache) │ │ (Async) │ │   DB    │
           └─────────┘ └─────────┘ └─────────┘
```

## Self-Hosting

### Docker Compose

```yaml
services:
  superset:
    image: apache/superset:latest
    ports:
      - "8088:8088"
    environment:
      SUPERSET_SECRET_KEY: your-long-secret-key
      # DATABASE_URL and REDIS_URL are assumed to be consumed by a custom
      # superset_config.py; they are not standard Superset settings.
      DATABASE_URL: postgresql+psycopg2://superset:superset@db:5432/superset
      REDIS_URL: redis://redis:6379/0
    depends_on:
      - db
      - redis

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
      POSTGRES_DB: superset
    volumes:
      - pg-data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine

volumes:
  pg-data:
```

## Key Features

### No-Code Exploration

```
1. Select dataset (table/view)
2. Choose chart type (bar, line, etc.)
3. Drag metrics (SUM, COUNT, AVG)
4. Add dimensions (group by)
5. Apply filters
6. Customize colors, labels, formatting
7. Save to dashboard
```

### SQL Lab (SQL IDE)

```sql
-- Full SQL editor with schema browser
SELECT
  DATE_TRUNC('month', order_date) AS month,
  product_category,
  SUM(revenue) AS total_revenue,
  COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY 1, 2
ORDER BY 1, 3 DESC

-- Results can be instantly visualized as any chart type
-- Save queries for reuse
-- Share with team members
```

### Dashboard Features

- **Cross-filtering**: Click on one chart to filter all others
- **Filter bar**: Global filters affecting all charts
- **Drill-down**: Click data points to explore deeper
- **Annotations**: Mark events on time-series charts
- **CSS customization**: Custom styling per dashboard
- **Auto-refresh**: Configurable refresh intervals

### Jinja Templating

```sql
-- Dynamic queries with user context
-- (requires the ENABLE_TEMPLATE_PROCESSING feature flag)
-- current_username() is a built-in macro; owner_username is a hypothetical column
SELECT *
FROM orders
WHERE owner_username = '{{ current_username() }}'
  AND order_date >= '{{ from_dttm }}'
  AND order_date <= '{{ to_dttm }}'
{% if filter_values('status') %}
  AND status IN ({{ "'" + "', '".join(filter_values('status')) + "'" }})
{% endif %}
```

## Superset vs Alternatives

| Feature | Superset | Metabase | Grafana | Looker | Tableau |
|---------|----------|----------|---------|--------|---------|
| Open Source | Yes (Apache-2.0) | Yes (AGPL) | Yes (AGPL) | No | No |
| Chart types | 40+ | 15+ | 20+ | LookML | Extensive |
| SQL IDE | Yes | Basic | No | SQL Runner | No |
| No-code explore | Yes | Yes | Limited | LookML | Yes |
| Databases | 30+ | 20+ | 100+ (metrics) | BigQuery focus | Any |
| Row-level security | Yes | Enterprise | No | Yes | Yes |
| Best for | Data teams | Business users | DevOps metrics | Enterprises | Enterprises |

## FAQ

**Q: How do I choose between Superset and Metabase?**
A: Metabase suits non-technical users better, since its visual query builder is friendlier. Superset suits data teams better: SQL Lab is more powerful, there are more chart types, and performance is better. If your team is used to writing SQL, choose Superset.

**Q: Can it handle big data?**
A: Superset does not store data itself, so performance depends on the underlying database. Paired with an OLAP engine such as ClickHouse, Druid, or Trino, it can query billions of rows. Superset's caching layer (Redis) can also noticeably speed up dashboard loading.

**Q: Is self-hosting difficult?**
A: Docker deployment is relatively simple. A production environment needs Redis (caching), Celery (async tasks), and PostgreSQL (metadata). The Apache project provides a detailed production deployment guide and a Helm chart.

## Sources & Credits

- GitHub: [apache/superset](https://github.com/apache/superset) | 72.3K+ ⭐ | Apache-2.0
- Website: [superset.apache.org](https://superset.apache.org)

---

Source: https://tokrepo.com/en/workflows/58b28055-34e1-11f1-9bc6-00163e2b0d79
Author: Script Depot
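
The production components named in the FAQ (Redis for caching, Celery for async tasks, PostgreSQL for metadata) are typically wired together in a `superset_config.py` file. The following is a minimal sketch, not an official configuration. It assumes the `DATABASE_URL` and `REDIS_URL` environment variables from the Docker Compose example are available in the container; those variable names, the fallback values, and the cache key prefixes are illustrative choices. The setting names themselves (`SECRET_KEY`, `SQLALCHEMY_DATABASE_URI`, `CACHE_CONFIG`, `DATA_CACHE_CONFIG`, `CELERY_CONFIG`) are standard Superset settings.

```python
import os

# Assumption: these environment variable names mirror the Docker Compose
# example; Superset does not read DATABASE_URL or REDIS_URL by itself.
DATABASE_URL = os.environ.get(
    "DATABASE_URL", "postgresql+psycopg2://superset:superset@db:5432/superset"
)
REDIS_URL = os.environ.get("REDIS_URL", "redis://redis:6379/0")

# Placeholder fallback for local experiments only; set a real secret in production.
SECRET_KEY = os.environ.get("SUPERSET_SECRET_KEY", "change-me-in-production")

# Metadata database (dashboards, charts, users), here PostgreSQL.
SQLALCHEMY_DATABASE_URI = DATABASE_URL

# Flask-Caching settings for Superset's metadata cache.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,  # TTL in seconds
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_URL": REDIS_URL,
}

# Cache for chart query results, kept apart by a different key prefix.
DATA_CACHE_CONFIG = {**CACHE_CONFIG, "CACHE_KEY_PREFIX": "superset_data_"}


class CeleryConfig:
    """Celery settings for async queries, using Redis as broker and backend."""

    broker_url = REDIS_URL
    result_backend = REDIS_URL
    imports = ("superset.sql_lab",)


CELERY_CONFIG = CeleryConfig
```

Superset loads this file when it is importable as `superset_config` (for example, placed on `PYTHONPATH`) or when the `SUPERSET_CONFIG_PATH` environment variable points to it. A separate Celery worker process is still needed for async tasks to run.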