What Kestra Does
Kestra handles workflow orchestration across multiple domains:
- Data Pipelines: Schedule and run ETL/ELT jobs with built-in support for dbt, Spark, Flink, and major databases
- Event-Driven Automation: React to file uploads, API webhooks, database changes, and message queue events in real time
- Infrastructure Orchestration: Manage CI/CD pipelines, Terraform runs, and Kubernetes deployments
- Business Process Automation: Automate approval workflows, notifications, and cross-system data synchronization
Architecture Overview
Kestra uses a pluggable architecture:
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│ YAML Flows  │────▶│ Kestra Server│────▶│   Workers   │
│  (Git/UI)   │     │  (Executor)  │     │ (Task Runs) │
└─────────────┘     └──────┬───────┘     └─────────────┘
                           │
                    ┌──────┴────────┐
                    │  Repository   │
                    │  (Postgres/   │
                    │ Elasticsearch)│
                    └───────────────┘

- Flows: Declarative YAML definitions with tasks, triggers, inputs, and outputs
- Namespaces: Organize flows into logical groups with shared variables and files
- Triggers: Schedule (cron), event-based (webhook, file detection, flow completion), or polling
- Task Runners: Execute tasks in Docker containers, Kubernetes pods, or cloud compute (AWS Batch, GCP, Azure)
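The concepts above can be sketched as a small flow. This is a minimal illustration, not a canonical example: the flow id, namespace, and webhook key are hypothetical placeholders, and it assumes the core `Log` task and `Webhook` trigger plugins that ship with Kestra.

```yaml
# Hypothetical flow: id, namespace, and key are illustrative placeholders.
id: webhook_demo
namespace: company.team            # namespace groups related flows

tasks:
  - id: log_payload
    type: io.kestra.plugin.core.log.Log
    # trigger.body holds the payload sent to the webhook
    message: "Webhook received with body: {{ trigger.body }}"

triggers:
  - id: on_webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: replace-with-a-long-random-key   # placeholder; this key becomes part of the webhook URL
```

Calling the flow's webhook URL with this key starts an execution. The exact URL path depends on the Kestra version, so check the API reference for your release.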
Key Features Deep Dive
Declarative YAML Flows
id: etl_pipeline
namespace: data.production

inputs:
  # Defaults to yesterday's date
  - id: date
    type: DATE
    defaults: "{{ now() | dateAdd(-1, 'DAYS') | date('yyyy-MM-dd') }}"

tasks:
  # Query the source database and store the result as a file
  - id: extract
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://source-db:5432/analytics
    sql: SELECT * FROM events WHERE date = '{{ inputs.date }}'
    store: true

  # Add a processing timestamp and write the result to the task's output directory
  - id: transform
    type: io.kestra.plugin.scripts.python.Script
    script: |
      import pandas as pd
      df = pd.read_csv('{{ outputs.extract.uri }}')
      df['processed_at'] = pd.Timestamp.now()
      df.to_csv('{{ outputDir }}/transformed.csv', index=False)

  # Load the transformed file into BigQuery
  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ outputs.transform.outputFiles['transformed.csv'] }}"
    destinationTable: project.dataset.events

triggers:
  # Run every day at 02:00
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 2 * * *"

Built-in Monitoring
Kestra provides real-time execution monitoring with Gantt charts, log streaming, topology views, and automatic failure alerts. The UI shows execution history, metrics, and allows manual replay of failed runs.
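Failure handling can also be declared in the flow itself. The sketch below assumes Kestra's flow-level `errors` block, which runs its tasks when an execution fails. The flow id and namespace are hypothetical, and the `Log` task stands in for whatever notification plugin you actually use.

```yaml
# Hypothetical flow: the first task fails on purpose to exercise the errors block.
id: alert_on_failure
namespace: company.team

tasks:
  - id: might_fail
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - exit 1                     # a non-zero exit code marks the task as failed

errors:
  # Tasks listed here run only when the execution fails
  - id: log_failure
    type: io.kestra.plugin.core.log.Log
    level: ERROR
    message: "Execution {{ execution.id }} of flow {{ flow.id }} failed"
```

Swapping the `Log` task for a notification task (Slack, email, and so on) turns this into an alert. The failed execution remains visible in the UI for replay.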
Plugin Ecosystem
500+ official plugins covering:
- Databases: PostgreSQL, MySQL, MongoDB, ClickHouse, Snowflake, BigQuery
- Cloud: AWS (S3, Lambda, ECS), GCP (GCS, Dataproc), Azure (Blob, Functions)
- Messaging: Kafka, RabbitMQ, MQTT, Redis, Pulsar
- Scripts: Python, Node.js, R, Shell, PowerShell in isolated containers
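As an illustration of the "isolated containers" point, a script task can name its own container image and install its dependencies before running. This is a sketch under assumptions: the flow id, namespace, and image tag are placeholders, and the `taskRunner`, `containerImage`, and `beforeCommands` properties follow recent Kestra releases, so older versions may use different names.

```yaml
# Hypothetical flow: runs a Python script inside its own Docker container.
id: python_in_container
namespace: company.team

tasks:
  - id: analyze
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:3.12-slim   # assumed image; any image with Python and pip works
    beforeCommands:
      - pip install pandas             # dependencies are installed inside the container
    script: |
      import pandas as pd
      df = pd.DataFrame({"value": [1, 2, 3]})
      print(df["value"].sum())
```

Because dependencies live in the container, the host running the Kestra worker needs only Docker, not a Python environment.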
Self-Hosting Guide
Docker Compose (Production)
# docker-compose.yml
services:
  kestra:
    image: kestra/kestra:latest
    command: server standalone
    ports:
      - "8080:8080"
    volumes:
      - kestra-data:/app/storage
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driverClassName: org.postgresql.Driver
            username: kestra
            password: kestra
        kestra:
          repository:
            type: postgres
          queue:
            type: postgres
          storage:
            type: local
            local:
              base-path: /app/storage

  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: kestra
    volumes:
      - pg-data:/var/lib/postgresql/data

volumes:
  kestra-data:
  pg-data:

Kubernetes with Helm
helm repo add kestra https://helm.kestra.io/
helm install kestra kestra/kestra --namespace kestra --create-namespace

Kestra vs Alternatives
| Feature | Kestra | Airflow | Prefect | Temporal |
|---|---|---|---|---|
| Configuration | YAML declarative | Python DAGs | Python decorators | Code-based |
| UI | Built-in visual editor | Web UI | Cloud dashboard | Web UI |
| Event-driven | Native | Limited | Yes | Yes |
| Learning curve | Low (YAML) | Medium (Python) | Medium | High |
| Plugin system | 500+ plugins | Operators | Integrations | Activities |
| Scaling | Horizontal | Celery/K8s | Cloud/K8s | Worker pools |
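FAQ
Q: What team size is Kestra suited for? A: Anyone from individual developers to enterprise teams. A single-node Docker deployment fits small-scale use, while a Kubernetes deployment supports millions of executions per day.
Q: Do I need to know Java to use it? A: No. Flows are defined in YAML, and script tasks support Python, Node.js, Shell, and more. Java is only required for developing custom plugins.
Q: What advantages does it have over Airflow? A: Kestra's declarative YAML approach is easier to pick up than Airflow's Python DAGs. It also supports event-driven workflows natively, includes a built-in visual editor, and lets you deploy new flows without a restart.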
Sources and Credits
- GitHub: kestra-io/kestra (26.7K+ ⭐, Apache-2.0)
- Website: kestra.io