Configs · April 10, 2026 · 1 min read

Kestra — Event-Driven Orchestration & Scheduling Platform

Kestra is an open-source orchestration platform for scheduling and running complex data pipelines, ETL jobs, and automation workflows with declarative YAML.

AI Open Source · Community
Quick Start

Try it first, then decide whether to dig deeper


docker run --pull=always --rm -it -p 8080:8080 -p 8081:8081 kestra/kestra:latest server local

Open http://localhost:8080 — create your first flow in the built-in editor.
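
As a starting point, a first flow could look like the minimal sketch below. The flow id `hello_world` and the namespace `company.team` are placeholder names chosen for illustration, and the task type assumes a recent release in which core tasks live under `io.kestra.plugin.core`.

```yaml
id: hello_world          # hypothetical flow id
namespace: company.team  # hypothetical namespace

tasks:
  # A single task that writes one line to the execution log
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from Kestra!
```

Saving this in the editor and running it from the UI should produce one execution with a single log line.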

Introduction

Kestra is an open-source event-driven orchestration and scheduling platform designed for mission-critical data pipelines and automation workflows. Built with Java and featuring a declarative YAML-based approach, it makes complex workflow orchestration accessible to both engineers and non-technical users.

With 26.7K+ GitHub stars and an Apache-2.0 license, Kestra combines code-based orchestration with a visual low-code interface and supports 500+ plugins for integrations with databases, cloud services, messaging systems, and more.

What Kestra Does

Kestra handles workflow orchestration across multiple domains:

  • Data Pipelines: Schedule and run ETL/ELT jobs with built-in support for dbt, Spark, Flink, and major databases
  • Event-Driven Automation: React to file uploads, API webhooks, database changes, and message queue events in real time
  • Infrastructure Orchestration: Manage CI/CD pipelines, Terraform runs, and Kubernetes deployments
  • Business Process Automation: Automate approval workflows, notifications, and cross-system data synchronization
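
The event-driven item above can be sketched as a flow that starts whenever an HTTP request hits a webhook. This is a minimal illustration, not a production setup: the flow id, namespace, and key are placeholders, and the type names assume a recent release with core plugins under `io.kestra.plugin.core`.

```yaml
id: on_webhook            # hypothetical flow id
namespace: company.team   # hypothetical namespace

tasks:
  # Log whatever payload the caller sent
  - id: log_payload
    type: io.kestra.plugin.core.log.Log
    message: "Received: {{ trigger.body }}"

triggers:
  - id: incoming
    type: io.kestra.plugin.core.trigger.Webhook
    key: replace-with-a-random-key  # placeholder; becomes part of the webhook URL
```

The webhook URL generally has the form `/api/v1/executions/webhook/{namespace}/{flowId}/{key}`. Some releases insert a tenant segment after `/api/v1`, so it is worth checking the exact URL shown for the trigger in the UI.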

Architecture Overview

Kestra uses a pluggable architecture:

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  YAML Flows │────▶│ Kestra Server│────▶│   Workers   │
│  (Git/UI)   │     │  (Executor)  │     │ (Task Runs) │
└─────────────┘     └──────┬───────┘     └─────────────┘
                           │
                   ┌───────┴────────┐
                   │   Repository   │
                   │   (Postgres/   │
                   │ Elasticsearch) │
                   └────────────────┘

  • Flows: Declarative YAML definitions with tasks, triggers, inputs, and outputs
  • Namespaces: Organize flows into logical groups with shared variables and files
  • Triggers: Schedule (cron), event-based (webhook, file detection, flow completion), or polling
  • Task Runners: Execute tasks in Docker containers, Kubernetes pods, or cloud compute (AWS Batch, GCP, Azure)
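
To make the task runner concept concrete, the sketch below runs a Python script inside a Docker container. The flow id, namespace, and image tag are illustrative choices, and the runner type name assumes a recent release of the scripts plugin.

```yaml
id: containerized_script  # hypothetical flow id
namespace: company.team   # hypothetical namespace

tasks:
  - id: report
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.11-slim  # image the script runs in
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    script: |
      import platform
      print(f"Running on Python {platform.python_version()}")
```

The design idea is that the task definition stays the same while the `taskRunner` block decides where the code runs, so moving from local Docker to another backend should mostly be a matter of swapping that block.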

Key Features Deep Dive

Declarative YAML Flows

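id: etl_pipeline
namespace: data.production

inputs:
  - id: date
    type: DATE
    # Defaults to yesterday's date
    defaults: "{{ now() | dateAdd(-1, 'DAYS') | date('yyyy-MM-dd') }}"

tasks:
  # Extract: write the result set to Kestra's internal storage (Ion format).
  # Database credentials are omitted for brevity.
  - id: extract
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://source-db:5432/analytics
    sql: SELECT * FROM events WHERE date = '{{ inputs.date }}'
    store: true  # newer releases express this as fetchType: STORE

  # Convert the stored Ion file to CSV so pandas can read it
  - id: to_csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{ outputs.extract.uri }}"

  # Transform: add a processing timestamp
  - id: transform
    type: io.kestra.plugin.scripts.python.Script
    beforeCommands:
      - pip install pandas
    outputFiles:
      - transformed.csv  # captured so the load task can reference it
    script: |
      import pandas as pd
      df = pd.read_csv('{{ outputs.to_csv.uri }}')
      df['processed_at'] = pd.Timestamp.now()
      df.to_csv('transformed.csv', index=False)

  # Load: push the transformed file into BigQuery
  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ outputs.transform.outputFiles['transformed.csv'] }}"
    destinationTable: project.dataset.events
    format: CSV
    autodetect: true

triggers:
  # Run every day at 02:00
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 2 * * *"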

Built-in Monitoring

Kestra provides real-time execution monitoring with Gantt charts, log streaming, topology views, and automatic failure alerts. The UI shows execution history and metrics, and lets you manually replay failed runs.
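
One way to wire up a failure alert is a flow-level `errors` block, which runs when any task in the flow fails. The sketch below is illustrative only: the flow id and namespace are placeholders, the `Fail` task simply forces an error for demonstration, and the alert is a plain log line where a real setup would more likely use a notification plugin task.

```yaml
id: monitored_flow       # hypothetical flow id
namespace: company.team  # hypothetical namespace

tasks:
  # Force a failure so the errors block has something to react to
  - id: might_fail
    type: io.kestra.plugin.core.execution.Fail

errors:
  # Runs only when a task above fails
  - id: log_failure
    type: io.kestra.plugin.core.log.Log
    level: ERROR
    message: "Execution {{ execution.id }} of {{ flow.namespace }}.{{ flow.id }} failed"
```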

Plugin Ecosystem

500+ official plugins covering:

  • Databases: PostgreSQL, MySQL, MongoDB, ClickHouse, Snowflake, BigQuery
  • Cloud: AWS (S3, Lambda, ECS), GCP (GCS, Dataproc), Azure (Blob, Functions)
  • Messaging: Kafka, RabbitMQ, MQTT, Redis, Pulsar
  • Scripts: Python, Node.js, R, Shell, PowerShell in isolated containers

Self-Hosting Guide

Docker Compose (Production)

# docker-compose.yml
services:
  kestra:
    image: kestra/kestra:latest  # pin a specific version tag for production
    command: server standalone
    depends_on:
      - postgres  # start the database before Kestra
    ports:
      - "8080:8080"
    volumes:
      - kestra-data:/app/storage
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driverClassName: org.postgresql.Driver
            username: kestra
            password: kestra  # example only; replace before deploying
        kestra:
          repository:
            type: postgres
          queue:
            type: postgres
          storage:
            type: local
            local:
              base-path: /app/storage

  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: kestra  # must match the datasource password above
    volumes:
      - pg-data:/var/lib/postgresql/data

volumes:
  kestra-data:
  pg-data:

Kubernetes with Helm

helm repo add kestra https://helm.kestra.io/
helm install kestra kestra/kestra --namespace kestra --create-namespace

Kestra vs Alternatives

Feature        | Kestra                 | Airflow         | Prefect           | Temporal
---------------|------------------------|-----------------|-------------------|-------------
Configuration  | YAML declarative       | Python DAGs     | Python decorators | Code-based
UI             | Built-in visual editor | Web UI          | Cloud dashboard   | Web UI
Event-driven   | Native                 | Limited         | Yes               | Yes
Learning curve | Low (YAML)             | Medium (Python) | Medium            | High
Plugin system  | 500+ plugins           | Operators       | Integrations      | Activities
Scaling        | Horizontal             | Celery/K8s      | Cloud/K8s         | Worker pools

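FAQ

Q: What team size is Kestra suited for?
A: Anything from individual developers to enterprise teams. A single-node Docker setup suits small-scale use, while a Kubernetes deployment can support millions of executions per day.

Q: Do I need to know Java to use it?
A: No. Flows are defined in YAML, and script tasks support Python, Node.js, Shell, and others. Java is only needed for developing custom plugins.

Q: What advantages does it have over Airflow?
A: Kestra's declarative YAML approach is easier to pick up than Airflow's Python DAGs. It supports event-driven workflows natively, ships with a built-in visual editor, and lets you deploy new flows without a restart.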
