May 3, 2026 · 3 min read

Cadence — Distributed Workflow Execution Engine by Uber

Cadence is a distributed, scalable, fault-tolerant workflow orchestration engine developed by Uber for executing long-running business logic as durable, stateful workflows that survive process and infrastructure failures.

Introduction

Building reliable distributed systems is hard because processes crash, networks partition, and services restart. Cadence lets developers write complex business logic as straightforward code while the engine handles retries, state persistence, and failure recovery automatically. Originally built at Uber to power critical services, it treats workflows as durable functions that run to completion regardless of infrastructure failures.

What Cadence Does

  • Executes long-running workflows that persist state across process restarts and failures
  • Provides automatic retry, timeout, and error handling for individual workflow steps (activities)
  • Supports child workflows, signals, queries, and timers for complex orchestration patterns
  • Scales horizontally to handle millions of concurrent workflow executions
  • Offers SDKs for Go and Java with workflow and activity worker frameworks

Architecture Overview

Cadence consists of a server cluster and client workers. The server stores workflow state in a persistence layer (Cassandra, MySQL, or PostgreSQL) and manages task lists, the queues that dispatch work to workflow and activity workers. As a workflow function executes, each decision it makes (scheduling an activity, starting a timer, launching a child workflow) is recorded as an event in the workflow's history. If a worker crashes, another worker replays that history to reconstruct the workflow's state and continues from where it left off, without re-running steps that already completed. This event-sourcing approach is what makes workflows durable. The server also supports multi-cluster replication for disaster recovery and global workflows.
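The replay mechanism can be illustrated with a small, self-contained Python model. This is a conceptual sketch only, not the Cadence SDK: the names History, Context, execute_activity, and order_workflow are hypothetical, and the real engine keeps the history on the server rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class History:
    """Append-only log of completed steps, standing in for a workflow's event history."""
    events: List[Dict[str, Any]] = field(default_factory=list)


class WorkerCrash(Exception):
    """Simulates a worker process dying in the middle of a workflow."""


class Context:
    """Hypothetical workflow context: replays recorded results, runs only new steps."""

    def __init__(self, history: History, crash_after: int = -1):
        self.history = history
        self.position = 0               # index of the next step in this run
        self.crash_after = crash_after  # crash once this many activities have run
        self.executed: List[str] = []   # activities actually run by this worker

    def execute_activity(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        if self.position < len(self.history.events):
            # Replay: this step already completed, so reuse its recorded result.
            event = self.history.events[self.position]
            if event["activity"] != name:
                raise RuntimeError("workflow code diverged from recorded history")
            result = event["result"]
        else:
            if len(self.executed) == self.crash_after:
                raise WorkerCrash(name)
            # New step: run the activity and record its result in the history.
            result = fn(*args)
            self.executed.append(name)
            self.history.events.append({"activity": name, "result": result})
        self.position += 1
        return result


def reserve(order_id: str) -> str:
    return f"reservation-{order_id}"


def charge(reservation: str) -> str:
    return f"charge-for-{reservation}"


def ship(payment: str) -> str:
    return f"shipment-for-{payment}"


def order_workflow(ctx: Context, order_id: str) -> str:
    reservation = ctx.execute_activity("reserve", reserve, order_id)
    payment = ctx.execute_activity("charge", charge, reservation)
    return ctx.execute_activity("ship", ship, payment)


history = History()

# The first worker completes two activities, then crashes before shipping.
first_worker = Context(history, crash_after=2)
try:
    order_workflow(first_worker, "42")
except WorkerCrash:
    pass

# A second worker replays the shared history and runs only the missing step.
second_worker = Context(history)
result = order_workflow(second_worker, "42")
```

In this model the second worker never calls reserve or charge again; it reads their results from the history and executes only ship, which is the behavior the paragraph above describes.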

Self-Hosting & Configuration

  • Deploy the Cadence server using the provided Docker Compose file for local development
  • Configure the persistence backend in config/development.yaml (Cassandra, MySQL, or PostgreSQL)
  • Register domains to isolate different applications or environments
  • Run workflow and activity workers as separate processes that connect to the Cadence server
  • Set up Cadence Web for a visual dashboard showing workflow status and history

Key Features

  • Event-sourced workflow state survives any number of process, host, or data center failures
  • Activity retry policies with configurable backoff handle transient failures automatically
  • Visibility queries let you search and filter running workflows by custom attributes
  • Multi-cluster replication enables active-active and failover deployment topologies
  • Cron workflows schedule recurring executions with built-in deduplication
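A retry policy with configurable backoff can be sketched as follows. This is an illustrative Python model under stated assumptions, not Cadence's implementation: the RetryPolicy fields and the run_with_retry helper are hypothetical names, and the sleep function is injected so the example runs without actually waiting.

```python
from dataclasses import dataclass
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical retry settings; field names are assumptions for illustration."""
    initial_interval: float = 1.0     # seconds before the first retry
    backoff_coefficient: float = 2.0  # multiplier applied after each failure
    maximum_interval: float = 10.0    # upper bound on any single delay
    maximum_attempts: int = 5         # total attempts, including the first

    def delay_before_retry(self, attempt: int) -> float:
        """Delay after failed attempt number `attempt` (1-based), capped at maximum_interval."""
        delay = self.initial_interval * self.backoff_coefficient ** (attempt - 1)
        return min(delay, self.maximum_interval)


def run_with_retry(activity: Callable[[], T], policy: RetryPolicy,
                   sleep: Callable[[float], None]) -> T:
    """Call `activity` until it succeeds or the attempt budget is exhausted."""
    for attempt in range(1, policy.maximum_attempts + 1):
        try:
            return activity()
        except Exception:
            if attempt == policy.maximum_attempts:
                raise  # out of attempts: surface the last failure
            sleep(policy.delay_before_retry(attempt))
    raise ValueError("maximum_attempts must be at least 1")


calls = {"count": 0}


def flaky_activity() -> str:
    """Fails three times with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 4:
        raise ConnectionError("transient failure")
    return "ok"


delays: List[float] = []
result = run_with_retry(flaky_activity, RetryPolicy(), sleep=delays.append)
```

Here the delays double after each failure (1, 2, then 4 seconds) until the call succeeds on the fourth attempt. A production worker would pass time.sleep or a durable timer instead of delays.append.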

Comparison with Similar Tools

  • Temporal: Fork of Cadence with namespace-based multi-tenancy; Cadence uses domain-based isolation and remains actively maintained by Uber
  • Apache Airflow: DAG-based batch scheduler; Cadence handles event-driven, long-running workflows with sub-second latency
  • Step Functions: AWS-managed service with JSON state machines; Cadence lets you write workflows as regular code with full language features
  • Prefect: Python-focused orchestrator; Cadence targets Go and Java and persists every workflow step in an event history
  • Conductor: Netflix workflow engine using a JSON DSL; Cadence uses native code SDKs for a more natural developer experience

FAQ

Q: What is the difference between Cadence and Temporal? A: Temporal was forked from Cadence by some of its original creators. Both share the same core concepts. Cadence continues to be developed and used in production at Uber.

Q: What databases does Cadence support for persistence? A: Cadence supports Apache Cassandra, MySQL, and PostgreSQL as persistence backends.

Q: Can Cadence handle millions of workflows? A: Yes. Cadence is designed for horizontal scalability and handles millions of concurrent workflow executions in production at Uber.

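Q: Do I need to write idempotent workflow code? A: Workflow code must be deterministic rather than idempotent: replaying the same inputs and event history must produce the same decisions. Activities can be non-deterministic, but because they may be retried, they should be idempotent where possible. The SDK guides you toward correct patterns.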
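The determinism requirement can be demonstrated with a small Python model. This is a conceptual sketch, not the Cadence SDK: ReplayChecker, NonDeterminismError, and both workflow functions are hypothetical, and the flips iterator stands in for any non-deterministic source such as a random number or the wall clock.

```python
from typing import Iterator, List


class NonDeterminismError(Exception):
    """Raised when a replayed workflow makes a decision that differs from its history."""


class ReplayChecker:
    """Hypothetical context that records decisions on first run and verifies them on replay."""

    def __init__(self, recorded: List[str]):
        self.recorded = recorded
        self.position = 0

    def schedule(self, activity: str) -> None:
        if self.position < len(self.recorded):
            expected = self.recorded[self.position]
            if expected != activity:
                raise NonDeterminismError(
                    f"history says {expected!r}, workflow code asked for {activity!r}"
                )
        else:
            self.recorded.append(activity)
        self.position += 1


def deterministic_workflow(ctx: ReplayChecker, amount: int) -> None:
    # Branches only on workflow input, so every replay takes the same path.
    ctx.schedule("charge")
    ctx.schedule("manual_review" if amount > 1000 else "auto_approve")


def nondeterministic_workflow(ctx: ReplayChecker, flips: Iterator[bool]) -> None:
    # Branches on a value that can change between runs, so replay may diverge.
    ctx.schedule("charge")
    ctx.schedule("send_email" if next(flips) else "send_sms")


# The first run records the decisions; the second run replays them without error.
history: List[str] = []
deterministic_workflow(ReplayChecker(history), 1500)
deterministic_workflow(ReplayChecker(history), 1500)

# The coin flip differs between the original run and the replay.
flips = iter([True, False])
bad_history: List[str] = []
nondeterministic_workflow(ReplayChecker(bad_history), flips)
try:
    nondeterministic_workflow(ReplayChecker(bad_history), flips)
    diverged = False
except NonDeterminismError:
    diverged = True
```

The second workflow fails on replay because its branch depends on a value that is not part of the recorded history. Moving such logic into an activity, whose result is recorded, avoids the problem.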
