Skills · Apr 24, 2026 · 3 min read

Metaflow — Human-Friendly ML Workflow Framework by Netflix

Metaflow is a Python framework from Netflix for building and managing real-life data science and ML projects, handling compute, data versioning, and orchestration with minimal boilerplate.

Agent ready

Ready-to-run agent install

This asset can be installed after the agent chooses its runtime, checks the plan, and runs the matching command.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: Established
Entrypoint: Metaflow Overview
Direct install command:
npx -y tokrepo@latest install 11f68356-3fdb-11f1-9bc6-00163e2b0d79 --target codex

Run after dry-run confirms the install plan.

Introduction

Metaflow was built at Netflix to let data scientists write production ML pipelines using regular Python. It manages infrastructure concerns—versioning, compute scaling, dependency management—behind a simple decorator-based API, so teams can focus on modeling rather than plumbing.

What Metaflow Does

  • Structures ML projects as flows with steps connected by a DAG
  • Automatically versions every run's data, code, and dependencies
  • Scales individual steps to cloud compute (AWS Batch, Kubernetes) with a single decorator
  • Provides a built-in client for inspecting past runs and retrieving artifacts
  • Supports branching and joining for parallel workloads within a flow

Architecture Overview

A Metaflow flow is a Python class where each method decorated with @step becomes a node in a DAG. When executed, the runtime snapshots code, data artifacts, and environment metadata for each step. Steps can be dispatched to local processes, AWS Batch, or Kubernetes. A metadata service tracks all runs, and a datastore (S3 or local filesystem) persists artifacts so any past result can be retrieved programmatically.

Self-Hosting & Configuration

  • Install from PyPI for local execution with no extra infrastructure
  • Configure AWS integration by running metaflow configure aws for S3 and Batch
  • Deploy the metadata service for team-wide run tracking and artifact sharing
  • Use the @conda or @pypi decorators to pin dependencies per step
  • Integrate with Argo Workflows or AWS Step Functions for production scheduling

Key Features

  • Decorator-based API keeps flow definitions in plain Python without YAML or config files
  • Automatic data versioning lets you inspect or compare any historical run
  • @resources decorator requests specific CPU, memory, or GPU for individual steps; the request applies when the step runs on a remote backend such as AWS Batch or Kubernetes
  • Fan-out with foreach enables parallel processing across data partitions
  • Built-in resume restarts a failed run at the step that failed, reusing the results of steps that already succeeded

Comparison with Similar Tools

  • Prefect — Python workflow engine; more general-purpose, less ML-specific artifact management
  • Dagster — asset-centric orchestrator; stronger typing but heavier abstraction layer
  • Kedro — pipeline framework for data science; more opinionated project structure
  • Airflow — DAG scheduler for batch jobs; requires more infrastructure and is less Python-native

FAQ

Q: Do I need AWS to use Metaflow? A: No. Metaflow runs fully locally. AWS and Kubernetes integrations are optional for scaling.

Q: How does data versioning work? A: Every step's output artifacts are automatically persisted and tagged with the run ID. You can retrieve any artifact from any past run via the client API.

Q: Can I schedule flows for recurring execution? A: Yes. Integrate with Argo Workflows, AWS Step Functions, or any cron-based scheduler to trigger flows on a schedule.

Q: Does Metaflow handle GPU workloads? A: Yes. Use the @resources(gpu=1) decorator to request a GPU for specific steps when they run on AWS Batch or Kubernetes.
