
MosaicML Composer — Efficient Large-Scale Model Training

A PyTorch library that accelerates neural network training with algorithmic speedup methods, multi-GPU support, and production-ready training recipes.

Introduction

Composer is an open-source PyTorch training library by MosaicML (now Databricks) that makes it easy to apply efficiency methods like mixed precision, gradient accumulation, and algorithm-level speedups. It provides a Trainer abstraction that handles distributed training, checkpointing, and logging out of the box.
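To make the abstraction concrete, here is a minimal sketch of a Trainer run. It assumes the ComposerClassifier wrapper and a torchvision ResNet on CIFAR-10; the dataset path, batch size, and duration below are placeholders, not recommended settings.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

from composer import Trainer
from composer.models import ComposerClassifier

# Wrap a plain PyTorch module so Composer knows how to compute loss and metrics.
model = ComposerClassifier(module=models.resnet18(num_classes=10), num_classes=10)

train_dataset = datasets.CIFAR10(
    root="/tmp/cifar10", train=True, download=True, transform=transforms.ToTensor()
)
train_dataloader = DataLoader(train_dataset, batch_size=128, shuffle=True)

trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    max_duration="2ep",  # Composer time strings: epochs ("ep"), batches ("ba"), etc.
    device="gpu" if torch.cuda.is_available() else "cpu",
)
trainer.fit()
```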

What Composer Does

  • Provides a Trainer class that wraps PyTorch training with built-in best practices
  • Implements 25+ speed-up algorithms like BlurPool, CutMix, Label Smoothing, and Progressive Resizing
  • Supports multi-GPU and multi-node training with FSDP and DeepSpeed backends
  • Handles checkpointing, resumption, and logging to W&B, MLflow, or TensorBoard (see the logger sketch after this list)
  • Includes streaming dataset loading for training on cloud-hosted data
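
As a brief illustration of the logging integrations mentioned above, the sketch below attaches a Weights & Biases logger to the Trainer. It assumes the WandBLogger class from composer.loggers and reuses the model and dataloader from the first sketch; the project name is a placeholder, and MLflow or TensorBoard loggers can be attached the same way.

```python
from composer import Trainer
from composer.loggers import WandBLogger

trainer = Trainer(
    model=model,                        # the ComposerClassifier from the first sketch
    train_dataloader=train_dataloader,  # likewise
    max_duration="1ep",
    loggers=[WandBLogger(project="composer-demo")],  # experiment tracking in W&B
)
trainer.fit()
```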

Architecture Overview

Composer's Trainer manages the training loop through an event-based callback system. Speed-up algorithms are implemented as two-way callbacks that hook into events like BATCH_START or AFTER_LOSS and can modify the training state at those points. The Trainer orchestrates data loading, forward/backward passes, optimization, and checkpointing. Under the hood, it delegates distributed parallelism to PyTorch FSDP or DeepSpeed, abstracting away the complexity of multi-GPU coordination.
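
To show what hooking into an event looks like, here is a rough sketch of a custom callback that fires on the AFTER_LOSS event. It assumes the Callback, State, and Logger classes from composer.core and composer.loggers, reuses the model and dataloader from the first sketch, and uses a placeholder metric name.

```python
from composer import Trainer
from composer.core import Callback, State
from composer.loggers import Logger

class LossWatcher(Callback):
    """Logs the raw loss right after it is computed (the AFTER_LOSS event)."""

    def after_loss(self, state: State, logger: Logger) -> None:
        # state.loss holds the loss for the current batch once AFTER_LOSS fires.
        if state.loss is not None:
            logger.log_metrics({"loss/train_raw": float(state.loss)})

trainer = Trainer(
    model=model,                        # the ComposerClassifier from the first sketch
    train_dataloader=train_dataloader,
    max_duration="1ep",
    callbacks=[LossWatcher()],          # hooks into the same event system as the algorithms
)
trainer.fit()
```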

Self-Hosting & Configuration

  • Install: pip install mosaicml or with extras: pip install 'mosaicml[all]'
  • Define training runs via YAML configs or Python API
  • Launch distributed runs with the composer CLI launcher, e.g. composer -n 8 train.py for an 8-GPU run; YAML configs are passed as arguments to the training script
  • Configure cloud checkpointing to S3 or GCS for fault tolerance (combined with streaming in the sketch after this list)
  • Use Streaming datasets for efficient data loading from object storage
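
Putting the last two items together, here is a hedged sketch of streaming data from object storage while checkpointing to S3. It assumes the StreamingDataset class from the separate mosaicml-streaming package plus the Trainer's remote save_folder and autoresume options; the bucket paths, run name, and intervals are placeholders.

```python
from torch.utils.data import DataLoader
from streaming import StreamingDataset   # pip install mosaicml-streaming

from composer import Trainer

# Stream pre-converted shards from object storage, caching them locally.
train_dataset = StreamingDataset(
    remote="s3://my-bucket/datasets/train",  # placeholder remote path
    local="/tmp/streaming-cache",
    shuffle=True,
    batch_size=64,
)
train_dataloader = DataLoader(train_dataset, batch_size=64)

trainer = Trainer(
    model=model,                                       # any ComposerModel
    train_dataloader=train_dataloader,
    max_duration="1ep",
    run_name="resnet18-cifar10",                       # needed so autoresume can find checkpoints
    save_folder="s3://my-bucket/checkpoints/{run_name}",  # remote checkpoint folder
    save_interval="500ba",                             # checkpoint every 500 batches
    autoresume=True,                                   # resume from the latest checkpoint on restart
)
trainer.fit()
```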

Key Features

  • Composable speed-up algorithms that stack without code changes
  • YAML-based declarative training configuration
  • Built-in FSDP support for training large language models (see the sketch after this list)
  • Elastic fault-tolerant training with automatic checkpoint recovery
  • LLM fine-tuning recipes for MPT and other foundation models
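
For the FSDP item above, the sketch below shows the dict-style config that recent Composer releases accept on the Trainer. The exact parameter name and keys have shifted across versions, so treat this as an approximation and check the docs for your release.

```python
from composer import Trainer

trainer = Trainer(
    model=model,                        # a large ComposerModel, e.g. an LLM
    train_dataloader=train_dataloader,
    max_duration="1ep",
    device="gpu",
    fsdp_config={                       # assumed key names; verify against your Composer version
        "sharding_strategy": "FULL_SHARD",   # shard params, grads, and optimizer state
        "activation_checkpointing": True,    # trade recompute for memory
    },
)
```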

Comparison with Similar Tools

  • PyTorch Lightning — general training framework; Composer focuses on efficiency algorithms and LLM training
  • DeepSpeed — low-level distributed training library; Composer provides a higher-level Trainer interface
  • Hugging Face Trainer — specialized for transformers; Composer supports any PyTorch model architecture
  • Determined AI — platform with resource management; Composer is a pure training library

FAQ

Q: Can I use Composer with any PyTorch model? A: Yes. Wrap your model and data loaders in Composer's Trainer. No model architecture changes needed.
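
In practice, "no architecture changes" means implementing Composer's small ComposerModel interface around your existing module. A minimal sketch, assuming a toy MLP and a dataloader that yields (inputs, targets) tuples:

```python
import torch
import torch.nn.functional as F
from composer.models import ComposerModel

class MyWrappedModel(ComposerModel):
    """Wraps an unmodified PyTorch module behind Composer's forward/loss interface."""

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(784, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
        )

    def forward(self, batch):
        inputs, _ = batch          # batch is whatever your dataloader yields
        return self.net(inputs)

    def loss(self, outputs, batch, *args, **kwargs):
        _, targets = batch
        return F.cross_entropy(outputs, targets)
```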

Q: What kind of speedups can I expect? A: It depends on the model and dataset. Combining algorithms like MixUp, Progressive Resizing, and mixed precision has yielded speedups in roughly the 2-5x range on benchmark vision workloads, typically with little or no loss in final accuracy, but results vary by workload.
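
As an illustrative (not guaranteed) example of stacking methods, the sketch below combines several of the algorithms named in this article with automatic mixed precision. Constructor arguments are typical defaults rather than tuned values, and the realized speedup depends on your model and data.

```python
from composer import Trainer
from composer.algorithms import BlurPool, LabelSmoothing, MixUp, ProgressiveResizing

trainer = Trainer(
    model=model,                        # e.g. the ResNet classifier from the first sketch
    train_dataloader=train_dataloader,
    max_duration="10ep",
    precision="amp_fp16",               # automatic mixed precision
    algorithms=[
        BlurPool(),                     # anti-aliased downsampling for conv nets
        LabelSmoothing(smoothing=0.1),
        MixUp(alpha=0.2),
        ProgressiveResizing(),          # train on smaller images early, grow over time
    ],
)
trainer.fit()
```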

Q: Does Composer support RLHF or fine-tuning workflows? A: Yes. The LLM Foundry project built on Composer provides recipes for pre-training and fine-tuning large language models.

Q: Is multi-node training supported? A: Yes. Composer uses PyTorch's distributed launcher and supports multi-node FSDP and DeepSpeed configurations.
