Skills · May 11, 2026 · 3 min read

MosaicML Composer — Efficient Large-Scale Model Training

A PyTorch library that accelerates neural network training with algorithmic speedup methods, multi-GPU support, and production-ready training recipes.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and the raw content to help agents assess fit, risk, and next actions.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Established
Entry point: Composer Overview
Universal CLI command: npx tokrepo install d87368f0-4cd0-11f1-9bc6-00163e2b0d79

Introduction

Composer is an open-source PyTorch training library by MosaicML (now Databricks) that makes it easy to apply efficiency methods like mixed precision, gradient accumulation, and algorithm-level speedups. It provides a Trainer abstraction that handles distributed training, checkpointing, and logging out of the box.
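
Gradient accumulation, one of the efficiency methods named above, can be illustrated without any framework. The sketch below is a toy with a single weight and a mean squared-error loss; the function names are hypothetical and it is not Composer code. It assumes the batch splits evenly into micro-batches, in which case the accumulated gradient equals the full-batch gradient.

```python
def grad_mean_squared_error(w, batch):
    """Gradient with respect to w of mean((w * x - y) ** 2) over (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, num_microbatches):
    """Accumulate gradients over equally sized micro-batches.

    Assumes len(batch) is divisible by num_microbatches.
    """
    size = len(batch) // num_microbatches
    total = 0.0
    for i in range(num_microbatches):
        micro = batch[i * size:(i + 1) * size]
        # Divide by the number of micro-batches so the sum is a mean, not a sum.
        total += grad_mean_squared_error(w, micro) / num_microbatches
    return total


batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]
full = grad_mean_squared_error(0.5, batch)
accum = accumulated_grad(0.5, batch, num_microbatches=2)
print(full, accum)
```

The two values agree, which is why accumulation lets a memory-limited device emulate a larger batch size at the cost of more forward and backward passes per optimizer step.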

What Composer Does

  • Provides a Trainer class that wraps PyTorch training with built-in best practices
  • Implements 25+ speed-up algorithms like BlurPool, CutMix, Label Smoothing, and Progressive Resizing
  • Supports multi-GPU and multi-node training with FSDP and DeepSpeed backends
  • Handles checkpointing, resumption, and logging to W&B, MLflow, or TensorBoard
  • Includes streaming dataset loading for training on cloud-hosted data
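
To make one of the listed algorithms concrete, here is a minimal sketch of label smoothing as a target transform. It uses the common formulation that mixes the one-hot target with a uniform distribution; the function name and default value are illustrative and not taken from Composer's source.

```python
def smooth_labels(class_index, num_classes, smoothing=0.1):
    """Return a smoothed target distribution for a single example.

    Each class receives smoothing / num_classes, and the true class
    additionally keeps (1 - smoothing) of the probability mass.
    """
    uniform = smoothing / num_classes
    return [
        (1.0 - smoothing) * (1.0 if k == class_index else 0.0) + uniform
        for k in range(num_classes)
    ]


targets = smooth_labels(class_index=2, num_classes=4, smoothing=0.1)
print(targets)
```

The result is still a probability distribution, so it can be used with any loss that accepts soft targets.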

Architecture Overview

Composer's Trainer runs the training loop as a sequence of named events. Each speed-up algorithm is an Algorithm object that declares which events it responds to, such as BATCH_START or AFTER_LOSS, and modifies the training state when they fire; callbacks use the same events for non-intrusive tasks such as logging. The Trainer orchestrates data loading, forward and backward passes, optimization, and checkpointing. It delegates distributed parallelism to PyTorch FSDP or DeepSpeed, hiding most of the multi-GPU coordination.
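
The event-driven design described above can be sketched as a small, framework-free toy. Every class here (Event, State, Engine, ScaleLoss) is a simplified stand-in written for illustration, not Composer's implementation, and the "loss" is just the mean of the batch values.

```python
from enum import Enum, auto


class Event(Enum):
    BATCH_START = auto()
    AFTER_LOSS = auto()
    BATCH_END = auto()


class State:
    """Mutable training state shared between the loop and the algorithms."""

    def __init__(self):
        self.batch = None
        self.loss = None


class ScaleLoss:
    """Hypothetical algorithm that halves the loss once it has been computed."""

    def match(self, event, state):
        return event is Event.AFTER_LOSS

    def apply(self, event, state):
        state.loss *= 0.5


class Engine:
    """Dispatches each event to every algorithm that matches it."""

    def __init__(self, algorithms):
        self.algorithms = algorithms
        self.trace = []

    def run_event(self, event, state):
        self.trace.append(event.name)
        for algorithm in self.algorithms:
            if algorithm.match(event, state):
                algorithm.apply(event, state)


def train(batches, engine):
    state = State()
    losses = []
    for batch in batches:
        state.batch = batch
        engine.run_event(Event.BATCH_START, state)
        state.loss = sum(batch) / len(batch)  # stand-in for forward pass and loss
        engine.run_event(Event.AFTER_LOSS, state)
        losses.append(state.loss)
        engine.run_event(Event.BATCH_END, state)
    return losses


engine = Engine([ScaleLoss()])
losses = train([[2.0, 4.0], [6.0, 10.0]], engine)
print(losses)
print(engine.trace)
```

Because the loop only fires events and never calls an algorithm by name, adding a second algorithm means appending it to the list passed to Engine, with no change to train.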

Self-Hosting & Configuration

  • Install: pip install mosaicml, or with all optional extras: pip install 'mosaicml[all]'
  • Define training runs through the Python API; YAML configs are used by the LLM Foundry recipes built on Composer
  • Launch distributed training: composer -n 8 train.py for an 8-GPU run
  • Configure cloud checkpointing to S3 or GCS for fault tolerance
  • Use Streaming datasets for efficient data loading from object storage
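
A minimal sketch of what a train.py for the steps above might contain, assuming mosaicml and torch are installed. The synthetic data, layer sizes, and hyperparameters are placeholders chosen for illustration, and the imports sit inside the function so the file can be loaded even where the packages are missing.

```python
def build_trainer():
    """Build a small CPU-only Composer run on synthetic data."""
    # Lazy imports: these require `pip install mosaicml`, which also pulls in torch.
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from composer import Trainer
    from composer.algorithms import LabelSmoothing
    from composer.models import ComposerClassifier

    # Synthetic dataset: 64 samples, 8 features, 2 classes.
    features = torch.randn(64, 8)
    labels = torch.randint(0, 2, (64,))
    train_loader = DataLoader(TensorDataset(features, labels), batch_size=16)

    # ComposerClassifier wraps a plain torch module with a loss and metrics.
    model = ComposerClassifier(module=torch.nn.Linear(8, 2), num_classes=2)

    return Trainer(
        model=model,
        train_dataloader=train_loader,
        optimizers=torch.optim.SGD(model.parameters(), lr=0.01),
        max_duration="1ep",                          # train for one epoch
        algorithms=[LabelSmoothing(smoothing=0.1)],  # one speed-up method
        device="cpu",
    )
```

Calling build_trainer().fit() at the bottom of the script runs one epoch. The same script can then be started under the composer launcher for multi-GPU runs, with the device argument adjusted to match the hardware.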

Key Features

  • Composable speed-up algorithms that stack without code changes
  • Declarative YAML training configuration through the LLM Foundry recipes
  • Built-in FSDP support for training large language models
  • Fault-tolerant training that automatically resumes from the latest checkpoint
  • LLM fine-tuning recipes for MPT and other foundation models

Comparison with Similar Tools

  • PyTorch Lightning — general training framework; Composer focuses on efficiency algorithms and LLM training
  • DeepSpeed — low-level distributed training library; Composer provides a higher-level Trainer interface
  • Hugging Face Trainer — specialized for transformers; Composer supports any PyTorch model architecture
  • Determined AI — platform with resource management; Composer is a pure training library

FAQ

Q: Can I use Composer with any PyTorch model? A: Yes. Wrap the model in a ComposerModel, which defines the forward pass and the loss, then pass it to the Trainer with your data loaders. The architecture itself does not need to change.

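Q: What kind of speedups can I expect? A: Results depend on the model, dataset, and hardware. Combining methods such as MixUp, Progressive Resizing, and mixed precision is reported to yield roughly 2-5x faster training at comparable accuracy, so benchmark on your own workload before relying on that range.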
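
As a rough back-of-the-envelope illustration of where such savings come from, the sketch below estimates the compute saved by progressive resizing alone. It assumes per-step cost is proportional to pixel count (only approximately true for real networks) and uses a simple linear ramp; the schedule, parameter names, and default values are illustrative assumptions, not Composer's exact algorithm.

```python
def resize_scale(step, total_steps, initial_scale=0.5, ramp_fraction=0.8):
    """Image scale at a given step: linear ramp to 1.0, then full size."""
    ramp_steps = ramp_fraction * total_steps
    if step >= ramp_steps:
        return 1.0
    return initial_scale + (1.0 - initial_scale) * step / ramp_steps


def relative_cost(total_steps, **schedule):
    """Mean per-step cost relative to full-size training.

    Cost per step is modeled as scale ** 2, i.e. proportional to pixel count.
    """
    costs = (resize_scale(s, total_steps, **schedule) ** 2 for s in range(total_steps))
    return sum(costs) / total_steps


cost = relative_cost(1000)
speedup = 1.0 / cost
print(f"relative cost {cost:.2f}, estimated speedup {speedup:.2f}x")
```

Under these assumptions the relative cost comes out near two thirds, or about a 1.5x speedup from this one method, which is why larger gains rely on stacking several methods together.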

Q: Does Composer support RLHF or fine-tuning workflows? A: Fine-tuning is covered: the LLM Foundry project, built on Composer, provides recipes for pre-training and fine-tuning large language models. RLHF is not built into Composer's Trainer and would need additional tooling.

Q: Is multi-node training supported? A: Yes. Composer's composer launcher builds on PyTorch's distributed runtime and supports multi-node FSDP and DeepSpeed configurations.
