
Flax — Neural Network Library for JAX

A high-performance neural network library built on JAX, providing a flexible module system used extensively across Google DeepMind and the JAX research community.

Introduction

Flax is Google's neural network library for JAX. It provides a Pythonic module system (Linen) for defining models while preserving JAX's functional programming model. Flax is the foundation for many large-scale research projects at Google DeepMind, including vision transformers and large language models.

What Flax Does

  • Defines neural network modules with the Linen API using the @nn.compact decorator (sketched after this list)
  • Manages model parameters as immutable pytrees compatible with JAX transformations
  • Supports training state management via TrainState with Optax optimizers
  • Provides serialization utilities for checkpointing with Orbax
  • Includes NNX, a newer mutable-state API for more traditional OOP-style models
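
As a minimal sketch of the Linen style (the two-layer MLP, its layer sizes, and the input shape below are illustrative, not taken from the Flax documentation):

  import jax
  import jax.numpy as jnp
  import flax.linen as nn

  class MLP(nn.Module):
      hidden: int = 128
      out: int = 10

      @nn.compact
      def __call__(self, x):
          # Submodules are declared inline; their input shapes are inferred lazily.
          x = nn.relu(nn.Dense(self.hidden)(x))
          return nn.Dense(self.out)(x)

  model = MLP()
  # init() returns the parameters as a pytree instead of storing them on the module.
  variables = model.init(jax.random.PRNGKey(0), jnp.ones((1, 784)))
  logits = model.apply(variables, jnp.ones((1, 784)))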

Architecture Overview

Flax separates model definition from model state. A Module's init() method returns the model's variables as a pytree rather than storing them on the module instance. This functional approach keeps models compatible with JAX's jit, vmap, pmap, and grad transformations. The Linen API initializes submodules lazily inside @nn.compact methods, inferring shapes from the first input. Non-parameter state such as batch norm statistics lives in separate variable collections (e.g. 'batch_stats'), while RNG keys are passed explicitly to init() and apply(), keeping the parameter tree cleanly separated from mutable state.
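
A minimal sketch of this separation, assuming a throwaway Dense model and a mean-squared-error loss: the pytree returned by init() passes straight through jax.jit and jax.grad.

  import jax
  import jax.numpy as jnp
  import flax.linen as nn

  model = nn.Dense(features=1)
  x, y = jnp.ones((8, 3)), jnp.zeros((8, 1))

  # Parameters live outside the module, so they can be transformed like any pytree.
  params = model.init(jax.random.PRNGKey(0), x)

  def mse(params, x, y):
      pred = model.apply(params, x)
      return jnp.mean((pred - y) ** 2)

  # jit-compiled gradient of the loss with respect to the parameter pytree.
  grads = jax.jit(jax.grad(mse))(params, x, y)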

Self-Hosting & Configuration

  • Install via pip; requires JAX with CPU, GPU, or TPU backend
  • GPU support: install jax[cuda12] for NVIDIA GPUs
  • TPU support available on Google Cloud with jax[tpu]
  • Use flax.training.train_state.TrainState for managing optimizer and parameters (see the sketch after this list)
  • Checkpoint models with orbax.checkpoint for reliable saving and restoring
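
A hedged sketch of the TrainState pattern; the model, the Adam learning rate, and the batch are placeholders, and the commented-out Orbax call is indicative only, since the Orbax API varies across versions.

  import jax
  import jax.numpy as jnp
  import flax.linen as nn
  import optax
  from flax.training import train_state

  model = nn.Dense(features=1)
  params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 4)))['params']

  # TrainState bundles the apply function, the parameters, and the Optax optimizer state.
  state = train_state.TrainState.create(
      apply_fn=model.apply, params=params, tx=optax.adam(1e-3))

  @jax.jit
  def train_step(state, batch):
      def loss_fn(params):
          pred = state.apply_fn({'params': params}, batch['x'])
          return jnp.mean((pred - batch['y']) ** 2)
      grads = jax.grad(loss_fn)(state.params)
      return state.apply_gradients(grads=grads)

  state = train_step(state, {'x': jnp.ones((8, 4)), 'y': jnp.zeros((8, 1))})

  # Checkpointing is typically delegated to Orbax, e.g.:
  # import orbax.checkpoint as ocp
  # ocp.PyTreeCheckpointer().save('/tmp/flax_ckpt', state.params)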

Key Features

  • Functional parameter management compatible with all JAX transformations
  • Linen API for concise model definitions with automatic shape inference
  • NNX API offering mutable references for more traditional model building
  • Mixed precision training via per-module dtype and param_dtype arguments, or external libraries such as jmp (see the sketch after this list)
  • Seamless multi-device training with JAX pmap and sharding
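
A sketch of the dtype-based approach, with an illustrative layer size: the computation runs in bfloat16 while the parameters stay in float32.

  import jax
  import jax.numpy as jnp
  import flax.linen as nn

  # Compute in bfloat16, keep parameters in float32.
  layer = nn.Dense(features=128, dtype=jnp.bfloat16, param_dtype=jnp.float32)

  x = jnp.ones((4, 64), dtype=jnp.bfloat16)
  variables = layer.init(jax.random.PRNGKey(0), x)
  out = layer.apply(variables, x)  # bfloat16 activations, float32 parameters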

Comparison with Similar Tools

  • PyTorch — Imperative and mutable; Flax is functional and immutable, leveraging JAX's compiler
  • Haiku — DeepMind's JAX library; Flax Linen is now the recommended choice for new projects
  • Equinox — Minimal JAX library using Python classes as pytrees; Flax has broader adoption
  • Keras (JAX backend) — Higher-level API; Flax offers more control for research

FAQ

Q: Should I use Linen or NNX? A: Linen is stable and widely adopted. NNX is newer and provides a more familiar mutable-state API. Both are supported.
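
For comparison, a minimal sketch of the NNX style (layer sizes are illustrative, and details may differ between Flax releases):

  import jax
  import jax.numpy as jnp
  from flax import nnx

  class MLP(nnx.Module):
      def __init__(self, din, dhidden, dout, rngs: nnx.Rngs):
          # Parameters are created eagerly and stored on the module as mutable state.
          self.fc1 = nnx.Linear(din, dhidden, rngs=rngs)
          self.fc2 = nnx.Linear(dhidden, dout, rngs=rngs)

      def __call__(self, x):
          return self.fc2(jax.nn.relu(self.fc1(x)))

  model = MLP(784, 128, 10, rngs=nnx.Rngs(0))
  logits = model(jnp.ones((1, 784)))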

Q: Does Flax work on TPUs? A: Yes. JAX natively supports TPUs, and Flax models run on TPUs without code changes beyond backend configuration.

Q: How does Flax handle batch normalization? A: Batch norm statistics are stored in a separate 'batch_stats' variable collection and updated during training by passing mutable=['batch_stats'] to apply().
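
A hedged sketch of that pattern, using a throwaway two-layer network:

  import jax
  import jax.numpy as jnp
  import flax.linen as nn

  class Net(nn.Module):
      @nn.compact
      def __call__(self, x, train: bool):
          x = nn.Dense(32)(x)
          # Running statistics live in the 'batch_stats' collection, not in 'params'.
          x = nn.BatchNorm(use_running_average=not train)(x)
          return nn.Dense(10)(x)

  model = Net()
  x = jnp.ones((4, 8))
  variables = model.init(jax.random.PRNGKey(0), x, train=True)

  # Training: declare 'batch_stats' as mutable so the updated statistics are returned.
  out, updates = model.apply(variables, x, train=True, mutable=['batch_stats'])
  variables = {**variables, 'batch_stats': updates['batch_stats']}

  # Evaluation: use the stored running averages; nothing needs to be mutable.
  out = model.apply(variables, x, train=False)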

Q: Can I convert a PyTorch model to Flax? A: Not directly. Model architectures need to be reimplemented, but weight tensors can be transferred manually between frameworks.
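
As a hedged sketch of a manual transfer for a single linear layer (the arrays below stand in for entries of a PyTorch state_dict): PyTorch's nn.Linear stores its weight as (out_features, in_features), while Flax's Dense kernel is (in_features, out_features), so a transpose is needed.

  import numpy as np
  import jax.numpy as jnp
  import flax.linen as nn

  # Stand-ins for PyTorch tensors already converted to NumPy (e.g. tensor.detach().numpy()).
  torch_weight = np.random.randn(10, 784).astype(np.float32)  # (out_features, in_features)
  torch_bias = np.zeros(10, dtype=np.float32)

  model = nn.Dense(features=10)

  # Flax's Dense kernel has shape (in_features, out_features), so transpose the weight.
  params = {'params': {'kernel': jnp.asarray(torch_weight.T),
                       'bias': jnp.asarray(torch_bias)}}
  out = model.apply(params, jnp.ones((1, 784)))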
