# timm — Pretrained Vision Models and Layers for PyTorch

> timm (PyTorch Image Models) is a collection of pretrained image classification models, layers, utilities, and training scripts maintained by Ross Wightman and hosted on Hugging Face.

## Quick Use

```bash
pip install timm
python -c "import timm; model = timm.create_model('efficientnet_b0', pretrained=True); print(model.default_cfg)"
```

## Introduction

timm (PyTorch Image Models) is the go-to library for pretrained image classification backbones in the PyTorch ecosystem. It provides hundreds of model architectures with pretrained weights and a consistent API for creating, fine-tuning, and benchmarking vision models.

## What timm Does

- Supplies 700+ pretrained model architectures covering CNNs, Vision Transformers, and hybrids
- Offers a single `create_model()` entry point that handles weight loading and head customization
- Provides reusable layers (attention blocks, normalization, activation functions) as building blocks
- Includes a training script (`train.py`) with modern augmentation and optimization defaults
- Publishes model performance benchmarks and weight registries on the Hugging Face Hub

## Architecture Overview

Models are registered in a global registry keyed by name. `create_model()` looks up the constructor, optionally downloads pretrained weights, and replaces the classifier head to match the requested `num_classes`. Internally each model is a standard `nn.Module`. timm layers (`PatchEmbed`, `Mlp`, `DropPath`, etc.) are reused across architectures. A `data` subpackage handles augmentation pipelines (RandAugment, CutMix, Mixup) used during training.
## Self-Hosting & Configuration

- Install via pip: `pip install timm` (requires PyTorch)
- All weights download automatically from the Hugging Face Hub on first use
- Customize the classifier head: `timm.create_model('resnet50', num_classes=10)`
- Use `timm.list_models('vit_*')` to discover available architectures
- Export to ONNX or TorchScript with standard PyTorch APIs

## Key Features

- Largest single-repo collection of vision model implementations for PyTorch
- Consistent API across all architectures — swap backbones with one argument change
- Regular updates with new state-of-the-art models (EfficientNet, ConvNeXt, SwinV2, EVA, etc.)
- Built-in training recipe with competitive ImageNet accuracy out of the box
- Integrated with the Hugging Face Hub for easy weight sharing and versioning

## Comparison with Similar Tools

- **torchvision.models** — ships with PyTorch but covers far fewer architectures and updates less often
- **Hugging Face Transformers** — broader scope (NLP, audio, vision) but timm has deeper vision-specific coverage
- **MMClassification (MMPretrain)** — OpenMMLab alternative, config-driven rather than code-driven
- **CLIP** — focuses on vision-language alignment, not pure classification backbones
- **Keras Applications** — TensorFlow/Keras equivalent; timm is PyTorch-native

## FAQ

**Q: How do I fine-tune a timm model on a custom dataset?**
A: Call `timm.create_model('efficientnet_b0', pretrained=True, num_classes=YOUR_NUM)`, freeze early layers if desired, and train with your own loop or the included training script.

**Q: Can I use timm models for object detection or segmentation?**
A: Yes. Libraries like Detectron2, MMDetection, and YOLO often accept timm backbones via feature extraction mode (`features_only=True`).

**Q: Are timm weights free to use commercially?**
A: Most weights use Apache-2.0 or similar permissive licenses, but check the individual model card on the Hugging Face Hub.
**Q: How does timm compare in speed to torchvision?**
A: For the same architecture the performance is essentially identical; timm just offers more choices and newer designs.

## Sources

- https://github.com/huggingface/pytorch-image-models
- https://huggingface.co/docs/timm