# timm — Pretrained Vision Models and Layers for PyTorch

> timm (PyTorch Image Models) is a collection of pretrained image classification models, layers, utilities, and training scripts maintained by Ross Wightman and hosted on Hugging Face.

## Quick Use

```bash
pip install timm
python -c "import timm; model = timm.create_model('efficientnet_b0', pretrained=True); print(model.default_cfg)"
```

## Introduction

timm (PyTorch Image Models) is the go-to library for pretrained image classification backbones in the PyTorch ecosystem. It provides hundreds of model architectures with pretrained weights and a consistent API for creating, fine-tuning, and benchmarking vision models.

## What timm Does

- Supplies 700+ pretrained model architectures covering CNNs, Vision Transformers, and hybrids
- Offers a single `create_model()` entry point that handles weight loading and head customization
- Provides reusable layers (attention blocks, normalization, activation functions) as building blocks
- Includes a training script (`train.py`) with modern augmentation and optimization defaults
- Publishes model performance benchmarks and weight registries on the Hugging Face Hub

## Architecture Overview

Models are registered in a global registry keyed by name. `create_model()` looks up the constructor, optionally downloads pretrained weights, and replaces the classifier head to match the requested `num_classes`. Internally each model is a standard `nn.Module`. timm layers (`PatchEmbed`, `Mlp`, `DropPath`, etc.) are reused across architectures. A `data` subpackage handles augmentation pipelines (RandAugment, CutMix, Mixup) used during training.
## Self-Hosting & Configuration

- Install via pip: `pip install timm` (requires PyTorch)
- All weights download automatically from the Hugging Face Hub on first use
- Customize the classifier head: `timm.create_model('resnet50', num_classes=10)`
- Use `timm.list_models('vit_*')` to discover available architectures
- Export to ONNX or TorchScript with standard PyTorch APIs

## Key Features

- Largest single-repo collection of vision model implementations for PyTorch
- Consistent API across all architectures — swap backbones with one argument change
- Regular updates with new state-of-the-art models (EfficientNet, ConvNeXt, SwinV2, EVA, etc.)
- Built-in training recipe with competitive ImageNet accuracy out of the box
- Integrated with the Hugging Face Hub for easy weight sharing and versioning

## Comparison with Similar Tools

- **torchvision.models** — ships with PyTorch but covers far fewer architectures and updates less often
- **Hugging Face Transformers** — broader scope (NLP, audio, vision) but timm has deeper vision-specific coverage
- **MMClassification (MMPretrain)** — OpenMMLab alternative, config-driven rather than code-driven
- **CLIP** — focuses on vision-language alignment, not pure classification backbones
- **Keras Applications** — TensorFlow/Keras equivalent; timm is PyTorch-native

## FAQ

**Q: How do I fine-tune a timm model on a custom dataset?**
A: Call `timm.create_model('efficientnet_b0', pretrained=True, num_classes=YOUR_NUM)`, freeze early layers if desired, and train with your own loop or the included training script.

**Q: Can I use timm models for object detection or segmentation?**
A: Yes. Libraries like Detectron2, MMDetection, and YOLO often accept timm backbones via feature extraction mode (`features_only=True`).

**Q: Are timm weights free to use commercially?**
A: Most weights use Apache-2.0 or similar permissive licenses, but check the individual model card on the Hugging Face Hub.
**Q: How does timm compare in speed to torchvision?**
A: For the same architecture the performance is essentially identical; timm just offers more choices and newer designs.

## Sources

- https://github.com/huggingface/pytorch-image-models
- https://huggingface.co/docs/timm