How do I install Replicate Cog — Containerize ML Models with One YAML File?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Replicate Cog — Containerize ML Models with One YAML File

Quick Use

Install: brew install cog (macOS), or download the binary with sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m) && sudo chmod +x /usr/local/bin/cog
Create cog.yaml and predict.py (templates in this asset)
Run cog predict to test locally, then cog push to ship to Replicate

Intro

Cog is Replicate's open-source tool that wraps an ML model in a Docker container with a clean HTTP API. Define a cog.yaml for the environment and a predict.py for inference, and cog build produces a portable image you can run anywhere: locally, on Replicate, on Kubernetes, or on your own GPU box.

Best for: ML researchers and engineers who want to ship a reproducible model without writing Dockerfiles.
Works with: Linux, macOS, Windows (WSL2).
Setup time: 10 minutes.

cog.yaml

build:
  gpu: true
  cuda: "12.1"
  python_version: "3.11"
  python_packages:
    - "torch==2.4.0"
    - "transformers==4.45.0"
    - "pillow==11.0.0"
predict: "predict.py:Predictor"
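
Exact == pins, as in the python_packages list above, are what keep rebuilds reproducible. As a minimal sketch, a pre-build check along these lines could flag loose requirements before cog build runs. The unpinned helper and its regular expression are hypothetical and not part of Cog; the package list is passed in as plain strings to avoid assuming a YAML parser.

```python
import re
from typing import List

# Matches "name==version", e.g. "torch==2.4.0". Wildcards such as "2.4.*"
# are deliberately not accepted, since they are not exact pins.
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9.+!-]+$")


def unpinned(packages: List[str]) -> List[str]:
    """Return the requirement strings that are not pinned to one exact version."""
    return [p for p in packages if not PINNED.match(p.strip())]
```

Calling unpinned(["torch==2.4.0", "pillow"]) would report only the second entry, which is the kind of line worth tightening before a build.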

predict.py

from cog import BasePredictor, Input, Path
from PIL import Image
import torch

class Predictor(BasePredictor):
    def setup(self):
        """Load model into memory once at boot."""
        self.model = torch.hub.load("pytorch/vision", "resnet50", weights="IMAGENET1K_V2")
        self.model.eval()

    def predict(
        self,
        image: Path = Input(description="Image to classify"),
        top_k: int = Input(default=3, ge=1, le=10),
    ) -> dict:
        img = Image.open(image)
        # ... preprocess and run model ...
        return {"top_classes": ["cat", "tabby", "egyptian"][:top_k]}
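
The placeholder comment in predict() leaves out the step that turns model scores into labels. One possible shape for that step, written in plain Python so it runs without torch, is sketched below. The top_k_labels helper, its scores field, and the label list are assumptions for illustration; in a real predictor the logits would come from self.model and the labels from the model's class index.

```python
import math
from typing import Dict, List, Sequence


def top_k_labels(logits: Sequence[float], labels: Sequence[str], k: int) -> Dict[str, List]:
    """Return the k most probable labels with their softmax probabilities."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort indices by probability, highest first, and keep the first k.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return {
        "top_classes": [labels[i] for i in order],
        "scores": [round(probs[i], 4) for i in order],
    }
```

With logits [2.0, 0.5, 1.0] for the labels cat, tabby, egyptian and k=2, the helper ranks cat first and egyptian second.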

Build, run, deploy

# Build the image
cog build -t resnet50

# Run locally with GPU
cog predict -i image=@cat.jpg -i top_k=5

# Push to Replicate
cog push r8.im/yourname/resnet50

# Or deploy elsewhere (Cog images are standard Docker)
docker run -p 5000:5000 --gpus=all resnet50
curl http://localhost:5000/predictions \
  -H 'Content-Type: application/json' \
  -d '{"input": {"image": "...base64..."}}'
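
The curl call above can also be made from Python with the standard library. This sketch assumes the container from the docker run command is listening on localhost:5000 and that the image input is passed as a base64 data URI; build_payload and predict are hypothetical helper names, so check the Cog HTTP documentation for the input forms your version accepts.

```python
import base64
import json
import mimetypes
import urllib.request
from pathlib import Path


def build_payload(image_path: str, top_k: int = 3) -> dict:
    """Build the JSON body for POST /predictions, embedding the image as a data URI."""
    data = Path(image_path).read_bytes()
    # Fall back to a generic type when the extension is not recognized.
    mime = mimetypes.guess_type(image_path)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return {"input": {"image": f"data:{mime};base64,{encoded}", "top_k": top_k}}


def predict(image_path: str, top_k: int = 3,
            url: str = "http://localhost:5000/predictions") -> dict:
    """Send the payload to a running Cog container and return the parsed JSON reply."""
    body = json.dumps(build_payload(image_path, top_k)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the container running, predict("cat.jpg", top_k=5) would send the same request as the curl example. Keeping build_payload separate makes the request body easy to inspect without a server.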

What you get for free

Type-checked, schema-documented inputs (Cog generates OpenAPI)
GPU support via gpu: true, with an optional CUDA version pin
Auto-detects PyTorch / TensorFlow / JAX and pins versions
Path outputs are uploaded and served from a CDN when the model runs on Replicate
Works as a standard Docker image anywhere
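
To make the first bullet concrete, the sketch below mimics in plain Python the kind of bounds check that Input(default=3, ge=1, le=10) declares for top_k. It is an illustration only: IntBounds is a hypothetical class, and Cog performs its own validation from the type annotations rather than running this code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntBounds:
    """Hypothetical stand-in for the default/ge/le constraints on an integer input."""
    default: Optional[int] = None
    ge: Optional[int] = None
    le: Optional[int] = None

    def validate(self, name: str, value: Optional[int]) -> int:
        # Fall back to the default when the caller omits the field.
        if value is None:
            if self.default is None:
                raise ValueError(f"{name} is required")
            value = self.default
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer")
        if self.ge is not None and value < self.ge:
            raise ValueError(f"{name} must be >= {self.ge}")
        if self.le is not None and value > self.le:
            raise ValueError(f"{name} must be <= {self.le}")
        return value


# Mirrors the top_k declaration in predict.py.
TOP_K = IntBounds(default=3, ge=1, le=10)
```

Here TOP_K.validate("top_k", 5) passes the value through, an omitted value falls back to 3, and 11 raises a ValueError. Declaring the bounds once in the signature is what lets the same constraints appear in the generated schema.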

FAQ

Q: Is Cog free? A: Yes. Cog is open source under Apache-2.0. Replicate's hosting is paid (per-second GPU billing), but you can run Cog images anywhere Docker runs without paying Replicate.

Q: Does Cog work on Apple Silicon? A: Yes. Setting gpu: false produces CPU-only images that run on Apple Silicon. GPU images rely on CUDA, which needs an NVIDIA GPU, so for GPU inference you'll need to deploy elsewhere (Replicate, Lambda, your own GPU box).

Q: How does this differ from a regular Dockerfile? A: Cog generates the Dockerfile for you, pinning CUDA, PyTorch, and system libraries, with build caching. You get strongly typed inputs, OpenAPI docs, and CDN-uploaded outputs without writing that plumbing yourself. For non-ML workloads, regular Docker is simpler.

Source & Thanks

Built by Replicate. Licensed under Apache-2.0.

replicate/cog — ⭐ 9,000+
