# CVAT — Computer Vision Annotation Tool for AI Training Data

> A self-hosted annotation platform for labeling images and videos with bounding boxes, polygons, keypoints, and more. Designed for teams that train computer vision models.

## Install

Save the commands from the Quick Use section as a script file and run it, or run them one at a time.

## Quick Use
```bash
git clone https://github.com/cvat-ai/cvat.git
cd cvat
docker compose up -d
# Once the containers are up, open http://localhost:8080
# Create a superuser (prompts for username, email, and password):
#   docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
```

## Introduction
CVAT (Computer Vision Annotation Tool) is an open-source platform for labeling images and videos to create training datasets for machine learning models. Originally developed by Intel, it provides a web-based interface where annotators can draw bounding boxes, polygons, polylines, keypoints, and cuboids. It supports team workflows with task assignment, quality control, and export to common dataset formats.

## What CVAT Does
- Annotates images and video frames with multiple label types including bounding boxes, polygons, and keypoints
- Supports semi-automatic annotation using AI models via the Nuclio serverless framework
- Manages multi-user projects with task assignment, review workflows, and quality analytics
- Exports datasets in COCO, Pascal VOC, YOLO, Datumaro, and other popular formats
- Provides a REST API and Python SDK for programmatic task creation and data retrieval
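
As a rough illustration of the programmatic task creation mentioned above, the sketch below builds (but does not send) an HTTP request using only the Python standard library. The `/api/tasks` path, the payload shape, the use of HTTP Basic auth, and the helper name `build_create_task_request` are assumptions made for this example; check the API reference of your CVAT version before relying on them.

```python
import base64
import json
import urllib.request


def build_create_task_request(host, username, password, name, label_names):
    """Build (but do not send) a POST request that would create a task."""
    # Assumed payload shape: a task name plus a list of label definitions.
    payload = {"name": name, "labels": [{"name": label} for label in label_names]}
    # HTTP Basic auth header built from the given credentials.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/tasks",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_task_request(
        "http://localhost:8080", "admin", "change-me", "street-scenes", ["car", "person"]
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To send it against a running server:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

Keeping request construction separate from sending makes the payload easy to inspect before touching a live server. For real projects the Python SDK is likely the more convenient route.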

## Architecture Overview
CVAT runs as a set of Docker containers: a Django backend with background workers, a PostgreSQL database, Redis for job queues, and a reverse proxy in front of them (Traefik in the default Compose file, with Nginx serving the frontend). The frontend is a React application that communicates with the backend via REST APIs. For AI-assisted annotation, CVAT integrates with Nuclio to deploy inference models as serverless functions that run alongside the main application.

## Self-Hosting & Configuration
- Requires Docker and Docker Compose on a Linux host
- Minimum recommended: 4 CPU cores, 8 GB RAM for small teams
- GPU support is optional; it is used only by AI-assisted annotation models
- Configure SMTP, storage backends, and auth providers via environment variables
- Persistent data is stored in Docker volumes for the database and uploaded media
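
Before installing, it can help to check a host against the recommended minimum listed above. The following is a minimal sketch, assuming a Linux host; the function names are hypothetical, and the 8 GB threshold is read as decimal gigabytes because hosts usually report slightly less usable memory than their nominal size.

```python
import os

MIN_CPU_CORES = 4          # recommended minimum from the list above
MIN_RAM_BYTES = 8 * 10**9  # 8 GB, read as decimal gigabytes (an assumption)


def meets_minimum(cpu_cores, ram_bytes):
    """Return True if the given resources satisfy the recommended minimum."""
    return cpu_cores >= MIN_CPU_CORES and ram_bytes >= MIN_RAM_BYTES


def detect_host_resources():
    """Return (cpu_cores, ram_bytes) for the current host."""
    cpu_cores = os.cpu_count() or 0
    # Total physical memory = page size * number of physical pages (Linux).
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cpu_cores, ram_bytes


if __name__ == "__main__":
    cores, ram = detect_host_resources()
    verdict = "OK" if meets_minimum(cores, ram) else "below the recommended minimum"
    print(f"{cores} cores, {ram / 10**9:.1f} GB RAM: {verdict}")
```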

## Key Features
- AI-assisted annotation with automatic bounding box and polygon suggestions
- Built-in analytics dashboard for tracking annotation progress and quality
- Supports both image and video annotation with frame-level interpolation
- Cloud storage integration with AWS S3, Google Cloud Storage, and Azure Blob
- Active Directory and LDAP integration for enterprise single sign-on
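
The frame-level interpolation listed above can be pictured with a simple linear model: an annotator draws a box on two keyframes, and the frames in between get boxes computed from them. This is an illustrative sketch, not CVAT's actual implementation; the function name and the `(x_min, y_min, x_max, y_max)` box layout are assumptions.

```python
def interpolate_box(key_a, key_b, frame):
    """Linearly interpolate a bounding box between two keyframes.

    Each keyframe is a tuple (frame_number, (x_min, y_min, x_max, y_max)).
    """
    frame_a, box_a = key_a
    frame_b, box_b = key_b
    if frame_a >= frame_b:
        raise ValueError("key_a must come strictly before key_b")
    if not frame_a <= frame <= frame_b:
        raise ValueError("frame must lie between the two keyframes")
    # Fraction of the way from key_a (0.0) to key_b (1.0).
    t = (frame - frame_a) / (frame_b - frame_a)
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


if __name__ == "__main__":
    start = (0, (0, 0, 10, 10))
    end = (10, (10, 20, 30, 40))
    print(interpolate_box(start, end, 5))  # (5.0, 10.0, 20.0, 25.0)
```

With this model, only the keyframes need manual work; the annotator corrects an in-between frame only where the object's motion is not close to linear.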

## Comparison with Similar Tools
- **Label Studio** — more general-purpose (text, audio, images); CVAT specializes in computer vision
- **doccano** — focused on text annotation; CVAT handles images and video
- **Supervisely** — commercial platform with a free tier; CVAT is fully open source
- **VoTT** — archived by Microsoft; CVAT is actively maintained with frequent releases

## FAQ
**Q: Can CVAT handle video annotation?**
A: Yes. CVAT supports frame-by-frame video annotation with object tracking and interpolation between keyframes.

**Q: Does it support automatic annotation?**
A: Yes. You can deploy pre-trained models (such as YOLO or Faster R-CNN) via the Nuclio integration to generate automatic annotations that annotators can then refine.

**Q: What export formats are supported?**
A: CVAT exports to COCO JSON, Pascal VOC XML, YOLO TXT, Datumaro, LabelMe, and several other formats.
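
To show what one of these formats looks like in practice, the sketch below converts a pixel-coordinate box into a YOLO TXT line: a class index followed by the box center and size, each normalized by the image dimensions. The function name and the six-decimal formatting are choices made for this example, not something CVAT prescribes.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO TXT line."""
    x_min, y_min, x_max, y_max = box
    # YOLO stores the box center and size as fractions of the image size.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x200 px box in a 400x500 px image, labeled with class 0.
    print(to_yolo_line(0, (100, 50, 300, 250), 400, 500))
    # 0 0.500000 0.300000 0.500000 0.400000
```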

**Q: Is there a cloud-hosted version?**
A: Yes. The team offers a managed cloud service at app.cvat.ai with free and paid plans.

## Sources
- https://github.com/cvat-ai/cvat
- https://docs.cvat.ai

---
Source: https://tokrepo.com/en/workflows/asset-c06f0d95
Author: Script Depot