# CVAT — Computer Vision Annotation Tool for AI Training Data

> A self-hosted annotation platform for labeling images and videos with bounding boxes, polygons, keypoints, and more. Built for teams building computer vision models.

## Quick Use

```bash
git clone https://github.com/cvat-ai/cvat.git
cd cvat
docker compose up -d

# Open http://localhost:8080

# Create a superuser:
docker exec -it cvat_server bash -ic 'python manage.py createsuperuser'
```

## Introduction

CVAT (Computer Vision Annotation Tool) is an open-source platform for labeling images and videos to create training datasets for machine learning models. Originally developed by Intel, it provides a web-based interface where annotators can draw bounding boxes, polygons, polylines, keypoints, and cuboids. It supports team workflows with task assignment, quality control, and export to common dataset formats.

## What CVAT Does

- Annotates images and video frames with multiple label types including bounding boxes, polygons, and keypoints
- Supports semi-automatic annotation using AI models via the Nuclio serverless framework
- Manages multi-user projects with task assignment, review workflows, and quality analytics
- Exports datasets in COCO, Pascal VOC, YOLO, Datumaro, and other popular formats
- Provides a REST API and Python SDK for programmatic task creation and data retrieval

## Architecture Overview

CVAT runs as a set of Docker containers: a Django backend, a PostgreSQL database, a Redis queue, and an Nginx reverse proxy. The frontend is a React application that communicates with the backend via REST APIs. For AI-assisted annotation, CVAT integrates with Nuclio to deploy inference models as serverless functions that run alongside the main application.
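The REST boundary between clients and the backend can be sketched from the client side. This is a minimal sketch, not CVAT's own client code: it uses only the Python standard library, and it assumes the backend is reachable at `http://localhost:8080`, exposes a server-information endpoint at `/api/server/about`, and accepts HTTP Basic authentication. The helper names `build_about_request` and `fetch_about` and the credentials are hypothetical; check https://docs.cvat.ai for the endpoints and authentication schemes your version supports.

```python
import base64
import json
import urllib.request


def build_about_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the server-information endpoint."""
    # Assumption: the backend serves /api/server/about and accepts HTTP Basic auth.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/server/about",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_about(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body. Requires a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical credentials; replace with the superuser you created.
    req = build_about_request("http://localhost:8080", "admin", "change-me")
    # Only the URL is printed here, so this runs without a server.
    print(req.full_url)
```

Building the request separately from sending it keeps the URL and header logic checkable without a running server. Once the containers are up, passing the request to `fetch_about` should return the decoded JSON response, provided the assumptions above hold for your deployment.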
## Self-Hosting & Configuration

- Requires Docker and Docker Compose on a Linux host
- Minimum recommended: 4 CPU cores, 8 GB RAM for small teams
- GPU support is optional, used only for AI-assisted annotation models
- Configure SMTP, storage backends, and auth providers via environment variables
- Persistent data is stored in Docker volumes for the database and uploaded media

## Key Features

- AI-assisted annotation with automatic bounding box and polygon suggestions
- Built-in analytics dashboard for tracking annotation progress and quality
- Supports both image and video annotation with frame-level interpolation
- Cloud storage integration with AWS S3, Google Cloud Storage, and Azure Blob
- Active Directory and LDAP integration for enterprise single sign-on

## Comparison with Similar Tools

- **Label Studio** — more general-purpose (text, audio, images); CVAT specializes in computer vision
- **doccano** — focused on text annotation; CVAT handles images and video
- **Supervisely** — commercial platform with a free tier; CVAT is fully open source
- **VOTT** — archived by Microsoft; CVAT is actively maintained with frequent releases

## FAQ

**Q: Can CVAT handle video annotation?**
A: Yes. CVAT supports frame-by-frame video annotation with object tracking and interpolation between keyframes.

**Q: Does it support automatic annotation?**
A: Yes. You can deploy pre-trained models (such as YOLO or Faster R-CNN) via the Nuclio integration to generate automatic annotations that annotators can then refine.

**Q: What export formats are supported?**
A: CVAT exports to COCO JSON, Pascal VOC XML, YOLO TXT, Datumaro, LabelMe, and several other formats.

**Q: Is there a cloud-hosted version?**
A: Yes. The team offers a managed cloud service at app.cvat.ai with free and paid plans.

## Sources

- https://github.com/cvat-ai/cvat
- https://docs.cvat.ai

---

Source: https://tokrepo.com/en/workflows/asset-c06f0d95
Author: Script Depot