Configs · May 12, 2026 · 2 min read

MMPose — OpenMMLab Pose Estimation Toolbox

MMPose provides a modular framework for 2D and 3D pose estimation covering human body, hand, face, and animal keypoint detection with 30+ state-of-the-art methods.

Introduction

MMPose is a comprehensive pose estimation toolbox from the OpenMMLab ecosystem. It covers tasks ranging from human body keypoints to hand gesture recognition and animal pose estimation, all through a consistent modular API built on PyTorch.

What MMPose Does

  • Estimates 2D and 3D keypoints for human body, hands, face, and animals
  • Implements 30+ methods including HRNet, RTMPose, and ViTPose
  • Provides top-down and bottom-up pose estimation pipelines
  • Supports whole-body pose estimation combining body, hand, and face
  • Integrates with MMDetection for person detection before pose estimation
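
The capabilities above are usually reached through a high-level inference entry point. The following is a minimal sketch, assuming MMPose 1.x, where `MMPoseInferencer` is exposed from `mmpose.apis`. The `"human"` alias and the layout of the returned dictionary follow the 1.x documentation and may differ in other releases; `estimate_keypoints` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def estimate_keypoints(image_path: str, alias: str = "human") -> List[Dict[str, Any]]:
    """Return the per-person predictions for one image (assumes MMPose 1.x)."""
    # Imported lazily so this module still loads where mmpose is not installed.
    from mmpose.apis import MMPoseInferencer

    # The alias selects a default 2D model; weights are fetched on first use.
    inferencer = MMPoseInferencer(alias)
    # Calling the inferencer returns a generator with one result per input.
    result = next(inferencer(image_path, show=False))
    # 'predictions' holds one list of detected instances per input image.
    return result["predictions"][0]
```

Each returned instance is expected to carry entries such as `keypoints` and `keypoint_scores`; check the keys against your installed version before relying on them.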

Architecture Overview

MMPose supports both top-down and bottom-up paradigms. A top-down pipeline first detects each person with a bounding box (via MMDetection), then estimates keypoints within each box. A bottom-up pipeline detects all keypoints in the image at once and then groups them by person. Both are assembled from configurable backbones, heads, and codec modules, which are built and run through MMEngine's config and registry system.
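
To make the top-down flow concrete, here is a small sketch in plain Python. It is not MMPose code: `fake_detector` and `fake_estimator` are hypothetical stand-ins for a person detector and a pose model, and the only point illustrated is that keypoints predicted inside a crop must be shifted back into image coordinates.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in image pixels
Point = Tuple[float, float]       # (x, y)
Image = Sequence[Sequence[int]]   # rows of pixel values


def top_down_pose(
    image: Image,
    detect_people: Callable[[Image], List[Box]],
    estimate_in_crop: Callable[[Image], List[Point]],
) -> List[List[Point]]:
    """Detect people, estimate keypoints per crop, map back to the image."""
    poses = []
    for x1, y1, x2, y2 in detect_people(image):
        # Step 1: cut out the region inside the bounding box.
        crop = [row[x1:x2] for row in image[y1:y2]]
        # Step 2: estimate keypoints in crop-local coordinates.
        local_points = estimate_in_crop(crop)
        # Step 3: shift the keypoints back to full-image coordinates.
        poses.append([(x + x1, y + y1) for x, y in local_points])
    return poses


# Hypothetical stand-ins so the sketch runs without any model.
def fake_detector(image: Image) -> List[Box]:
    return [(2, 3, 6, 9)]


def fake_estimator(crop: Image) -> List[Point]:
    # Pretend the single keypoint sits at the centre of the crop.
    return [(len(crop[0]) / 2, len(crop) / 2)]


image = [[0] * 10 for _ in range(10)]
poses = top_down_pose(image, fake_detector, fake_estimator)
```

A bottom-up pipeline would skip the per-box loop and instead need a grouping step that assigns each detected keypoint to a person.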

Self-Hosting & Configuration

  • Install mmpose, mmengine, and mmcv via pip or OpenMMLab's MIM installer; add mmdet if you need person detection for top-down models
  • Download model checkpoints from the MMPose model zoo
  • Use config files to select backbone, keypoint head, and dataset
  • Set input resolution to balance speed and accuracy
  • Deploy with MMDeploy for ONNX or TensorRT inference
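
The configuration steps above can be pictured with a short fragment. MMPose configs are Python files made of nested dictionaries; the sketch below is modeled on the style of the 1.x top-down heatmap configs. Treat the specific `type` names, channel counts, and sizes as assumptions to verify against the config shipped with the checkpoint you download.

```python
# Codec: fixes the input resolution and the heatmap resolution.
# Sizes are (width, height); a smaller input trades accuracy for speed.
codec = dict(
    type='MSRAHeatmap',
    input_size=(192, 256),
    heatmap_size=(48, 64),
    sigma=2,
)

# Model: backbone extracts features, head predicts one heatmap per keypoint.
model = dict(
    type='TopdownPoseEstimator',
    backbone=dict(type='ResNet', depth=50),
    head=dict(
        type='HeatmapHead',
        in_channels=2048,      # channel count of the backbone's last stage
        out_channels=17,       # number of keypoints (17 for COCO body)
        loss=dict(type='KeypointMSELoss', use_target_weight=True),
        decoder=codec,         # the head decodes heatmaps with the same codec
    ),
)

# The ratio between input and heatmap size is the effective output stride.
stride = codec['input_size'][0] // codec['heatmap_size'][0]
```

Changing `input_size` usually means changing `heatmap_size` with it so the stride stays consistent with what the backbone and head produce.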

Key Features

  • RTMPose models achieve real-time performance at high accuracy
  • Unified framework for body, hand, face, and animal keypoints
  • Extensive model zoo with pre-trained weights on COCO, MPII, and more
  • Modular codec system for keypoint encoding and decoding
  • Built-in visualization with skeleton overlay on images and video
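
The codec idea mentioned above, encoding keypoints into training targets and decoding predictions back into coordinates, can be illustrated with a Gaussian heatmap. This is a simplified sketch in plain Python, not MMPose's codec implementation; `encode_keypoint` and `decode_heatmap` are hypothetical names, and real codecs add refinements such as sub-pixel decoding.

```python
import math
from typing import List, Tuple

Heatmap = List[List[float]]


def encode_keypoint(x: float, y: float, heatmap_size: Tuple[int, int],
                    stride: int = 4, sigma: float = 1.0) -> Heatmap:
    """Render a Gaussian centred on the keypoint, in heatmap coordinates."""
    width, height = heatmap_size
    cx, cy = x / stride, y / stride  # input pixels -> heatmap cells
    return [
        [math.exp(-((col - cx) ** 2 + (row - cy) ** 2) / (2 * sigma ** 2))
         for col in range(width)]
        for row in range(height)
    ]


def decode_heatmap(heatmap: Heatmap, stride: int = 4) -> Tuple[Tuple[int, int], float]:
    """Take the highest-scoring cell and map it back to input pixels."""
    score, col, row = max(
        (value, col, row)
        for row, line in enumerate(heatmap)
        for col, value in enumerate(line)
    )
    return (col * stride, row * stride), score


# Round trip for a keypoint at (12, 20) in a 32x40 input with stride 4.
heatmap = encode_keypoint(12, 20, heatmap_size=(8, 10))
point, score = decode_heatmap(heatmap)
```

Because decoding here snaps to whole heatmap cells, a keypoint that does not fall on a multiple of the stride comes back with a quantization error of up to a few pixels.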

Comparison with Similar Tools

  • MediaPipe: optimized for mobile and web, but built around a fixed set of prebuilt models; MMPose offers more research flexibility
  • OpenPose: pioneered real-time multi-person pose estimation with a bottom-up design; RTMPose in MMPose targets a better speed-accuracy trade-off
  • Detectron2: supports keypoint detection but with fewer pose-specific methods
  • AlphaPose: strong real-time performance but narrower scope than MMPose

FAQ

Q: Can MMPose track poses across video frames? A: MMPose handles per-frame estimation. Combine with a tracker like ByteTrack for temporal tracking.
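
As a rough illustration of what the tracker adds on top of per-frame results, the sketch below links bounding boxes across frames by greedy IoU matching. This is deliberately much simpler than ByteTrack and is not part of MMPose; `GreedyIoUTracker` is a hypothetical class, and tracks that go unmatched for a single frame are dropped.

```python
from itertools import count
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Assign a stable id to each box by matching it to the previous frame."""

    def __init__(self, iou_threshold: float = 0.3) -> None:
        self.iou_threshold = iou_threshold
        self.tracks: Dict[int, Box] = {}  # track id -> box in the last frame
        self._ids = count()

    def update(self, boxes: List[Box]) -> List[int]:
        unmatched = dict(self.tracks)
        new_tracks: Dict[int, Box] = {}
        assigned: List[int] = []
        for box in boxes:
            # Pick the unmatched previous box with the highest overlap.
            best_id, best_iou = None, self.iou_threshold
            for track_id, previous in unmatched.items():
                overlap = iou(box, previous)
                if overlap >= best_iou:
                    best_id, best_iou = track_id, overlap
            if best_id is None:
                best_id = next(self._ids)  # no match: start a new track
            else:
                del unmatched[best_id]     # each track is used at most once
            new_tracks[best_id] = box
            assigned.append(best_id)
        self.tracks = new_tracks
        return assigned


tracker = GreedyIoUTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(51, 50, 61, 60), (1, 0, 11, 10)])
frame3 = tracker.update([(200, 200, 210, 210)])
```

The ids returned for each frame can be attached to the keypoints estimated for the same boxes, which gives per-person pose sequences over time.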

Q: Does it support 3D pose estimation? A: Yes. MMPose includes 3D pose methods that lift 2D keypoints into 3D coordinates.

Q: What is RTMPose? A: RTMPose is a real-time pose estimation model in MMPose that achieves state-of-the-art speed-accuracy tradeoffs.

Q: Can I train on custom keypoint definitions? A: Yes. Define a custom dataset class with your keypoint schema and skeleton connectivity.
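
A keypoint schema of this kind is typically written as a metainfo dictionary. The sketch below follows the layout used by MMPose 1.x dataset metainfo files (`keypoint_info`, `skeleton_info`, `joint_weights`, `sigmas`), but the three-joint `toy_arm` schema, its values, and the `check_metainfo` helper are hypothetical and only for illustration.

```python
dataset_info = dict(
    dataset_name='toy_arm',
    # One entry per keypoint; 'swap' names the left/right mirror partner.
    keypoint_info={
        0: dict(name='shoulder', id=0, color=[255, 128, 0], type='upper', swap=''),
        1: dict(name='elbow', id=1, color=[255, 128, 0], type='upper', swap=''),
        2: dict(name='wrist', id=2, color=[255, 128, 0], type='upper', swap=''),
    },
    # Skeleton connectivity, expressed as links between keypoint names.
    skeleton_info={
        0: dict(link=('shoulder', 'elbow'), id=0, color=[0, 255, 0]),
        1: dict(link=('elbow', 'wrist'), id=1, color=[0, 255, 0]),
    },
    joint_weights=[1.0, 1.0, 1.0],   # per-keypoint loss weights
    sigmas=[0.079, 0.072, 0.062],    # per-keypoint tolerances (illustrative)
)


def check_metainfo(info: dict) -> int:
    """Validate internal consistency and return the number of keypoints."""
    names = {kp['name'] for kp in info['keypoint_info'].values()}
    num_keypoints = len(info['keypoint_info'])
    for link in info['skeleton_info'].values():
        for endpoint in link['link']:
            if endpoint not in names:
                raise ValueError(f'unknown keypoint in skeleton: {endpoint}')
    for key in ('joint_weights', 'sigmas'):
        if len(info[key]) != num_keypoints:
            raise ValueError(f'{key} must have {num_keypoints} entries')
    return num_keypoints


num_keypoints = check_metainfo(dataset_info)
```

The number of keypoints here also has to match the number of output channels in the model head, otherwise training will fail on a shape mismatch.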
