Automatic Pose Detection and Anonymization Using MediaPipe

MediaPipe is a set of open-source machine learning libraries created by Google that provide easy access to models for many tasks, including object detection, image classification, image segmentation, and face detection.

The Pose Landmarker maps a set of 33 landmarks that track a person’s position in either a single image or a video. Each landmark has three outputs: its x and y coordinates and a score indicating whether it is visible in the image.
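As a minimal sketch of how these outputs might be consumed, the example below converts landmarks to pixel positions and discards those that are unlikely to be visible. It assumes that x and y are normalized to the range 0 to 1 by the image width and height, which is how the MediaPipe documentation describes them. The Landmark class, the to_pixels helper, and the 0.5 visibility threshold are hypothetical names and values chosen for illustration; they are not part of the MediaPipe API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Landmark:
    """One pose landmark: normalized position plus a visibility score."""
    x: float           # horizontal position, 0.0 (left edge) to 1.0 (right edge)
    y: float           # vertical position, 0.0 (top edge) to 1.0 (bottom edge)
    visibility: float  # likelihood that the landmark is visible, 0.0 to 1.0


def to_pixels(
    landmarks: List[Landmark],
    width: int,
    height: int,
    min_visibility: float = 0.5,  # assumed threshold; tune per application
) -> List[Optional[Tuple[int, int]]]:
    """Convert normalized landmarks to pixel coordinates.

    Landmarks below the visibility threshold become None, so the list
    keeps the same indexing as the input.
    """
    pixels: List[Optional[Tuple[int, int]]] = []
    for lm in landmarks:
        if lm.visibility < min_visibility:
            pixels.append(None)
            continue
        # Clamp to the image, since a predicted point can fall slightly outside it.
        px = min(max(int(lm.x * width), 0), width - 1)
        py = min(max(int(lm.y * height), 0), height - 1)
        pixels.append((px, py))
    return pixels
```

Keeping a None placeholder for hidden landmarks, rather than dropping them, means that a given index always refers to the same body part.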

Fig. 1. The outline of the landmarks and an example tracked image [2].

Unlike many other pose detection models, the MediaPipe Pose Landmarker provides detailed location information for the hands and feet, allowing the user to determine their scale and orientation. This makes the pose information useful in a wider range of applications.
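One way to derive scale and orientation from these extra landmarks is to treat two of them as the ends of a line segment, for example heel to toe for a foot. The sketch below is an illustration only: the helper names are hypothetical, and the index constants follow the published 33-landmark layout (29 for the left heel, 31 for the left foot index).

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

# Indices in the MediaPipe 33-landmark layout.
LEFT_HEEL = 29
LEFT_FOOT_INDEX = 31


def segment_scale_and_angle(start: Point, end: Point) -> Tuple[float, float]:
    """Return the length of the segment and its angle in degrees.

    The angle is measured from the positive x axis. Because image y
    coordinates grow downwards, a positive angle appears clockwise on screen.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    length = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx))
    return length, angle


def left_foot_scale_and_angle(points: Sequence[Point]) -> Tuple[float, float]:
    """Scale and orientation of the left foot, measured from heel to toe."""
    return segment_scale_and_angle(points[LEFT_HEEL], points[LEFT_FOOT_INDEX])
```

The same helper can be applied to other landmark pairs, such as the wrist and index finger, to estimate the orientation of a hand.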

Pose detection is a two-step process. First, a detector identifies the regions of interest within the frame. The maximum number of people to detect can be specified so that the detector finds a region for each one. The landmarks are then located within each region. To increase speed when tracking a video, the detector runs only on the first frame, and subsequent regions of interest are derived from the previous frame’s landmark positions. This allows real-time performance, even on mobile phone CPUs. When running on a GPU, the Pose Landmarker can be combined with additional models for more detailed face or hand tracking.
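The detect-then-track loop described above can be sketched for a single person as follows. This is not MediaPipe's internal implementation: detect_person and find_landmarks are hypothetical callables standing in for the two models, and the 25% margin around the landmarks is an assumed value. The sketch also falls back to the detector whenever no landmarks are returned, which is an assumption added here so that tracking can recover after the person is lost.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def roi_from_landmarks(points: List[Point], margin: float = 0.25) -> Box:
    """Bounding box of the landmarks, enlarged by a margin on every side."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    return (min(xs) - pad_x, min(ys) - pad_y, max(xs) + pad_x, max(ys) + pad_y)


def track_video(
    frames: Iterable[object],
    detect_person: Callable[[object], Optional[Box]],
    find_landmarks: Callable[[object, Box], Optional[List[Point]]],
) -> List[Optional[List[Point]]]:
    """Run the slow detector only when there is no region to reuse."""
    roi: Optional[Box] = None
    results: List[Optional[List[Point]]] = []
    for frame in frames:
        if roi is None:
            # Step 1: full-frame detection (first frame, or after tracking is lost).
            roi = detect_person(frame)
        # Step 2: locate the landmarks inside the region of interest.
        points = find_landmarks(frame, roi) if roi is not None else None
        results.append(points)
        # The next frame reuses a region derived from this frame's landmarks.
        roi = roi_from_landmarks(points) if points else None
    return results
```

Because the detector is skipped on most frames, the per-frame cost is dominated by the landmark model alone, which is what makes real-time tracking feasible on modest hardware.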

Detecting and tracking the position of one or more people is valuable in many applications, such as tracking fitness performance, gait analysis, and gesture control of devices. Importantly for the iToBoS project, it can also automate the anonymization of identifiable medical images by using the face and eye coordinates to programmatically mask out these areas. Regions of interest can be defined using the detected landmarks and then replaced with blurred or solid-coloured pixels. Because the mask follows the landmarks, it adapts to changes in angle, position, and motion, making this an effective method of hiding patients’ identities in images.
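A minimal sketch of the masking step is shown below. It is an illustration rather than the iToBoS implementation: the image is represented as plain rows of RGB tuples instead of an image array, the helper names are hypothetical, and the landmarks are assumed to be already converted to pixel coordinates. Indices 0 to 10 are the face landmarks (nose, eyes, ears, and mouth) in the 33-landmark layout. These span only the eyes to the mouth vertically, so the padding factor used here is an assumed value that would need tuning to cover the forehead and chin.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]           # rows of RGB pixels; stand-in for an image array
Box = Tuple[int, int, int, int]     # (x_min, y_min, x_max, y_max), inclusive

FACE_LANDMARKS = range(0, 11)       # nose, eyes, ears, and mouth


def face_box(
    points: Sequence[Optional[Tuple[int, int]]],
    width: int,
    height: int,
    padding: float = 0.5,           # assumed value; enlarge to cover more of the head
) -> Optional[Box]:
    """Padded bounding box around the detected face landmarks, clipped to the image."""
    face = [points[i] for i in FACE_LANDMARKS if points[i] is not None]
    if not face:
        return None
    xs = [p[0] for p in face]
    ys = [p[1] for p in face]
    pad_x = int((max(xs) - min(xs)) * padding)
    pad_y = int((max(ys) - min(ys)) * padding)
    return (
        max(min(xs) - pad_x, 0),
        max(min(ys) - pad_y, 0),
        min(max(xs) + pad_x, width - 1),
        min(max(ys) + pad_y, height - 1),
    )


def mask_region(image: Image, box: Box, colour: Pixel = (0, 0, 0)) -> None:
    """Overwrite every pixel inside the box with a solid colour, in place."""
    x0, y0, x1, y1 = box
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            image[y][x] = colour
```

A solid fill is used here because it removes the underlying pixel data entirely; a blur could be substituted where a less conspicuous mask is preferred.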

Bibliography 

[1] Bazarevsky, V.; Grishchenko, I. On-Device, Real-Time Body Pose Tracking with MediaPipe BlazePose, Google Research. Available online: https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html (accessed on 22 March 2024).

[2] Google. Pose Landmark Detection Guide, 23 January 2024. Available online: https://developers.google.com/mediapipe/solutions/vision/pose_landmarker