Pith. Machine review for the scientific record.

arxiv: 1812.08008 · v2 · submitted 2018-12-18 · 💻 cs.CV

Recognition: unknown

OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

classification 💻 cs.CV
keywords: body, realtime, part, pose, accuracy, estimation, foot, image

Realtime multi-person 2D pose estimation is a key component in enabling machines to have an understanding of people in images and videos. In this work, we present a realtime approach to detect the 2D pose of multiple people in an image. The proposed method uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. This bottom-up system achieves high accuracy and realtime performance, regardless of the number of people in the image. In previous work, PAFs and body part location estimation were refined simultaneously across training stages. We demonstrate that a PAF-only refinement rather than both PAF and body part location refinement results in a substantial increase in both runtime performance and accuracy. We also present the first combined body and foot keypoint detector, based on an internal annotated foot dataset that we have publicly released. We show that the combined detector not only reduces the inference time compared to running them sequentially, but also maintains the accuracy of each component individually. This work has culminated in the release of OpenPose, the first open-source realtime system for multi-person 2D pose detection, including body, foot, hand, and facial keypoints.
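The abstract does not spell out how a Part Affinity Field associates body parts, so the following is only a minimal sketch of the general idea: a PAF stores a 2D vector at each pixel, and a candidate limb between two detected keypoints can be scored by how well the field aligns with the segment joining them. The function name `paf_association_score`, the nested-list field layout, and the sample count are illustrative assumptions, not the OpenPose API or the authors' implementation.

```python
import math


def paf_association_score(paf_x, paf_y, start, end, num_samples=10):
    """Average alignment between a PAF and the segment start -> end.

    paf_x, paf_y: 2D lists indexed as [y][x] holding the x and y components
        of one limb's affinity field (hypothetical layout for this sketch).
    start, end: (x, y) candidate keypoint locations in pixel coordinates.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb

    total = 0.0
    for i in range(num_samples):
        # Sample evenly spaced points along the segment (nearest pixel).
        t = i / (num_samples - 1) if num_samples > 1 else 0.0
        x = int(round(start[0] + t * dx))
        y = int(round(start[1] + t * dy))
        # Dot product of the field vector with the limb direction.
        total += paf_x[y][x] * ux + paf_y[y][x] * uy
    return total / num_samples


# Toy 5x5 field whose vectors all point along +x.
toy_paf_x = [[1.0] * 5 for _ in range(5)]
toy_paf_y = [[0.0] * 5 for _ in range(5)]

aligned = paf_association_score(toy_paf_x, toy_paf_y, (0, 2), (4, 2))
perpendicular = paf_association_score(toy_paf_x, toy_paf_y, (2, 0), (2, 4))
reversed_limb = paf_association_score(toy_paf_x, toy_paf_y, (4, 2), (0, 2))
```

In this toy field, a candidate limb running along the field scores highest, a perpendicular one scores zero, and a reversed one scores negative. That ordering is what lets a bottom-up system prefer keypoint pairs that belong to the same person.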

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Can LLMs Reason About Attention? Towards Zero-Shot Analysis of Multimodal Classroom Behavior

    cs.HC · 2026-04 · unverdicted · novelty 4.0

    A pipeline uses OpenPose and Gaze-LLE to extract pose and gaze data from classroom videos, deletes the raw footage, and applies an LLM for zero-shot behavioral analysis of student attention.

  2. Real-Time Cellist Postural Evaluation With On-Device Computer Vision

    cs.HC · 2026-04 · unverdicted · novelty 3.0

    Cello Evaluator is a real-time postural feedback system for cellists running on current Android phones via on-device computer vision, validated as user-friendly by experts.