Can 3D Pose be Learned from 2D Projections Alone?
3D pose estimation from a single image is a challenging task in computer vision. We present a weakly supervised approach to estimate 3D pose points, given only 2D pose landmarks. Our method does not require correspondences between 2D and 3D points to build explicit 3D priors. We utilize an adversarial framework to impose a prior on the 3D structure, learned solely from its random 2D projections. Given a set of 2D pose landmarks, the generator network hypothesizes their depths to obtain a 3D skeleton. We propose a novel Random Projection layer, which randomly projects the generated 3D skeleton and sends the resulting 2D pose to the discriminator. The discriminator improves by discriminating between the generated poses and pose samples from a real distribution of 2D poses. Training does not require correspondence between the 2D inputs to either the generator or the discriminator. We apply our approach to the task of 3D human pose estimation. Results on the Human3.6M dataset demonstrate that our approach outperforms many previous supervised and weakly supervised approaches.
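The lift-then-reproject step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `lift_to_3d`, `project`, and `random_projection` are hypothetical, and the choice of a rotation about the vertical axis through the skeleton's centroid followed by an orthographic projection is an assumption made for simplicity, since the abstract does not specify the camera model or how rotations are sampled. The depths would come from the generator network, which is omitted here.

```python
import math
import random


def lift_to_3d(pose_2d, depths):
    """Attach one hypothesized depth to each 2D landmark, giving (x, y, z) joints."""
    return [(x, y, z) for (x, y), z in zip(pose_2d, depths)]


def project(pose_3d, theta):
    """Rotate the skeleton by theta about the vertical (y) axis through its
    centroid, then drop the rotated depth (orthographic projection; assumed)."""
    n = len(pose_3d)
    cx = sum(p[0] for p in pose_3d) / n
    cz = sum(p[2] for p in pose_3d) / n
    c, s = math.cos(theta), math.sin(theta)
    # Rotation about y: x' = c*x + s*z, y' = y. Only (x', y') is kept.
    return [(c * (x - cx) + s * (z - cz), y) for x, y, z in pose_3d]


def random_projection(pose_3d, rng):
    """Project from a uniformly sampled viewing angle (assumed sampling scheme)."""
    return project(pose_3d, rng.uniform(0.0, 2.0 * math.pi))


# Toy 3-joint "skeleton" with made-up depths standing in for generator output.
pose_2d = [(0.0, 1.0), (1.0, 0.0), (-1.0, 0.0)]
depths = [0.5, -0.5, 0.0]
pose_3d = lift_to_3d(pose_2d, depths)
fake_2d = random_projection(pose_3d, random.Random(0))
```

In an adversarial setup of the kind the abstract outlines, `fake_2d` would be passed to the discriminator alongside real 2D poses, so the generator is pushed to produce depths whose projections look plausible from any sampled viewpoint.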
Forward citations
Cited by 1 Pith paper
- xR-EgoPose: Egocentric 3D Human Pose from an HMD Camera
A dual-branch decoder network trained on the new xR-EgoPose synthetic dataset achieves state-of-the-art egocentric 3D pose estimation from HMD fish-eye cameras and generalizes to real footage.