Learning to Estimate 3D Hand Pose from Single RGB Images

Christian Zimmermann; Thomas Brox

arxiv: 1705.01389 · v3 · pith:5TXHQKLFnew · submitted 2017-05-03 · 💻 cs.CV



keywords: hand, pose, images, depth, single, deep, estimates, estimation


Low-cost consumer depth cameras and deep learning have enabled reasonable 3D hand pose estimation from single depth images. In this paper, we present an approach that estimates 3D hand pose from regular RGB images. This task has far more ambiguities due to the missing depth information. To this end, we propose a deep network that learns a network-implicit 3D articulation prior. Together with detected keypoints in the images, this network yields good estimates of the 3D pose. We introduce a large scale 3D hand pose dataset based on synthetic hand models for training the involved networks. Experiments on a variety of test sets, including one on sign language recognition, demonstrate the feasibility of 3D hand pose estimation on single color images.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

LACE: Latent Visual Representation for Cross-Embodiment Learning
cs.RO 2026-05 unverdicted novelty 6.0

LACE aligns human-robot visual features via semantic distribution matching on corresponding body parts plus Gram loss, yielding 65% better zero-shot policy transfer than baseline DINO.