pith. sign in

arxiv: 2603.25726 · v3 · pith:ZIDSSHZGnew · submitted 2026-03-26 · 💻 cs.CV

AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation

classification 💻 cs.CV
keywords anyhandtrainingdataestimationhandposeexistingrgb-d
0
0 comments X
read the original abstract

We present AnyHand, a large-scale synthetic dataset designed to advance the state of the art in 3D hand pose estimation. While recent works with foundation approaches have shown that scaling training data markedly improves hand pose estimation, existing real-world datasets are limited in coverage, and prior synthetic datasets rarely provide occlusions, arm details, and aligned depth together at scale. To address this bottleneck, our proposed AnyHand contains 2.5M single-hand and 4.1M hand-object interaction RGB-D images, with rich geometric annotations. We show that extending the original training data recipes of existing RGB baselines with AnyHand yields significant gains on multiple benchmarks (FreiHAND and HO-3D), even when keeping the architectures and training schemes fixed. Together with extensive ablations on the scale and composition of the training data setups, these results suggest that training data diversity and quality are as critical as scale for advancing hand pose estimation. We further examine the utility of AnyHand's aligned depth maps in the appendix, showing that scaling RGB-D supervision with AnyHand allows a lightweight depth-fusion variant of existing RGB baselines to outperform prior RGB-D methods.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.