HuMMan: Multi-modal 4D human dataset for versatile sensing and modeling
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- H-Flow: Self-supervised Human Scene Flow via Physics-inspired Joint Multi-modal Learning
H-Flow learns dense human scene flow from monocular video via joint pose and depth prediction in a multi-head transformer, using physics-inspired geometric and biomechanical priors for self-supervision, and introduces the DynAct4D synthetic benchmark.