QuaterNet: A Quaternion-based Recurrent Model for Human Motion

Dario Pavllo , David Grangier , Michael Auli

classification cs.CV

keywords quaternet, angle, errors, human, joint, recurrent, rotations, skeleton

Deep learning for predicting or generating 3D human pose sequences is an active research area. Previous work regresses either joint rotations or joint positions. The former strategy is prone to error accumulation along the kinematic chain, as well as discontinuities when using Euler angle or exponential map parameterizations. The latter requires re-projection onto skeleton constraints to avoid bone stretching and invalid configurations. This work addresses both limitations. Our recurrent network, QuaterNet, represents rotations with quaternions and our loss function performs forward kinematics on a skeleton to penalize absolute position errors instead of angle errors. On short-term predictions, QuaterNet improves the state-of-the-art quantitatively. For long-term generation, our approach is qualitatively judged as realistic as recent neural strategies from the graphics literature.
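The idea of penalizing absolute joint positions through forward kinematics, rather than penalizing angle errors directly, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`qmul`, `qrot`, `forward_kinematics`, `position_loss`), the (w, x, y, z) quaternion ordering, the toy three-joint skeleton, and the use of a mean Euclidean distance are all assumptions made for illustration. A real model would compute the same quantities in an automatic-differentiation framework so that gradients flow back to the predicted quaternions.

```python
import math

def qmul(q, r):
    """Hamilton product of two quaternions in (w, x, y, z) order (assumed convention)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def normalize(q):
    """Scale a quaternion to unit length so it represents a valid rotation."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def qrot(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    return qmul(qmul(q, (0.0, v[0], v[1], v[2])), conj)[1:]

def forward_kinematics(rotations, offsets, parents):
    """Turn local joint rotations into world-space joint positions.

    rotations: one local quaternion per joint
    offsets:   bone offset of each joint relative to its parent (fixed skeleton)
    parents:   parent index per joint, -1 for the root; parents must precede children
    """
    world_rot, world_pos = [], []
    for q, off, p in zip(rotations, offsets, parents):
        q = normalize(q)
        if p < 0:
            world_rot.append(q)
            world_pos.append(tuple(float(c) for c in off))
        else:
            moved = qrot(world_rot[p], off)
            world_pos.append(tuple(a + b for a, b in zip(world_pos[p], moved)))
            world_rot.append(qmul(world_rot[p], q))
    return world_pos

def position_loss(pred_rot, target_rot, offsets, parents):
    """Mean Euclidean distance between predicted and target joint positions."""
    pred = forward_kinematics(pred_rot, offsets, parents)
    target = forward_kinematics(target_rot, offsets, parents)
    return sum(math.dist(a, b) for a, b in zip(pred, target)) / len(parents)

# Toy chain: root -> joint 1 -> joint 2, each bone one unit long along x.
identity = (1.0, 0.0, 0.0, 0.0)
c = math.sqrt(0.5)
root_turned = (c, 0.0, 0.0, c)  # 90 degrees about the z axis
offsets = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
parents = [-1, 0, 1]

loss = position_loss([root_turned, identity, identity], [identity] * 3, offsets, parents)
print(loss)
```

In this toy chain a single wrong rotation at the root displaces every joint below it, and joints further down the chain are displaced more. A position-based loss therefore weights an error near the root more heavily than the same angular error at a leaf joint, which is the error-accumulation effect the abstract describes. Because bone offsets are fixed, bone lengths cannot stretch.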

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

EggHand: A Multimodal Foundation Model for Egocentric Hand Pose Forecasting
cs.CV 2026-05 unverdicted novelty 6.0

EggHand unifies VLA action decoding with viewpoint-aware video-text encoding to forecast egocentric hand poses, achieving SOTA accuracy on EgoExo4D while remaining robust to ego-motion and controllable via language prompts.

Next-Scale Autoregressive Models for Text-to-Motion Generation
cs.CV 2026-04 unverdicted novelty 6.0

MoScale introduces a hierarchical next-scale autoregressive framework for text-to-motion generation that achieves state-of-the-art performance by refining motions from coarse to fine temporal resolutions.