
arxiv: 1704.02827 · v2 · pith:JHNJJVTSnew · submitted 2017-04-10 · 💻 cs.CV

Learning Human Motion Models for Long-term Predictions

classification 💻 cs.CV
keywords motion, sequences, time, model, assess, autoencoder, data, drift

We propose a new architecture for learning predictive spatio-temporal motion models from data alone. Our approach, dubbed the Dropout Autoencoder LSTM, is capable of synthesizing natural-looking motion sequences over long time horizons without catastrophic drift or motion degradation. The model consists of two components: a 3-layer recurrent neural network that models temporal aspects, and a novel autoencoder that is trained to implicitly recover the spatial structure of the human skeleton by randomly removing information about joints during training. This Dropout Autoencoder (D-AE) is then used to filter each pose predicted by the LSTM, reducing the accumulation of error and hence drift over time. Furthermore, we propose new evaluation protocols to assess the quality of synthetic motion sequences, even those for which no ground-truth data exists. The proposed protocols can be used to assess generated sequences of arbitrary length. Finally, we evaluate our proposed method on two of the largest motion-capture datasets available to date and show that our model outperforms the state of the art on a variety of actions, including cyclic and acyclic motion, and that it can produce natural-looking sequences over longer time horizons than previous methods.
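The two mechanisms named in the abstract, joint dropout during autoencoder training and filtering each predicted pose before it is fed back, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `drop_joints` and `rollout`, the `(num_joints, 3)` pose layout, and the choice to zero out dropped joints are all assumptions made for illustration. The recurrent predictor and the trained autoencoder are passed in as plain callables, so any recurrent state is assumed to live inside `predict_next`.

```python
import numpy as np


def drop_joints(pose, drop_prob, rng):
    """Corrupt a pose by zeroing whole joints at random.

    pose: array of shape (num_joints, 3), one 3D position per joint.
    Zeroing is an assumed corruption; the abstract only says that
    information about joints is randomly removed.
    Returns the corrupted pose and the boolean keep-mask.
    """
    keep = rng.random(pose.shape[0]) >= drop_prob
    return pose * keep[:, None], keep


def rollout(predict_next, filter_pose, seed_pose, num_steps):
    """Autoregressive rollout: predict a pose, filter it, feed it back.

    predict_next: stand-in for the recurrent predictor (hypothetical).
    filter_pose:  stand-in for the trained autoencoder (hypothetical).
    Returns an array of shape (num_steps, num_joints, 3).
    """
    poses = []
    pose = seed_pose
    for _ in range(num_steps):
        # Filtering before feedback is what limits error accumulation.
        pose = filter_pose(predict_next(pose))
        poses.append(pose)
    return np.stack(poses)
```

During training, the autoencoder would receive the corrupted pose from `drop_joints` and be asked to reconstruct the original; at prediction time the same network plays the role of `filter_pose` inside `rollout`.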



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A neural network based post-filter for speech-driven head motion synthesis

    eess.SP 2019-07 unverdicted novelty 4.0

    A neural network post-filter trained to reconstruct head motions improves de-noising and smoothness over linear filters in speech-driven head motion synthesis.