
arXiv: 1907.03967 · v1 · pith:IDLYXMLQnew · submitted 2019-07-09 · 💻 cs.CV · cs.IT · math.IT

On the Exact Recovery Conditions of 3D Human Motion from 2D Landmark Motion with Sparse Articulated Motion

Pith reviewed 2026-05-25 00:54 UTC · model grok-4.3

classification 💻 cs.CV · cs.IT · math.IT
keywords 3D human motion recovery · 2D landmark motion · sparse articulated motion · exact recovery conditions · Projective Kinematic Space Property · l1 minimization · kinematic model · reprojection constraint

The pith

Exact 3D human motion from 2D landmarks is recoverable if and only if the Projective Kinematic Space Property holds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to find the precise conditions under which a sparse kinematic model recovers the true 3D angular articulations from observed 2D landmark trajectories. It starts from the fact that, at high frame rates, most angular velocities are zero even when the whole body moves rigidly, then formulates an ideal cardinality-minimization problem and its convex l1 relaxation subject to a differentiated reprojection constraint. The central theorem states that the relaxed program returns the exact ground-truth motion precisely when the Projective Kinematic Space Property is satisfied, and that this same condition also makes the relaxed solution identical to the ideal l0 solution. A reader cares because the result supplies a verifiable certificate that turns an under-determined inverse problem into a reliably solvable one without needing extra views or priors.

Core claim

The paper proves that the relaxed l1 formulation recovers the exact 3D human motion vector from 2D landmark motion if and only if the Projective Kinematic Space Property (PKSP) holds; it further shows that the relaxed formulation yields the same ground-truth solution as the ideal l0 formulation under exactly the same condition.
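
In generic notation, the two programs can be sketched as follows. The symbols are assumptions introduced here for illustration, since this page does not reproduce the paper's notation: $\dot{\theta}$ is the vector of angular articulation velocities, $\dot{u}$ stacks the observed 2D landmark velocities (two rows per landmark), and $A$ stands for the linear map obtained by differentiating the reprojection constraint through the kinematic model.

```latex
\begin{align}
(\ell_0)\quad & \min_{\dot{\theta}} \ \|\dot{\theta}\|_0
  \quad \text{s.t.} \quad A\,\dot{\theta} = \dot{u}, \\
(\ell_1)\quad & \min_{\dot{\theta}} \ \|\dot{\theta}\|_1
  \quad \text{s.t.} \quad A\,\dot{\theta} = \dot{u}.
\end{align}
```

Under this reading, the claim is that the minimizer of $(\ell_1)$ equals the ground-truth $\dot{\theta}$, and coincides with the minimizer of $(\ell_0)$, exactly when PKSP holds.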

What carries the argument

The Projective Kinematic Space Property (PKSP), a condition that encodes both the differentiated reprojection equality and the linear kinematic mapping from angular velocities to 3D point velocities, guaranteeing that the sparse solution is the unique minimizer.

If this is right

  • The relaxed program succeeds on simulated data whose angular motion is deliberately sparse.
  • The same program produces usable 3D motion estimates on the HUMAN3.6M, PANOPTIC and MPI-I3DHP sequences.
  • Recovery remains possible when the number of angular degrees of freedom exceeds twice the number of observed 2D landmarks.
  • Verification of PKSP supplies a practical test that the recovered motion is exact rather than merely plausible.
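
The under-determined regime in the third bullet can be illustrated with a toy problem. The sketch below is not the paper's solver: the matrix `A` is a hypothetical stand-in for the differentiated reprojection operator (2 rows, as for one 2D landmark, and 3 angular unknowns), and the helper names are invented for this example. It relies on a standard linear-programming fact: when `A` has full row rank, a minimum-l1 solution of `A x = b` can be found among solutions supported on as many linearly independent columns as `A` has rows, so a tiny instance can be solved exactly by enumeration.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination; None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def l1_minimizer(A, b):
    """Return a minimum-l1 solution of A x = b (A assumed to have full row rank)."""
    m, n = len(A), len(A[0])
    best = None
    for support in combinations(range(n), m):
        sub = [[A[i][j] for j in support] for i in range(m)]
        z = solve_square(sub, b)
        if z is None:  # chosen columns are linearly dependent
            continue
        x = [Fraction(0)] * n
        for j, v in zip(support, z):
            x[j] = v
        if best is None or sum(map(abs, x)) < sum(map(abs, best)):
            best = x
    return best


# Hypothetical operator: 2 equations, 3 unknowns (under-determined).
A = [[1, -1, 0],
     [0, 1, -1]]
x_true = [0, 0, 1]  # sparse ground truth: only one joint moves
b = [sum(a * x for a, x in zip(row, x_true)) for row in A]
x_hat = l1_minimizer(A, b)
print(x_hat == x_true)
```

Enumeration grows combinatorially with the number of unknowns, so this only serves to make the recovery claim concrete on a small instance; a practical implementation would hand the same program to a linear-programming solver.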

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • PKSP could be monitored at runtime to decide whether to trust a recovered motion or to request additional frames.
  • The same sparsity-plus-PKSP argument might apply to other articulated systems whose joint velocities are sparse at high sampling rates.
  • If frame rate drops and many joints move simultaneously, PKSP is likely to fail, indicating the need for a different regularizer or multi-view fusion.

Load-bearing premise

At high tracking rates only a few angular articulations have non-zero velocity, independent of any global rigid motion.

What would settle it

A counter-example sequence of 2D landmark trajectories for which the l1 solver returns a motion vector different from the known ground-truth angular velocities even though PKSP is satisfied, or returns the ground-truth vector when PKSP is violated.

Figures

Figures reproduced from arXiv: 1907.03967 by Abed Malti.

Figure 1: Top: skeleton and joints. Usually, the links en...
Figure 2: Sketch of possible ambiguities when recon...
Figure 3: Skeleton with 40 angular articulated joints (40-...
Figure 4: Comparing accuracy and specificity between (RF) and (L2).
Figure 5: HUMAN3.6M. 3D reconstructions (Sitting)
Figure 6: PANOPTIC. 3D reconstructions (Pose 1, id=141216 with cam HD 2).
Figure 8: 3D error evaluation with occluded joints. The...
Figure 9: Sample ground-truth joint-to-joint lengths.
Figure 10: Fraction of joints exceeding error threshold
Original abstract

In this paper, we address the problem of exact recovery condition in retrieving 3D human motion from 2D landmark motion. We use a skeletal kinematic model to represent the 3D human motion as a vector of angular articulation motion. We address this problem based on the observation that at high tracking rate, regardless of the global rigid motion, only few angular articulations have non-zero motion. We propose a first ideal formulation with $\ell_0$-norm to minimize the cardinal of non-zero angular articulation motion given an equality constraint on the time-differentiation of the reprojection error. The second relaxed formulation relies on an $\ell_1$-norm to minimize the sum of absolute values of the angular articulation motion. This formulation has the advantage of being able to provide 3D motion even in the under-determined case when twice the number of 2D landmarks is smaller than the number of angular articulations. We define a specific property which is the Projective Kinematic Space Property (PKSP) that takes into account the reprojection constraint and the kinematic model. We prove that for the relaxed formulation we are able to recover the exact 3D human motion from 2D landmarks if and only if the PKSP property is verified. We further demonstrate that solving the relaxed formulation provides the same ground-truth solution as the ideal formulation if and only if the PKSP condition is filled. Results with simulated sparse skeletal angular motion show the ability of the proposed method to recover exact location of angular motion. We provide results on publicly available real data (HUMAN3.6M, PANOPTIC and MPI-I3DHP).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to address exact recovery of 3D human motion from 2D landmark motion by representing motion via a skeletal kinematic model of angular articulations. Motivated by the empirical observation that at high frame rates only few articulations exhibit non-zero motion (independent of global rigid motion), it introduces an ideal ℓ0-cardinality formulation minimizing non-zero articulations subject to a time-differentiated reprojection equality constraint, and a convex ℓ1 relaxation. The manuscript defines the Projective Kinematic Space Property (PKSP) that incorporates the reprojection constraint and kinematic model, then asserts that the ℓ1 program recovers the exact 3D motion if and only if PKSP holds, and that the ℓ1 solution coincides with the ground-truth ℓ0 solution under the same condition. Validation is provided on simulated sparse angular motions and real datasets (HUMAN3.6M, PANOPTIC, MPI-I3DHP).

Significance. If the PKSP-based if-and-only-if statements are rigorously established with an independently verifiable property, the work would supply a compressed-sensing-style guarantee for exact recovery in under-determined kinematic settings, which is of clear value for robust 3D human motion estimation. The explicit use of publicly available datasets is a positive feature that supports reproducibility and empirical testing of the claimed conditions.

major comments (2)
  1. [Abstract / PKSP section] Abstract and PKSP definition: the manuscript asserts an if-and-only-if recovery theorem via PKSP, yet the provided text does not contain the full derivation, the precise matrix-level definition of PKSP, or an explicit argument showing that PKSP can be checked from the kinematic Jacobian and reprojection operator without reference to the optimization outcome. Because this equivalence is the central claim, the complete proof (including any supporting lemmas on the null-space or range properties) must be supplied in a dedicated section so that the statement can be verified independently of the optimization result.
  2. [Introduction / sparsity model paragraph] The observation that “at high tracking rate, regardless of the global rigid motion, only few angular articulations have non-zero motion” is presented as the basis for the sparsity model. The manuscript should clarify whether this sparsity is merely motivational or is used inside the recovery analysis, and should supply either a quantitative justification (e.g., statistics on the cited datasets) or a reference establishing the claim at the frame rates employed.
minor comments (2)
  1. [Abstract] The phrase “if and only if the PKSP condition is filled” appears in the abstract; standard mathematical English uses “satisfied” or “holds.”
  2. [Abstract / formulation sections] Notation for the time-differentiated reprojection error and the mapping from angular velocities to 2D landmark velocities should be introduced once and used consistently; currently the abstract refers to both without an explicit symbol.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below and will revise the manuscript to incorporate the requested clarifications and expansions.

  1. Referee: [Abstract / PKSP section] Abstract and PKSP definition: the manuscript asserts an if-and-only-if recovery theorem via PKSP, yet the provided text does not contain the full derivation, the precise matrix-level definition of PKSP, or an explicit argument showing that PKSP can be checked from the kinematic Jacobian and reprojection operator without reference to the optimization outcome. Because this equivalence is the central claim, the complete proof (including any supporting lemmas on the null-space or range properties) must be supplied in a dedicated section so that the statement can be verified independently of the optimization result.

    Authors: We agree that the current manuscript does not present the full derivation or matrix-level definition of PKSP in sufficient detail for independent verification. The if-and-only-if claim is based on null-space properties of the combined kinematic Jacobian and reprojection operator, analogous to the null-space property in compressed sensing. We will add a dedicated section (e.g., Section 3.3) containing the complete proof, the explicit matrix definition of PKSP, and the argument that PKSP is checkable directly from the Jacobian and reprojection matrix without reference to any optimization outcome. revision: yes

  2. Referee: [Introduction / sparsity model paragraph] The observation that “at high tracking rate, regardless of the global rigid motion, only few angular articulations have non-zero motion” is presented as the basis for the sparsity model. The manuscript should clarify whether this sparsity is merely motivational or is used inside the recovery analysis, and should supply either a quantitative justification (e.g., statistics on the cited datasets) or a reference establishing the claim at the frame rates employed.

    Authors: The sparsity observation motivates the ℓ0/ℓ1 formulation, which minimizes the number of non-zero angular articulations. It also enters the recovery analysis, because PKSP is defined with respect to solutions whose support is sparse in the angular articulation space. We will make both roles explicit in the introduction and add quantitative justification by reporting the average number of non-zero angular velocities observed on HUMAN3.6M, PANOPTIC, and MPI-I3DHP at the frame rates used in the experiments. revision: yes

Circularity Check

0 steps flagged

No significant circularity; PKSP is an independently defined recovery characterization


The central claim is a standard if-and-only-if characterization theorem: the paper defines PKSP from the reprojection error constraint and kinematic model, then proves that l1 recovery equals the ground-truth l0 solution precisely when PKSP holds. This structure is mathematically non-circular (equivalent to NSP/RIP-style results in sparse recovery literature) and does not reduce the theorem to a tautology by construction. The sparsity observation is used only for motivation of the model, not as a hidden premise inside the proof. No self-citation chains, fitted inputs renamed as predictions, or ansatzes smuggled via citation appear in the derivation chain.
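
For readers unfamiliar with the NSP-style results mentioned above, the sketch below checks the classical null-space property from compressed sensing, not the paper's PKSP, whose matrix-level definition is not reproduced on this page. The classical property of order s requires that every non-zero null-space vector v satisfies ||v_S||_1 < ||v_{S^c}||_1 for every index set S of size at most s. The function name and the example matrix are hypothetical, and the check is restricted to the simple case of a one-dimensional null space, where testing the s largest magnitudes of a single spanning vector suffices.

```python
def satisfies_nsp(null_vector, s):
    """Classical null-space property of order s for a matrix whose null space
    is spanned by the single non-zero vector `null_vector`.

    The worst-case index set S holds the s largest magnitudes, and the
    condition is scale-invariant, so one spanning vector is enough.
    """
    mags = sorted((abs(v) for v in null_vector), reverse=True)
    return sum(mags[:s]) < sum(mags[s:])


# Hypothetical 2x3 operator whose null space is spanned by (1, 1, 1).
A = [[1, -1, 0],
     [0, 1, -1]]
v = [1, 1, 1]
residual = [sum(a * x for a, x in zip(row, v)) for row in A]  # confirms A v = 0

print(residual)
print(satisfies_nsp(v, 1))
print(satisfies_nsp(v, 2))
```

The appeal of a condition of this kind is that it depends only on the operator, not on the output of any solver, which is the independence the referee's first major comment asks the authors to demonstrate for PKSP.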

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the domain assumption of sparse angular velocity at high frame rates and on the newly invented PKSP property whose independent falsifiability is not shown.

axioms (1)
  • Domain assumption: At high tracking rate, regardless of the global rigid motion, only few angular articulations have non-zero motion.
    Explicitly stated in the abstract as the observation enabling the sparsity model.
invented entities (1)
  • Projective Kinematic Space Property (PKSP): no independent evidence.
    Purpose: characterizes when the l1 minimization recovers the exact sparse angular motion under the reprojection constraint.
    Defined inside the paper; no external validation or falsifiable prediction outside the optimization is provided.

pith-pipeline@v0.9.0 · 5834 in / 1425 out tokens · 34753 ms · 2026-05-25T00:54:59.855303+00:00 · methodology


