AnyLift: Scaling Motion Reconstruction from Internet Videos via 2D Diffusion

2026 · cs.CV · arXiv 2604.17818

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Reconstructing 3D human motion and human-object interactions (HOI) from Internet videos is a fundamental step toward building large-scale datasets of human behavior. Existing methods struggle to recover globally consistent 3D motion under dynamic cameras, especially for motion types underrepresented in current motion-capture datasets, and face additional difficulty recovering coherent human-object interactions in 3D. We introduce a two-stage framework leveraging 2D diffusion that reconstructs 3D human motion and HOI from Internet videos. In the first stage, we synthesize multi-view 2D motion data for each domain, leveraging 2D keypoints extracted from Internet videos to incorporate human motions that rarely appear in existing MoCap datasets. In the second stage, a camera-conditioned multi-view 2D motion diffusion model is trained on the domain-specific synthetic data to recover 3D human motion and 3D HOI in the world space. We demonstrate the effectiveness of our method on Internet videos featuring challenging motions such as gymnastics, as well as in-the-wild HOI videos, and show that it outperforms prior work in producing realistic human motion and human-object interaction.
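To make the term "multi-view 2D motion data" concrete, the following is a minimal sketch of what one frame of such data could look like: the same set of 3D joints seen as 2D keypoints from several known cameras. It is not the authors' pipeline; the abstract does not say how views are synthesized. The pinhole camera model, the yaw-only camera orbit, and the names `project` and `multi_view_2d` are all assumptions made for illustration.

```python
import math


def project(point, yaw_deg, distance=4.0, focal=1000.0, center=(512.0, 512.0)):
    """Project one 3D point into a hypothetical pinhole camera.

    Assumption: the camera orbits the world origin about the vertical (y) axis
    by `yaw_deg` and sits `distance` units away, looking at the origin.
    """
    x, y, z = point
    yaw = math.radians(yaw_deg)
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate the world point about the y axis, then push it in front of the camera.
    x_cam = c * x + s * z
    z_cam = -s * x + c * z + distance
    if z_cam <= 0:
        raise ValueError("point is behind the camera")
    # Perspective divide followed by the intrinsic scale and offset.
    return (focal * x_cam / z_cam + center[0], focal * y / z_cam + center[1])


def multi_view_2d(joints_3d, yaws_deg):
    """Return {yaw: [2D keypoints]} for one frame of 3D joints."""
    return {yaw: [project(j, yaw) for j in joints_3d] for yaw in yaws_deg}


# Two illustrative joints (not real pose data) viewed from four cameras.
joints = [(0.0, 0.0, 0.0), (0.5, -0.5, 0.0)]
views = multi_view_2d(joints, [0, 90, 180, 270])
```

Each entry of `views` pairs a camera setting with the 2D keypoints it observes. A camera-conditioned model, as described in the abstract, would consume this kind of camera-plus-keypoints pairing, though its actual input format is not stated here.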

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

SAM 3D Animal: Promptable Animal 3D Reconstruction from Images in the Wild

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

SAM 3D Animal is the first promptable framework for multi-animal 3D reconstruction from single images, built on SMAL+ and trained on the new Herd3D dataset, achieving SOTA results on Animal3D, APTv2, and Animal Kingdom benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

SAM 3D Animal: Promptable Animal 3D Reconstruction from Images in the Wild

cs.CV · 2026-05-08 · unverdicted · none · ref 23 · internal anchor
