Reconstruction under VLM Limitations. Based on our user study, the text descriptions generated by the VLM are, on average, superior to those produced by human annotators.

Discussions 8 · arXiv 1973.8109

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance

cs.CV · 2026-04-15 · unverdicted · novelty 5.0

SocialMirror reconstructs 3D meshes of closely interacting humans from monocular videos using semantic guidance from vision-language models and geometric constraints in a diffusion model to handle occlusions and maintain temporal and spatial consistency.

citing papers explorer

Showing 1 of 1 citing paper.

SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance

cs.CV · 2026-04-15 · unverdicted · none · ref 76
