MoRight disentangles object and camera motion via canonical-view specification and temporal cross-view attention, while decomposing motion into active user-driven and passive consequence components to learn and apply causality in video generation.
arXiv preprint arXiv:2401.00896 (2024) 12
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
SVI-Bench is a 35K-hour sports video benchmark with 9 tasks across four cognitive pillars that reveals multimodal models drop from ~73% on action QA to 5% on agentic evidence-gathering tasks.
Pantheon360 introduces a controllable 360° video diffusion framework that uses an explicit 3D cache from sparse inputs to enforce geometric consistency for digital twin generation.
citing papers explorer
- MoRight: Motion Control Done Right
  MoRight disentangles object and camera motion via canonical-view specification and temporal cross-view attention, while decomposing motion into active user-driven and passive consequence components to learn and apply causality in video generation.
- SVI-Bench: A Dynamic Microworld for Strategic Video Intelligence
  SVI-Bench is a 35K-hour sports video benchmark with 9 tasks across four cognitive pillars that reveals multimodal models drop from ~73% on action QA to 5% on agentic evidence-gathering tasks.
- Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
  Pantheon360 introduces a controllable 360° video diffusion framework that uses an explicit 3D cache from sparse inputs to enforce geometric consistency for digital twin generation.