PREX decomposes target 4D video volumes into Preserve, Reveal, and Expand roles with a region-aware adapter on a frozen diffusion backbone, trained via proxy tasks, and introduces the PREBench benchmark to reduce region-structured editing failures.
arXiv preprint arXiv:2505.22944 (2025)
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 6years
2026 6representative citing papers
MoRight disentangles object and camera motion via canonical-view specification and temporal cross-view attention, while decomposing motion into active user-driven and passive consequence components to learn and apply causality in video generation.
A synthetic data pipeline and fine-tuned video model enable generative editing to move object 3D trajectories in videos while keeping relative motion.
Self-supervised models learn to perceive and manipulate the flow of time in videos, supporting speed detection, large-scale slow-motion data curation, and temporally controllable video synthesis.
A new occlusion-aware control module generates high-fidelity egocentric videos from sparse 3D hand joints, supported by a million-clip dataset and cross-embodiment benchmark.
citing papers explorer
-
Preserve, Reveal, Expand: Faithful 4D Video Editing with Region-Aware Conditioning
PREX decomposes target 4D video volumes into Preserve, Reveal, and Expand roles with a region-aware adapter on a frozen diffusion backbone, trained via proxy tasks, and introduces the PREBench benchmark to reduce region-structured editing failures.
-
MoRight: Motion Control Done Right
MoRight disentangles object and camera motion via canonical-view specification and temporal cross-view attention, while decomposing motion into active user-driven and passive consequence components to learn and apply causality in video generation.
-
TrajectoryMover: Generative Movement of Object Trajectories in Videos
A synthetic data pipeline and fine-tuned video model enable generative editing to move object 3D trajectories in videos while keeping relative motion.
-
Seeing Fast and Slow: Learning the Flow of Time in Videos
Self-supervised models learn to perceive and manipulate the flow of time in videos, supporting speed detection, large-scale slow-motion data curation, and temporally controllable video synthesis.
-
Controllable Egocentric Video Generation via Occlusion-Aware Sparse 3D Hand Joints
A new occlusion-aware control module generates high-fidelity egocentric videos from sparse 3D hand joints, supported by a million-clip dataset and cross-embodiment benchmark.
- Eulerian Motion Guidance: Robust Image Animation via Bidirectional Geometric Consistency