The work introduces the ORVOS task, the ORVOSB benchmark with causal annotations across 210 videos, and a baseline using updated prompts plus a temporal token reservoir.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
A diffusion-based pipeline creates a 27M-annotation dataset of object placements that outperforms human annotations and baselines on image editing tasks, then distills it into a fast model.
A synthetic data pipeline and fine-tuned video model enable generative editing to move object 3D trajectories in videos while keeping relative motion.
citing papers explorer
-
Online Reasoning Video Object Segmentation
The work introduces the ORVOS task, the ORVOSB benchmark with causal annotations across 210 videos, and a baseline using updated prompts plus a temporal token reservoir.
-
HiddenObjects: Scalable Diffusion-Distilled Spatial Priors for Object Placement
A diffusion-based pipeline creates a 27M-annotation dataset of object placements that outperforms human annotations and baselines on image editing tasks, then distills it into a fast model.
-
TrajectoryMover: Generative Movement of Object Trajectories in Videos
A synthetic data pipeline and fine-tuned video model enable generative editing to move object 3D trajectories in videos while keeping relative motion.