VHOI densifies sparse trajectories into color-encoded HOI mask sequences and conditions a fine-tuned video diffusion model on them to produce controllable human-object interaction videos, including full navigation sequences.
Mask2iv: Interaction-centric video generation via mask tra- jectories
3 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
A comprehensive survey that organizes the literature on world models in robot learning, their roles in policy learning, planning, simulation, and video-based generation, with connections to navigation, driving, datasets, and benchmarks.
citing papers explorer
-
VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification
VHOI densifies sparse trajectories into color-encoded HOI mask sequences and conditions a fine-tuned video diffusion model on them to produce controllable human-object interaction videos, including full navigation sequences.
-
World Model for Robot Learning: A Comprehensive Survey
A comprehensive survey that organizes the literature on world models in robot learning, their roles in policy learning, planning, simulation, and video-based generation, with connections to navigation, driving, datasets, and benchmarks.