MotionCtrl: A unified and flexible motion controller for video generation

Zhouxia Wang, Ziyang Yuan, Xintao Wang, Yaowei Li, Tianshui Chen, Menghan Xia + 2 more · 2024 · Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24 · DOI 10.1145/3641519.3657518

5 Pith papers cite this work, alongside 149 external citations. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Geo-Align: Video Generation Alignment via Metric Geometry Reward

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

Geo-Align applies RL with a perceptual reward derived from 3D camera trajectory estimation to improve controllability and fidelity in video generation without paired training data.

Preserve, Reveal, Expand: Faithful 4D Video Editing with Region-Aware Conditioning

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

PREX decomposes target 4D video volumes into Preserve, Reveal, and Expand roles with a region-aware adapter on a frozen diffusion backbone, trained via proxy tasks, and introduces the PREBench benchmark to reduce region-structured editing failures.

ReactiveGWM: Steering NPC in Reactive Game World Models

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

ReactiveGWM introduces a decoupled diffusion architecture for player-NPC interactions that learns game-agnostic response logic for zero-shot strategy transfer across games.

$h$-control: Training-Free Camera Control via Block-Conditional Gibbs Refinement

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

h-control augments hard-replacement guidance with block-conditional pseudo-Gibbs refinement on unobserved latent sites and adaptive 3D patch freezing to achieve superior FVD on RealEstate10K and DAVIS.

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

cs.CV · 2026-05-14 · unverdicted · novelty 5.0

SANA-WM is a 2.6B-parameter efficient world model that synthesizes minute-scale 720p videos with 6-DoF camera control, trained on 213K public clips in 15 days on 64 H100s and runnable on single GPUs at 36x higher throughput than prior open baselines.

citing papers explorer


Geo-Align: Video Generation Alignment via Metric Geometry Reward · cs.CV · 2026-05-22 · unverdicted · none · ref 14
Preserve, Reveal, Expand: Faithful 4D Video Editing with Region-Aware Conditioning · cs.CV · 2026-05-20 · unverdicted · none · ref 26
ReactiveGWM: Steering NPC in Reactive Game World Models · cs.CV · 2026-05-14 · unverdicted · none · ref 39
$h$-control: Training-Free Camera Control via Block-Conditional Gibbs Refinement · cs.CV · 2026-05-12 · unverdicted · none · ref 40 · 2 links
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer · cs.CV · 2026-05-14 · unverdicted · none · ref 60
