arXiv preprint arXiv:2510.26796

Dongyue Lu, Ao Liang, Tianxin Huang, Xiao Fu, Yuyang Zhao, Baorui Ma, Liang Pan, Wei Yin, Lingdong Kong, Wei Tsang Ooi, et al. · 2025 · arXiv 2510.26796

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Geo-Align: Video Generation Alignment via Metric Geometry Reward

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

Geo-Align applies RL with a perceptual reward derived from 3D camera trajectory estimation to improve controllability and fidelity in video generation without paired training data.

RealCam: Real-Time Novel-View Video Generation with Interactive Camera Control

cs.CV · 2026-05-07 · unverdicted · novelty 6.0

RealCam is a causal autoregressive model for real-time camera-controlled video-to-video generation, using cross-frame in-context teacher distillation and loop-closed data augmentation to achieve high fidelity and consistency.

Embody4D: A Generalist 4D World Model for Embodied AI

cs.CV · 2026-05-03 · unverdicted · novelty 5.0

Embody4D generates high-fidelity, view-consistent novel views from monocular videos for embodied scenarios via 3D-aware data synthesis, adaptive noise injection, and interaction-aware attention.

citing papers explorer

Showing 3 of 3 citing papers.

Geo-Align: Video Generation Alignment via Metric Geometry Reward

cs.CV · 2026-05-22 · unverdicted · none · ref 24

Geo-Align applies RL with a perceptual reward derived from 3D camera trajectory estimation to improve controllability and fidelity in video generation without paired training data.

RealCam: Real-Time Novel-View Video Generation with Interactive Camera Control

cs.CV · 2026-05-07 · unverdicted · none · ref 8

RealCam is a causal autoregressive model for real-time camera-controlled video-to-video generation, using cross-frame in-context teacher distillation and loop-closed data augmentation to achieve high fidelity and consistency.

Embody4D: A Generalist 4D World Model for Embodied AI

cs.CV · 2026-05-03 · unverdicted · none · ref 36

Embody4D generates high-fidelity, view-consistent novel views from monocular videos for embodied scenarios via 3D-aware data synthesis, adaptive noise injection, and interaction-aware attention.
