Unified camera positional encoding for controlled video generation

Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei Cai · 2026 · arXiv 2512.07237

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

RayPE: Ray-Space Positional Encoding for 3D-Aware Video Generation

cs.CV · 2026-06-25 · unverdicted · novelty 7.0

RayPE extends video DiT attention with Plucker coordinates and a gated reciprocal-product term to improve 3D consistency and camera controllability.

CRePE: Curved Ray Expectation Positional Encoding for Unified-Camera-Controlled Video Generation

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

CRePE supplies depth-aware positional distributions along curved rays for stable unified-camera control in frozen video DiT models.

Geometrically Consistent Multi-View Scene Generation from Freehand Sketches

cs.CV · 2026-04-15 · unverdicted · novelty 7.0

A framework generates consistent multi-view scenes from one freehand sketch via a ~9k-sample dataset, Parallel Camera-Aware Attention Adapters, and Sparse Correspondence Supervision Loss, outperforming baselines in realism and consistency.

DPPE: Rethinking Camera-Based Positional Encoding for Scaling Multi-View Transformers

cs.CV · 2026-06-30 · unverdicted · novelty 6.0

DPPE decouples rotation and translation in camera positional encodings for multi-view transformers to resolve late-stage training stagnation and improve generalization in novel view synthesis.

Prisma-World: Camera-Controllable Multi-Agent Video World Model

cs.CV · 2026-06-08 · unverdicted · novelty 6.0

Prisma-World is a diffusion-based multi-agent video model that uses joint full-attention, multi-agent RoPE, and relative camera geometry injection plus curriculum training to produce consistent cross-view videos from flexible agent counts.

Effective Multi-sensor Conditioning for Street-view Novel-view Synthesis

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

StreetNVS presents a multi-sensor conditioned video diffusion framework for street-view novel view synthesis that outperforms baselines with sparse LiDAR and handles extreme out-of-trajectory paths on the Waymo dataset.

Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Warp-as-History enables zero-shot camera trajectory following in frozen video models by supplying camera-warped pseudo-history, with single-video LoRA fine-tuning improving generalization to unseen videos.

Flex4DHuman: Flexible Multi-view Video Diffusion for 4D Human Reconstruction

cs.CV · 2026-06-11 · unverdicted · novelty 5.0

A multi-view video diffusion model conditioned on relative camera poses via extended RoPE generates dense synchronized views from sparse inputs for 4D Gaussian splatting reconstruction, claiming SOTA results on human datasets and generalization to animals.

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

cs.CV · 2026-05-14 · unverdicted · novelty 5.0

SANA-WM is a 2.6B-parameter efficient world model that synthesizes minute-scale 720p videos with 6-DoF camera control, trained on 213K public clips in 15 days on 64 H100s and runnable on single GPUs at 36x higher throughput than prior open baselines.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Geometrically Consistent Multi-View Scene Generation from Freehand Sketches cs.CV · 2026-04-15 · unverdicted · none · ref 55
A framework generates consistent multi-view scenes from one freehand sketch via a ~9k-sample dataset, Parallel Camera-Aware Attention Adapters, and Sparse Correspondence Supervision Loss, outperforming baselines in realism and consistency.

Unified camera positional encoding for controlled video generation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer