Dynamiccity: Large-scale 4d occupancy generation from dynamic scenes

Hengwei Bian, Lingdong Kong, Haozhe Xie, Liang Pan, Yu Qiao, Ziwei Liu · 2024 · arXiv 2410.18084

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

OccDirector: Language-Guided Behavior and Interaction Generation in 4D Occupancy Space

cs.CV · 2026-04-24 · unverdicted · novelty 7.0

OccDirector uses a VLM-guided Spatio-Temporal MMDiT model with history anchoring to generate physically plausible 4D occupancy from language scripts, supported by the new OccInteract-85k dataset.

Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

cs.CV · 2026-04-20 · unverdicted · novelty 6.0 · 2 refs

OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.

SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model

cs.CV · 2025-11-27 · unverdicted · novelty 5.0

A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.

citing papers explorer

Showing 3 of 3 citing papers.

OccDirector: Language-Guided Behavior and Interaction Generation in 4D Occupancy Space cs.CV · 2026-04-24 · unverdicted · none · ref 2
OccDirector uses a VLM-guided Spatio-Temporal MMDiT model with history anchoring to generate physically plausible 4D occupancy from language scripts, supported by the new OccInteract-85k dataset.
Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation cs.CV · 2026-04-20 · unverdicted · none · ref 7 · 2 links
OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.
SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model cs.CV · 2025-11-27 · unverdicted · none · ref 2
A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.

Dynamiccity: Large-scale 4d occupancy generation from dynamic scenes

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer