GEM represents driving scenes as explicit continuous 4D Gaussian primitives with learned dynamics to enable direct querying at arbitrary timestamps for semantic occupancy forecasting and motion planning.
Dome: Tam- ing diffusion model into high-fidelity controllable occupancy world model
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 8roles
background 3polarities
background 3representative citing papers
GEM is a new LiDAR world model using deformable Mamba that disentangles dynamic and static features to generate high-fidelity simulations and achieve state-of-the-art results on autonomous driving benchmarks.
HERMES++ unifies 3D scene understanding and future geometry prediction in driving scenes via BEV representations, LLM-enhanced queries, a temporal link, and joint geometric optimization.
Rolling Sink is a training-free cache adjustment technique that maintains visual consistency in autoregressive video diffusion models for ultra-long open-ended generation beyond training horizons.
Lotus-2 is a two-stage deterministic adaptation of diffusion priors that achieves state-of-the-art monocular depth estimation with only 59K training samples.
This survey synthesizes AI techniques for mixed autonomy traffic simulation and introduces a taxonomy spanning agent-level behavior models, environment-level methods, and cognitive/physics-informed approaches.
A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.
DriVerse is a generative model that simulates driving scenes from an image and trajectory using multimodal prompting and motion alignment, achieving better performance on nuScenes and Waymo datasets with minimal training.
citing papers explorer
-
GEM: Gaussian Evolution Model for Occupancy Forecasting and Motion Planning
GEM represents driving scenes as explicit continuous 4D Gaussian primitives with learned dynamics to enable direct querying at arbitrary timestamps for semantic occupancy forecasting and motion planning.
-
GEM: Generating LiDAR World Model via Deformable Mamba
GEM is a new LiDAR world model using deformable Mamba that disentangles dynamic and static features to generate high-fidelity simulations and achieve state-of-the-art results on autonomous driving benchmarks.
-
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
HERMES++ unifies 3D scene understanding and future geometry prediction in driving scenes via BEV representations, LLM-enhanced queries, a temporal link, and joint geometric optimization.
-
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion
Rolling Sink is a training-free cache adjustment technique that maintains visual consistency in autoregressive video diffusion models for ultra-long open-ended generation beyond training horizons.
-
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model
Lotus-2 is a two-stage deterministic adaptation of diffusion priors that achieves state-of-the-art monocular depth estimation with only 59K training samples.
-
Artificial Intelligence for Modeling and Simulation of Mixed Automated and Human Traffic
This survey synthesizes AI techniques for mixed autonomy traffic simulation and introduces a taxonomy spanning agent-level behavior models, environment-level methods, and cognitive/physics-informed approaches.
-
SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model
A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.
-
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
DriVerse is a generative model that simulates driving scenes from an image and trajectory using multimodal prompting and motion alignment, achieving better performance on nuScenes and Waymo datasets with minimal training.