OccLLaMA: An occupancy-language-action generative world model for autonomous driving
7 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 7
citation roles: background 2
citation polarities: background 2
citing papers explorer
- GEM: Gaussian Evolution Model for Occupancy Forecasting and Motion Planning
  GEM represents driving scenes as explicit continuous 4D Gaussian primitives with learned dynamics to enable direct querying at arbitrary timestamps for semantic occupancy forecasting and motion planning.
- DriveFuture: Future-Aware Latent World Models for Autonomous Driving
  DriveFuture achieves SOTA results on NAVSIM by conditioning latent world model states on future predictions to directly inform trajectory planning.
- HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
  HERMES++ unifies 3D scene understanding and future geometry prediction in driving scenes via BEV representations, LLM-enhanced queries, a temporal link, and joint geometric optimization.
- Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM
  Chat-Scene++ improves 3D scene understanding in multimodal LLMs by representing scenes as context-rich object sequences with identifier tokens and grounded chain-of-thought reasoning, reaching state-of-the-art on five benchmarks using pre-trained encoders.
- Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
  A 3D Language-Embedded Gaussians framework with opacity-aware Poisson volumetric aggregation and progressive temperature decay achieves 59.50 IoU and 21.05 mIoU on Occ-ScanNet for open-vocabulary indoor occupancy.
- Artificial Intelligence for Modeling and Simulation of Mixed Automated and Human Traffic
  This survey synthesizes AI techniques for mixed autonomy traffic simulation and introduces a taxonomy spanning agent-level behavior models, environment-level methods, and cognitive/physics-informed approaches.
- SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model
  A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.