BEVCALIB performs LiDAR-camera calibration from raw data by fusing camera and LiDAR bird's-eye view features with a novel feature selector and reports state-of-the-art accuracy on KITTI and nuScenes.
Bevworld: A multimodal world model for autonomous driving via unified BEV latent space
8 Pith papers cite this work.
representative citing papers
DriveFuture achieves SOTA results on NAVSIM by conditioning latent world model states on future predictions to directly inform trajectory planning.
HERMES++ unifies 3D scene understanding and future geometry prediction in driving scenes via BEV representations, LLM-enhanced queries, a temporal link, and joint geometric optimization.
The paper introduces a unified framework for world models that fully incorporates all cognitive functions from Cognitive Architecture Theory, highlights under-researched areas in motivation and meta-cognition, and proposes Epistemic World Models as a new category for scientific discovery agents.
ReSim is a controllable video world model trained on heterogeneous real and simulated driving data that achieves higher fidelity and controllability for both expert and non-expert actions, plus a Video2Reward module for estimating action quality from simulated futures.
PLAN-S decodes a style-conditioned four-channel semantic cost map from latent representations to bridge world models and planners in autonomous driving, reporting 0.55 m average L2 and 42% collision reduction on nuScenes plus PDMS gains on NAVSIM.
DriVerse is a generative model that simulates driving scenes from an image and trajectory using multimodal prompting and motion alignment, achieving better performance on nuScenes and Waymo datasets with minimal training.
citing papers explorer
- DriveFuture: Future-Aware Latent World Models for Autonomous Driving
DriveFuture achieves SOTA results on NAVSIM by conditioning latent world model states on future predictions to directly inform trajectory planning.
- HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
HERMES++ unifies 3D scene understanding and future geometry prediction in driving scenes via BEV representations, LLM-enhanced queries, a temporal link, and joint geometric optimization.
- Human Cognition in Machines: A Unified Perspective of World Models
The paper introduces a unified framework for world models that fully incorporates all cognitive functions from Cognitive Architecture Theory, highlights under-researched areas in motivation and meta-cognition, and proposes Epistemic World Models as a new category for scientific discovery agents.
- PLAN-S: Bridging Planning with Latent Style Dynamics for Autonomous Driving World Models
PLAN-S decodes a style-conditioned four-channel semantic cost map from latent representations to bridge world models and planners in autonomous driving, reporting 0.55 m average L2 and 42% collision reduction on nuScenes plus PDMS gains on NAVSIM.
- OpenWorldLib: A Unified Codebase and Definition of Advanced World Models