BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space

2024 · arXiv 2407.05679

8 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background: 3 · baseline: 1

citation-polarity summary

background: 3 · baseline: 1

representative citing papers

BEVCALIB: LiDAR-Camera Calibration via Geometry-Guided Bird's-Eye View Representations

cs.CV · 2025-06-03 · unverdicted · novelty 7.0

BEVCALIB performs LiDAR-camera calibration from raw data by fusing camera and LiDAR bird's-eye view features with a novel feature selector and reports state-of-the-art accuracy on KITTI and nuScenes.

DriveFuture: Future-Aware Latent World Models for Autonomous Driving

cs.CV · 2026-05-10 · unverdicted · novelty 6.0

DriveFuture achieves SOTA results on NAVSIM by conditioning latent world model states on future predictions to directly inform trajectory planning.

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

cs.CV · 2026-04-30 · unverdicted · novelty 6.0

HERMES++ unifies 3D scene understanding and future geometry prediction in driving scenes via BEV representations, LLM-enhanced queries, a temporal link, and joint geometric optimization.

Human Cognition in Machines: A Unified Perspective of World Models

cs.RO · 2026-04-17 · unverdicted · novelty 6.0

The paper introduces a unified framework for world models that fully incorporates all cognitive functions from Cognitive Architecture Theory, highlights under-researched areas in motivation and meta-cognition, and proposes Epistemic World Models as a new category for scientific discovery agents.

ReSim: Reliable World Simulation for Autonomous Driving

cs.CV · 2025-06-11 · unverdicted · novelty 6.0

ReSim is a controllable video world model trained on heterogeneous real and simulated driving data that achieves higher fidelity and controllability for both expert and non-expert actions, plus a Video2Reward module for estimating action quality from simulated futures.

PLAN-S: Bridging Planning with Latent Style Dynamics for Autonomous Driving World Models

cs.AI · 2026-06-04 · unverdicted · novelty 5.0

PLAN-S decodes a style-conditioned four-channel semantic cost map from latent representations to bridge world models and planners in autonomous driving, reporting 0.55 m average L2 and 42% collision reduction on nuScenes plus PDMS gains on NAVSIM.

DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment

cs.RO · 2025-04-22 · unverdicted · novelty 5.0

DriVerse is a generative model that simulates driving scenes from an image and trajectory using multimodal prompting and motion alignment, achieving better performance on nuScenes and Waymo datasets with minimal training.

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

cs.CV · 2026-04-06

citing papers explorer

Showing 5 of 5 citing papers after filters.

DriveFuture: Future-Aware Latent World Models for Autonomous Driving

cs.CV · 2026-05-10 · unverdicted · none · ref 38

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

cs.CV · 2026-04-30 · unverdicted · none · ref 28

Human Cognition in Machines: A Unified Perspective of World Models

cs.RO · 2026-04-17 · unverdicted · none · ref 222

PLAN-S: Bridging Planning with Latent Style Dynamics for Autonomous Driving World Models

cs.AI · 2026-06-04 · unverdicted · none · ref 3

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

cs.CV · 2026-04-06 · unreviewed · ref 158
