Planning-oriented autonomous driving
Baseline reference. 67% of citing Pith papers use this work as a benchmark or comparison.
verdicts
UNVERDICTED 9
representative citing papers
- Bridging Structure and Language: Graph-Based Visual Reasoning for Autonomous Road Understanding
  A graph-grounded Combined Road Substrate framework generates traceable QA pairs from road maps to improve small VLMs on compositional road reasoning tasks.
- MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving
  MindVLA-U1 is the first unified streaming VLA architecture that surpasses human drivers on WOD-E2E planning metrics while matching VA latency and preserving language interfaces.
- The DAWN of World-Action Interactive Models
  DAWN couples a world predictor with a world-conditioned action denoiser in latent space so that each refines the other recursively, yielding strong planning and safety results on autonomous driving benchmarks.
- CoWorld-VLA: Thinking in a Multi-Expert World Model for Autonomous Driving
  CoWorld-VLA extracts semantic, geometric, dynamic, and trajectory expert tokens from multi-source supervision and feeds them into a diffusion-based hierarchical planner, achieving competitive collision avoidance and trajectory accuracy on the NAVSIM v1 benchmark.
- Precise Aggressive Aerial Maneuvers with Sensorimotor Policies
  Reinforcement learning sensorimotor policies enable quadrotors to traverse narrow gaps at extreme tilts with 5 cm clearance using only vision and proprioception, including reactive traversal of moving gaps.
- AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
  AutoVLA unifies semantic reasoning and trajectory planning in one autoregressive VLA model for end-to-end autonomous driving by tokenizing trajectories into discrete actions and using GRPO reinforcement fine-tuning to adaptively reduce unnecessary reasoning.
- EponaV2: Driving World Model with Comprehensive Future Reasoning
  EponaV2 advances perception-free driving world models by forecasting comprehensive future 3D geometry and semantic representations, achieving SOTA planning performance on NAVSIM benchmarks.
- Causality-Aware End-to-End Autonomous Driving via Ego-Centric Joint Scene Modeling
  CaAD adds ego-centric joint-causal modeling and causality-aware policy alignment to end-to-end driving, reporting Driving Score 87.53 and PDMS 91.1 on Bench2Drive and NAVSIM.
- Can Attribution Predict Risk? From Multi-View Attribution to Planning Risk Signals in End-to-End Autonomous Driving
  Attribution statistics derived from multi-view inputs in end-to-end planners can predict planning risks, with reported Spearman correlation of 0.30 with trajectory error and AUROC of 0.77 for collision detection.