Rag-driver: Generalisable driving explanations with retrieval-augmented in-context learning in multi-modal large language model.arXiv preprint arXiv:2402.10828

Yuan, J · 2024 · arXiv 2402.10828

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

read on arXiv browse 8 citing papers

citation-role summary

background 2 baseline 1

citation-polarity summary

background 2 baseline 1

representative citing papers

VLADriver-RAG: Retrieval-Augmented Vision-Language-Action Models for Autonomous Driving

cs.CV · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

VLADriver-RAG reaches a new state-of-the-art Driving Score of 89.12 on Bench2Drive by retrieving structure-aware historical knowledge through spatiotemporal semantic graphs and Graph-DTW alignment.

AITP: Traffic Accident Responsibility Allocation via Multimodal Large Language Models

cs.CL · 2026-04-11 · unverdicted · novelty 6.0

AITP is a new multimodal large language model that uses multimodal chain-of-thought and retrieval-augmented generation of legal knowledge to achieve state-of-the-art results on traffic accident responsibility allocation and related tasks, supported by the DecaTARA benchmark of 67,941 videos.

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

cs.CR · 2026-04-10 · unverdicted · novelty 6.0

ADAM extracts data from LLM agent memory with up to 100% attack success rate by estimating data distribution and selecting queries via entropy guidance.

Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

cs.CV · 2025-12-03 · unverdicted · novelty 6.0

ThinkDeeper introduces a world-model-based reasoning step that predicts future spatial states to improve multimodal visual grounding for autonomous vehicles, achieving top results on Talk2Car and other benchmarks.

DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving

cs.CV · 2025-10-14 · unverdicted · novelty 6.0

DriveVLA-W0 adds world modeling to predict future images in VLA models, overcoming sparse action supervision and amplifying data scaling laws on NAVSIM benchmarks and a large in-house dataset.

CogDriver: Integrating Cognitive Inertia for Temporally Coherent Planning in Autonomous Driving

cs.CV · 2025-08-31 · unverdicted · novelty 6.0

CogDriver-Agent with sparse temporal memory and spatiotemporal distillation on CogDriver-Data achieves 22% higher closed-loop Driving Score on Bench2Drive and 21% lower mean L2 error on nuScenes.

Towards Robust Endogenous Reasoning: Unifying Drift Adaptation in Non-Stationary Tuning

cs.LG · 2026-04-17 · unverdicted · novelty 5.0

CPO++ adapts reinforcement fine-tuning of MLLMs to endogenous multi-modal concept drift through counterfactual reasoning and preference optimization, yielding better coherence and cross-domain robustness in safety-critical settings.

ExploreVLA: Dense World Modeling and Exploration for End-to-End Autonomous Driving

cs.CV · 2026-04-03

citing papers explorer

Showing 8 of 8 citing papers.

VLADriver-RAG: Retrieval-Augmented Vision-Language-Action Models for Autonomous Driving cs.CV · 2026-05-01 · unverdicted · none · ref 18 · 2 links
VLADriver-RAG reaches a new state-of-the-art Driving Score of 89.12 on Bench2Drive by retrieving structure-aware historical knowledge through spatiotemporal semantic graphs and Graph-DTW alignment.
AITP: Traffic Accident Responsibility Allocation via Multimodal Large Language Models cs.CL · 2026-04-11 · unverdicted · none · ref 43
AITP is a new multimodal large language model that uses multimodal chain-of-thought and retrieval-augmented generation of legal knowledge to achieve state-of-the-art results on traffic accident responsibility allocation and related tasks, supported by the DecaTARA benchmark of 67,941 videos.
ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying cs.CR · 2026-04-10 · unverdicted · none · ref 24
ADAM extracts data from LLM agent memory with up to 100% attack success rate by estimating data distribution and selecting queries via entropy guidance.
Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles cs.CV · 2025-12-03 · unverdicted · none · ref 63
ThinkDeeper introduces a world-model-based reasoning step that predicts future spatial states to improve multimodal visual grounding for autonomous vehicles, achieving top results on Talk2Car and other benchmarks.
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving cs.CV · 2025-10-14 · unverdicted · none · ref 32
DriveVLA-W0 adds world modeling to predict future images in VLA models, overcoming sparse action supervision and amplifying data scaling laws on NAVSIM benchmarks and a large in-house dataset.
CogDriver: Integrating Cognitive Inertia for Temporally Coherent Planning in Autonomous Driving cs.CV · 2025-08-31 · unverdicted · none · ref 47
CogDriver-Agent with sparse temporal memory and spatiotemporal distillation on CogDriver-Data achieves 22% higher closed-loop Driving Score on Bench2Drive and 21% lower mean L2 error on nuScenes.
Towards Robust Endogenous Reasoning: Unifying Drift Adaptation in Non-Stationary Tuning cs.LG · 2026-04-17 · unverdicted · none · ref 91
CPO++ adapts reinforcement fine-tuning of MLLMs to endogenous multi-modal concept drift through counterfactual reasoning and preference optimization, yielding better coherence and cross-domain robustness in safety-critical settings.
ExploreVLA: Dense World Modeling and Exploration for End-to-End Autonomous Driving cs.CV · 2026-04-03 · unreviewed · ref 51

Rag-driver: Generalisable driving explanations with retrieval-augmented in-context learning in multi-modal large language model.arXiv preprint arXiv:2402.10828

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer