DriveLM: Driving with Graph Visual Question Answering
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (2)
citing papers explorer
- RailVQA: A Benchmark and Framework for Efficient Interpretable Visual Cognition in Automatic Train Operation
  RailVQA-bench supplies 21,168 QA pairs for ATO visual cognition, while RailVQA-CoM combines large-model reasoning with small-model efficiency via transparent modules and temporal sampling.
- VECTOR-Drive: Tightly Coupled Vision-Language and Trajectory Expert Routing for End-to-End Autonomous Driving
  VECTOR-Drive uses shared self-attention with semantic-aware routing of tokens to VL and trajectory experts, plus flow-matching action decoding, to reach a driving score of 88.91 on Bench2Drive.
- C-CoT: Counterfactual Chain-of-Thought with Vision-Language Models for Safe Autonomous Driving
  C-CoT applies VLMs to autonomous driving via five-stage reasoning with a meta-action tree for counterfactuals, yielding 81.9% risk recall, a 3.52% collision rate, and a 1.98 m L2 error on a new dataset.