Retrieval-based interleaved visual chain-of-thought in real-world driving scenarios. arXiv preprint arXiv:2501.04671, 2025

Charles Corbière, Simon Roburin, Syrielle Montariol, Antoine Bosselut, Alexandre Alahi · 2025 · arXiv 2501.04671

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RailVQA: A Benchmark and Framework for Efficient Interpretable Visual Cognition in Automatic Train Operation

cs.CV · 2026-03-28 · unverdicted · novelty 7.0

RailVQA-bench supplies 21,168 QA pairs for ATO visual cognition while RailVQA-CoM combines large-model reasoning with small-model efficiency via transparent modules and temporal sampling.

OmniDrive-R1: Reinforcement-driven Interleaved Multi-modal Chain-of-Thought for Trustworthy Vision-Language Autonomous Driving

cs.CV · 2025-12-16 · unverdicted · novelty 6.0

OmniDrive-R1 boosts VLM reasoning score from 51.77% to 80.35% and answer accuracy from 37.81% to 73.62% on DriveLMM-o1 via reinforcement-driven interleaved multi-modal chain-of-thought with annotation-free grounding.

Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail

cs.RO · 2025-10-30 · conditional · novelty 6.0

Alpamayo-R1 introduces a VLA model with a Chain of Causation dataset and multi-stage SFT-plus-RL training that reports 12% better planning accuracy and 35% fewer close encounters versus trajectory-only baselines in driving tasks.

citing papers explorer

RailVQA: A Benchmark and Framework for Efficient Interpretable Visual Cognition in Automatic Train Operation · cs.CV · 2026-03-28 · unverdicted · none · ref 7
OmniDrive-R1: Reinforcement-driven Interleaved Multi-modal Chain-of-Thought for Trustworthy Vision-Language Autonomous Driving · cs.CV · 2025-12-16 · unverdicted · none · ref 9
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail · cs.RO · 2025-10-30 · conditional · none · ref 12
