STRIDE-QA: Visual Question Answering Dataset for Spatiotemporal Reasoning in Urban Driving Scenes

Keishi Ishihara, Kento Sasaki, Tsubasa Takahashi, Daiki Shiono, Yu Yamaguchi · 2025 · arXiv 2508.10427

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DRIVESPATIAL: A Benchmark for Spatiotemporal Intelligence in VLMs for Autonomous Driving

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

DriveSpatial benchmark shows the best of 15 VLMs trails humans by 28.4 points on spatiotemporal driving tasks, with cognitive scene construction as the main failure mode.

Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)

cs.CV · 2025-11-17 · unverdicted · novelty 6.0

DTPQA is a new VQA benchmark consisting of synthetic and real-world traffic images with distance annotations to isolate and measure VLM perception capabilities for driving decisions.

citing papers explorer

DRIVESPATIAL: A Benchmark for Spatiotemporal Intelligence in VLMs for Autonomous Driving
cs.CV · 2026-05-22 · unverdicted · none · ref 81
Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)
cs.CV · 2025-11-17 · unverdicted · none · ref 7
