TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes

2025 · arXiv 2502.02449

4 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

CrashSight is a new infrastructure-focused benchmark showing that state-of-the-art vision-language models can describe crash scenes but fail at temporal and causal reasoning.

NuRisk: A Visual Question Answering Dataset for Agent-Level Risk Assessment in Autonomous Driving

cs.AI · 2025-09-30 · conditional · novelty 7.0

NuRisk is a new VQA dataset for agent-level risk assessment in autonomous driving that benchmarks VLMs at 33% peak accuracy and shows a fine-tuned 7B model reaching 41% with 75% lower latency.

Towards Safe Mobility: A Unified Transportation Foundation Model enabled by Open-Ended Vision-Language Dataset

cs.CV · 2026-04-24 · unverdicted · novelty 6.0

This work creates the LTD dataset for open-ended traffic VQA and trains the UniVLT model, which achieves state-of-the-art results on unified microscopic autonomous-driving and macroscopic traffic reasoning tasks.

Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)

cs.CV · 2025-11-17 · unverdicted · novelty 6.0

DTPQA is a new VQA benchmark consisting of synthetic and real-world traffic images with distance annotations to isolate and measure VLM perception capabilities for driving decisions.

citing papers explorer


CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning

cs.CV · 2026-04-09 · unverdicted · none · ref 35

NuRisk: A Visual Question Answering Dataset for Agent-Level Risk Assessment in Autonomous Driving

cs.AI · 2025-09-30 · conditional · none · ref 24

Towards Safe Mobility: A Unified Transportation Foundation Model enabled by Open-Ended Vision-Language Dataset

cs.CV · 2026-04-24 · unverdicted · none · ref 11

Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)

cs.CV · 2025-11-17 · unverdicted · none · ref 8
