VQA: Visual Question Answering
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
PSG-UIENet fuses Retinex physics with CLIP-derived text semantics and a new multimodal dataset to enhance underwater images, claiming better results than fifteen prior methods.
citing papers explorer
- RailVQA: A Benchmark and Framework for Efficient Interpretable Visual Cognition in Automatic Train Operation
  RailVQA-bench supplies 21,168 QA pairs for ATO visual cognition while RailVQA-CoM combines large-model reasoning with small-model efficiency via transparent modules and temporal sampling.
- Retinex Meets Language: A Physics-Semantics-Guided Underwater Image Enhancement Network
  PSG-UIENet fuses Retinex physics with CLIP-derived text semantics and a new multimodal dataset to enhance underwater images, claiming better results than fifteen prior methods.