Rich feature hierarchies for accurate object detection and semantic segmentation
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Representative citing papers
Introduces the INTSD nighttime traffic sign dataset from India and LENS-Net baseline that performs adaptive illumination-aware detection plus multimodal classification.
citing papers explorer
- Tri-Modal Fusion Transformers for UAV-based Object Detection
A dual-stream vision transformer with modality-aware gated exchange and bidirectional token exchange fuses RGB, thermal, and event data to improve UAV vehicle detection over dual-modal baselines on a new 10,489-frame dataset.
- Learning Under Low Illumination: A Dataset and Algorithm for Traffic Sign Recognition
Introduces the INTSD nighttime traffic sign dataset from India and LENS-Net baseline that performs adaptive illumination-aware detection plus multimodal classification.