Cross-modal oriented object detection of UAV aerial images based on image feature. IEEE Transactions on Geoscience and Remote Sensing, 62:1–21, 2024.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Tri-Modal Fusion Transformers for UAV-based Object Detection
  A dual-stream vision transformer with modality-aware gated exchange and bidirectional token exchange fuses RGB, thermal, and event data to improve UAV vehicle detection over dual-modal baselines on a new 10,489-frame dataset.
- SWNet: A Cross-Spectral Network for Camouflaged Weed Detection
  SWNet combines visible and NIR spectra with a Pyramid Vision Transformer, bimodal gated fusion, and edge refinement to outperform prior methods on camouflaged weed detection in the Weeds-Banana dataset.