Swin Transformer: Hierarchical vision transformer using shifted windows
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
method: 1
citation-polarity summary
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: method (1)
polarities: use method (1)
representative citing papers
AOI-SSL combines small-domain self-supervised pre-training of vision transformers with in-context patch retrieval to reduce labeled data needs and enable fast adaptation for semiconductor wire-bond segmentation.
citing papers explorer
- Tri-Modal Fusion Transformers for UAV-based Object Detection
  A dual-stream vision transformer with modality-aware gated exchange and bidirectional token exchange fuses RGB, thermal, and event data to improve UAV vehicle detection over dual-modal baselines on a new 10,489-frame dataset.
- AOI-SSL: Self-Supervised Framework for Efficient Segmentation of Wire-bonded Semiconductors In Optical Inspection
  AOI-SSL combines small-domain self-supervised pre-training of vision transformers with in-context patch retrieval to reduce labeled data needs and enable fast adaptation for semiconductor wire-bond segmentation.