Drone-based RGB-infrared cross-modality vehicle detection via uncertainty-aware learning. IEEE Transactions on Circuits and Systems for Video Technology, 32(10):6700–6713
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Bridge learns low-rank bases for front-door causal adjustment to remove spurious correlations from domain shifts and integrates the approach with vision foundation models for improved object detection generalization.
citing papers explorer
- Tri-Modal Fusion Transformers for UAV-based Object Detection
A dual-stream vision transformer with modality-aware gated exchange and bidirectional token exchange fuses RGB, thermal, and event data to improve UAV vehicle detection over dual-modal baselines on a new 10,489-frame dataset.
- Bridge: Basis-Driven Causal Inference Marries VFMs for Domain Generalization
Bridge learns low-rank bases for front-door causal adjustment to remove spurious correlations from domain shifts and integrates the approach with vision foundation models for improved object detection generalization.