Learning modal-invariant angular metric by cyclic projection network for VIS-NIR person re-identification. IEEE TIP, 30:8019–8033
2 Pith papers cite this work.
Fields: cs.CV (2)
Years: 2026 (2)
Representative citing papers:
- View-Aware Semantic Alignment for Aerial-Ground Person Re-Identification
  ViSA proposes expert-driven token generation and dual-branch local fusion modules for view-aware semantic alignment in AGPReID, reporting mAP gains of up to 10.06% on the CARGO benchmark.
- Beyond Perceptual Shortcuts: Causal-Inspired Debiasing Optimization for Generalizable Video Reasoning in Lightweight MLLMs
  VideoThinker improves video reasoning in lightweight MLLMs by training a bias model to capture shortcuts and applying causal debiasing policy optimization to push away from them, achieving state-of-the-art efficiency with minimal data.