Seeing like a human: Asynchronous learning with dynamic progressive refinement for person re-identification
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
citing papers explorer
- View-Aware Semantic Alignment for Aerial-Ground Person Re-Identification
  ViSA proposes expert-driven token generation and dual-branch local fusion modules for view-aware semantic alignment in AGPReID, reporting up to 10.06% mAP gains on the CARGO benchmark.
- Beyond Perceptual Shortcuts: Causal-Inspired Debiasing Optimization for Generalizable Video Reasoning in Lightweight MLLMs
  VideoThinker improves lightweight MLLM video reasoning by creating a bias model to capture shortcuts and applying causal debiasing policy optimization to push away from them, achieving SOTA efficiency with minimal data.