RAR combines CLIP retrieval with MLLM ranking to improve few-shot and zero-shot fine-grained visual recognition on 5 benchmarks, 11 few-shot datasets, and 2 detection tasks.
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CV (1)
Years: 2024 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition