SOLAR is a self-supervised two-stage method that uses intersection masks on web-scale image-text pairs to enable symmetric MM2MM (multimodal-to-multimodal) retrieval. On a new benchmark it outperforms supervised VLMs by 7.08 points while using far fewer parameters.
Cited by 1 Pith paper.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- SOLAR: Self-supervised Joint Learning for Symmetric Multimodal Retrieval