VFCNet fuses saliency and gradient vector flow in a dual-stream attention model with a DINOv3 backbone, reaching state-of-the-art results on the PICD composition benchmark (CDA-1 0.683, CDA-2 0.629); a simple DINOv3 classifier alone already beats prior specialized models.
arXiv preprint arXiv:2403.03740 (2024)
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Semantically Stable Image Composition Analysis via Saliency and Gradient Vector Flow Fusion