Finite-sample noise collapses the eigengap in representation covariances limiting recoverable modes K(N); multimodal learning stabilizes it via low-rank constraints, yielding better class separation quantified by truncated Mahalanobis energy approximated with a zeta function.
Leveraging a vision-language model with natural text supervision for MRI retrieval, captioning, classification, and visual question answering
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Anchoring the Eigengap: Cross-Modal Spectral Stabilization for Sample-Efficient Representation Learning
Finite-sample noise collapses the eigengap in representation covariances limiting recoverable modes K(N); multimodal learning stabilizes it via low-rank constraints, yielding better class separation quantified by truncated Mahalanobis energy approximated with a zeta function.