Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models
MODIX dynamically rescales positional indices in VLMs using intra-modal covariance-based entropy and inter-modal alignment scores to allocate finer granularity to informative content.