curious case of contexts
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Bridging the Missing-Modality Gap: Improving Text-Only Calibration of Vision Language Models
  A new Latent Imagination Module uses cross-attention to predict latent visual embeddings from text, improving accuracy and calibration of vision-language models on text-only inputs.
- Mask-to-Correct$^+$: Leveraging Retriever Diversity for Masking-guided Faithful Fact Correction
  Mask-to-Correct and M2C+ use diversity-aware masking in RAG to identify erroneous claim spans and produce faithful corrections, outperforming baselines by up to 14% SARI without gold evidence.