In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CV (2)

years: 2026 (2)

verdicts: unverdicted (2)

representative citing papers

Multimodal LLMs under Pairwise Modalities

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

A two-stage framework enables multimodal LLMs to learn shared latent representations from pairwise modality data and achieve cross-modal generation when incorporating new modalities.

citing papers explorer

  • Multimodal LLMs under Pairwise Modalities cs.CV · 2026-05-20 · unverdicted · none · ref 6

    A two-stage framework enables multimodal LLMs to learn shared latent representations from pairwise modality data and achieve cross-modal generation when incorporating new modalities.

  • SurgCheck: Do Vision-Language Models Really Look at Images in Surgical VQA? cs.CV · 2026-05-03 · unverdicted · none · ref 6

    SurgCheck benchmark reveals that vision-language models for surgical VQA often depend on linguistic shortcuts rather than visual reasoning, shown by consistent performance drops on less-biased questions.