Improved baselines with visual instruction tuning

4 Pith papers cite this work. Polarity classification is still being indexed.

Fields: cs.CV (4)

Years: 2026 (2), 2025 (2)

Verdicts: unverdicted (4)

representative citing papers

Indexing Multimodal Language Models for Large-scale Image Retrieval

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

Multimodal LLMs act as training-free similarity estimators for instance-level image retrieval by converting next-token probabilities from image-pair prompts into scores, combined with efficient indexing for scalability.

Relational Visual Similarity

cs.CV · 2025-12-08 · unverdicted · novelty 7.0

A vision-language model is finetuned on 114k anonymized relational captions to embed images by their underlying structural correspondences instead of visible attributes.
