iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval

[ABBDB24] Lorenzo Agnolucci, Alberto Baldrati, Marco Bertini, Alberto Del Bimbo · arXiv 2405.02951

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

E5-V: Universal Embeddings with Multimodal Large Language Models

cs.CL · 2024-07-17 · unverdicted · novelty 6.0

E5-V produces strong universal multimodal embeddings from MLLMs trained solely on text pairs, often surpassing prior methods across retrieval and related tasks without multimodal fine-tuning.

