Vift: Towards vi- sual instruction-free fine-tuning for large vision-language models

Zikang Liu, Kun Zhou, Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen · 2025

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Towards Long-horizon Agentic Multimodal Search

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

LMM-Searcher uses file-based visual UIDs and a fetch tool plus 12K synthesized trajectories to fine-tune a multimodal agent that scales to 100-turn horizons and reaches SOTA among open-source models on MM-BrowseComp and MMSearch-Plus.

citing papers explorer

Showing 1 of 1 citing paper.

Towards Long-horizon Agentic Multimodal Search cs.CV · 2026-04-14 · unverdicted · none · ref 57
LMM-Searcher uses file-based visual UIDs and a fetch tool plus 12K synthesized trajectories to fine-tune a multimodal agent that scales to 100-turn horizons and reaches SOTA among open-source models on MM-BrowseComp and MMSearch-Plus.

Vift: Towards vi- sual instruction-free fine-tuning for large vision-language models

fields

years

verdicts

representative citing papers

citing papers explorer