Llava- next: Improved reasoning, ocr, and world knowledge, January 2024

Haotian Liu, Chunyuan Li, Yuheng Li, Bo Li, Yuanhan Zhang, Sheng Shen, Yong Jae Lee · 2024

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Adaptive Chain-of-Focus Reasoning via Dynamic Visual Search and Zooming for Efficient VLMs

cs.CV · 2025-05-21 · unverdicted · novelty 6.0

Chain-of-Focus enables VLMs to adaptively search and zoom on important image areas via a two-stage SFT and RL pipeline on a custom 3K-sample dataset, yielding 5% gains on the V* benchmark across resolutions from 224 to 4K.

citing papers explorer

Showing 1 of 1 citing paper.

Adaptive Chain-of-Focus Reasoning via Dynamic Visual Search and Zooming for Efficient VLMs cs.CV · 2025-05-21 · unverdicted · none · ref 26
Chain-of-Focus enables VLMs to adaptively search and zoom on important image areas via a two-stage SFT and RL pipeline on a custom 3K-sample dataset, yielding 5% gains on the V* benchmark across resolutions from 224 to 4K.

Llava- next: Improved reasoning, ocr, and world knowledge, January 2024

fields

years

verdicts

representative citing papers

citing papers explorer