UniVG-R1: Reasoning guided universal visual grounding with reinforcement learning. arXiv preprint arXiv:2506.12151, 2025

Shiyin Liu, Bo Shi, Ruijie Chen, Jian Shi, Junfeng Li, Jinsong Tang, Liujun Tang, Han Zhang, Zonglin Lu, Ke Sun, Qi Chen · 2025 · arXiv 2506.12151

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Adaptive Chain-of-Focus Reasoning via Dynamic Visual Search and Zooming for Efficient VLMs

cs.CV · 2025-05-21 · unverdicted · novelty 6.0

Chain-of-Focus enables VLMs to adaptively search and zoom on important image areas via a two-stage SFT and RL pipeline on a custom 3K-sample dataset, yielding 5% gains on the V* benchmark across resolutions from 224 to 4K.

citing papers explorer

Adaptive Chain-of-Focus Reasoning via Dynamic Visual Search and Zooming for Efficient VLMs

cs.CV · 2025-05-21 · unverdicted · none · ref 28
