InnerZoom bridges cross-layer evidence in one forward pass to achieve SOTA GUI grounding accuracy on six benchmarks while cutting latency up to 31.8% versus two-pass baselines.
arXiv preprint arXiv:2602.06391 , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
GUI-C² pairs a difficulty-scoring data pipeline with an area-gated coarse-to-fine RL mechanism to improve GUI grounding accuracy and training stability.
citing papers explorer
-
One Forward Beats Two: InnerZoom for Accurate and Efficient GUI Grounding
InnerZoom bridges cross-layer evidence in one forward pass to achieve SOTA GUI grounding accuracy on six benchmarks while cutting latency up to 31.8% versus two-pass baselines.
-
GUI-C$^2$: Coarse-to-Fine GUI Grounding via Difficulty-Aware Reinforcement Learning
GUI-C² pairs a difficulty-scoring data pipeline with an area-gated coarse-to-fine RL mechanism to improve GUI grounding accuracy and training stability.