SeeClick improves visual GUI agents via GUI grounding pre-training on automatically curated data and introduces the ScreenSpot benchmark, with results indicating that stronger grounding boosts downstream task performance.
ICLR 2023 Workshop on Mathematical and Empirical Understanding of Foundation Models
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.HC (1)
years: 2024 (1)
verdicts: UNVERDICTED (1)
representative citing papers
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents