SeeClick improves visual GUI agents via GUI grounding pre-training on automatically curated data and introduces the ScreenSpot benchmark, with results indicating that stronger grounding boosts downstream task performance.
Proceedings of the 30th annual ACM symposium on user interface software and technology , pages=
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.HC 1years
2024 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
SeeClick improves visual GUI agents via GUI grounding pre-training on automatically curated data and introduces the ScreenSpot benchmark, with results indicating that stronger grounding boosts downstream task performance.