VenusBench-Mobile is a new benchmark that exposes large performance gaps in mobile GUI agents on realistic user-centric tasks, with failures dominated by perception and memory deficiencies and near-zero success under environment changes.
1 Pith paper cites this work. Polarity classification is still being indexed.
fields: cs.HC (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers:
VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics