MobiBench is the first modular multi-path offline benchmark for mobile GUI agents, achieving 94.72% agreement with human evaluators while allowing component-level analysis.
arXiv:2412.18116 [cs.AI] https://arxiv.org/abs/2412.18116
2 Pith papers cite this work.
representative citing papers
AgentLens adaptively deploys Full UI, Partial UI, and GenUI modalities with virtual display overlays for mobile GUI agents, yielding 85.7% user preference and best-in-study usability in a 21-participant evaluation.
citing papers explorer
- MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents
- AgentLens: Adaptive Visual Modalities for Human-Agent Interaction in Mobile GUI Agents