Autodroid: Llm-powered task automation in android, 2024

Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu · 2024 · arXiv 2308.15272

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

read on arXiv browse 8 citing papers

citation-role summary

background 2 method 1

citation-polarity summary

background 2 use method 1

representative citing papers

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

cs.AI · 2024-04-11 · accept · novelty 8.0

OSWorld provides the first unified real-computer benchmark for open-ended multimodal agent tasks, exposing large performance gaps between humans and state-of-the-art LLM/VLM agents.

ScaleWoB: Guiding GUI Agents with Coding Agents via Large-Scale Environmental Synthesis

cs.AI · 2026-05-24 · unverdicted · novelty 7.0

ScaleWoB generates 100+ synthetic interactive GUI environments and 1000+ verifiable tasks as web pages, releasing a 120-task mobile benchmark where state-of-the-art agents achieve 27.92% success (17.82% on long-horizon tasks) versus 92.08% for humans, with synthetic results generalizing to real apps

Mobile GUI Agents under Real-world Threats: Are We There Yet?

cs.CR · 2025-07-06 · conditional · novelty 6.0

Introduces an app-content instrumentation framework and benchmark showing that examined GUI agents suffer 42.0% and 36.1% average misleading rates from third-party content in dynamic and static tests respectively.

InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners

cs.AI · 2025-04-19 · unverdicted · novelty 6.0

InfiGUI-R1 uses Reasoning Injection via spatial distillation followed by Deliberation Enhancement via RL to evolve GUI agents from reactive actors to deliberative reasoners, reporting strong performance on grounding and trajectory tasks.

A Comprehensive Survey of Agents for Computer Use: Foundations, Challenges, and Future Directions

cs.AI · 2025-01-27 · unverdicted · novelty 5.0

A survey of 87 agents for computer use and 33 datasets that introduces a three-dimensional taxonomy across domain, interaction, and agent perspectives and identifies six research gaps.

ClawMobile: Rethinking Smartphone-Native Agentic Systems

cs.MA · 2026-02-26 · unverdicted · novelty 4.0

ClawMobile proposes a hierarchical system separating probabilistic LLM planning from structured deterministic execution to improve stability and reproducibility of agentic systems on real smartphones.

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

cs.HC · 2024-01-10 · unverdicted · novelty 3.0

This survey discusses key components and challenges for Personal LLM Agents and reviews solutions for their capability, efficiency, and security.

MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents

cs.AI · 2025-12-14

citing papers explorer

Showing 1 of 1 citing paper after filters.

ScaleWoB: Guiding GUI Agents with Coding Agents via Large-Scale Environmental Synthesis cs.AI · 2026-05-24 · unverdicted · none · ref 1
ScaleWoB generates 100+ synthetic interactive GUI environments and 1000+ verifiable tasks as web pages, releasing a 120-task mobile benchmark where state-of-the-art agents achieve 27.92% success (17.82% on long-horizon tasks) versus 92.08% for humans, with synthetic results generalizing to real apps

Autodroid: Llm-powered task automation in android, 2024

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer