FedGUI is the first comprehensive benchmark for federated GUI agents that studies cross-platform, cross-device, cross-OS, and cross-source heterogeneity, with experiments showing performance gains from cross-platform collaboration and identifying platform and OS as the most influential factors.
Os-genesis: Automating gui agent trajectory construction via reverse task synthesis
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
method 1polarities
background 1representative citing papers
UI-Copilot adds a selective copilot for memory and math to GUI agents and trains tool use with separate single-turn and multi-turn optimization, yielding SOTA results on MemGUI-Bench and a 17.1% gain on AndroidWorld.
RISK introduces a dataset, benchmark, and R1-style RL fine-tuning for GUI agents that achieve 6.8-8.8% offline gains and 70.5% online task success in e-commerce risk management using 7.2% of baseline parameters.
Mobile-R1 introduces a hierarchical three-stage curriculum that combines format alignment, verifiable action feedback, and multi-turn environment training to improve exploration and self-correction in VLM-based mobile agents, plus a new Chinese GUI dataset and benchmark.
LPO optimizes GUI agent positional accuracy by combining information entropy for zone selection with a physical-distance reward inside a Group Relative Preference Optimization framework, claiming SOTA results on benchmarks and online tests.
SynthAgent uses dual refinement of synthetic tasks and trajectories to produce higher-quality training data that improves web agent adaptation to target environments.
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.
citing papers explorer
-
FedGUI: Benchmarking Federated GUI Agents across Heterogeneous Platforms, Devices, and Operating Systems
FedGUI is the first comprehensive benchmark for federated GUI agents that studies cross-platform, cross-device, cross-OS, and cross-source heterogeneity, with experiments showing performance gains from cross-platform collaboration and identifying platform and OS as the most influential factors.
-
UI-Copilot: Advancing Long-Horizon GUI Automation via Tool-Integrated Policy Optimization
UI-Copilot adds a selective copilot for memory and math to GUI agents and trains tool use with separate single-turn and multi-turn optimization, yielding SOTA results on MemGUI-Bench and a 17.1% gain on AndroidWorld.
-
RISK: A Framework for GUI Agents in E-commerce Risk Management
RISK introduces a dataset, benchmark, and R1-style RL fine-tuning for GUI agents that achieve 6.8-8.8% offline gains and 70.5% online task success in e-commerce risk management using 7.2% of baseline parameters.
-
Mobile-R1: Towards Interactive Capability for VLM-Based Mobile Agent via Systematic Training
Mobile-R1 introduces a hierarchical three-stage curriculum that combines format alignment, verifiable action feedback, and multi-turn environment training to improve exploration and self-correction in VLM-based mobile agents, plus a new Chinese GUI dataset and benchmark.
-
LPO: Towards Accurate GUI Agent Interaction via Location Preference Optimization
LPO optimizes GUI agent positional accuracy by combining information entropy for zone selection with a physical-distance reward inside a Group Relative Preference Optimization framework, claiming SOTA results on benchmarks and online tests.
-
SynthAgent: Adapting Web Agents with Synthetic Supervision
SynthAgent uses dual refinement of synthetic tasks and trajectories to produce higher-quality training data that improves web agent adaptation to target environments.
-
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
-
From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review
A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.