GUI agents can transform live web interfaces in real-time via DOM manipulations to deliver contextual assistance directly within the application.
Gui agents: A survey
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
A controlled study in an audio-streaming search app shows GUI agents match human task success and query patterns but use more search-centric, low-branching navigation while humans are content-centric and exploratory.
VF-Coder raises GUI code success rate from 21.68% to 28.29% and visual score from 0.4284 to 0.5584 on a new 984-task benchmark by adding direct visual perception and interaction.
An anonymization framework replaces sensitive UI content with deterministic placeholders to protect privacy in mobile GUI agents while preserving task performance.
UI-TARS-2 reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld while attaining 59.8 mean normalized score on a 15-game suite through multi-turn RL and scalable data generation.
citing papers explorer
-
Beyond Chat and Clicks: GUI Agents for In-Situ Assistance via Live Interface Transformation
GUI agents can transform live web interfaces in real-time via DOM manipulations to deliver contextual assistance directly within the application.
-
Same Outcomes, Different Journeys: A Trace-Level Framework for Comparing Human and GUI-Agent Behavior in Production Search Systems
A controlled study in an audio-streaming search app shows GUI agents match human task success and query patterns but use more search-centric, low-branching navigation while humans are content-centric and exploratory.
-
Coding with Eyes: Visual Feedback Unlocks Reliable GUI Code Generating and Debugging
VF-Coder raises GUI code success rate from 21.68% to 28.29% and visual score from 0.4284 to 0.5584 on a new 984-task benchmark by adding direct visual perception and interaction.
-
Anonymization-Enhanced Privacy Protection for Mobile GUI Agents: Available but Invisible
An anonymization framework replaces sensitive UI content with deterministic placeholders to protect privacy in mobile GUI agents while preserving task performance.
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
UI-TARS-2 reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld while attaining 59.8 mean normalized score on a 15-game suite through multi-turn RL and scalable data generation.