HiViG is a test-time critic that combines macro-action history summarization with visual grounding of execution coordinates to reduce short-sighted and visually erroneous actions in long-horizon GUI agents.
arXiv preprint arXiv:2509.23263
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.AI (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
representative citing papers
citing papers explorer
- A History-Aware Visually Grounded Critic for Computer Use Agents
  HiViG is a test-time critic that combines macro-action history summarization with visual grounding of execution coordinates to reduce short-sighted and visually erroneous actions in long-horizon GUI agents.
- PRO-CUA: Process-Reward Optimization for Computer Use Agents
  PRO-CUA trains CUAs via decoupled on-policy rollouts and PRM-guided step-level optimization to enable dense credit assignment without expert trajectories or golden answers.
- StainFlow: Entity-Stain Tracking and Evidence Linking for Process Rewards in GUI Agents
  StainFlow proposes global entity stain tracking and local stain evidence linking modules to improve process rewards for GUI agents, reporting a 3.2% relative gain in online RL success and a 1.8% gain in judgment accuracy on AndroidWorld and OGRBench.
- Xiaomi-GUI-0 Technical Report
  Xiaomi-GUI-0 reports 72.0% success on RealMobile and 78.9% on AndroidWorld via real-device closed-loop training with multi-source data and a three-stage RL pipeline.