arXiv preprint arXiv:2509.23263

Tao Xiong, Xavier Hu, Yurun Chen, Yuhang Liu, Changqiao Wu, Pengzhi Gao · 2025 · arXiv 2509.23263

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

A History-Aware Visually Grounded Critic for Computer Use Agents

cs.AI · 2026-06-09 · unverdicted · novelty 7.0

HiViG is a test-time critic that combines macro-action history summarization with visual grounding of execution coordinates to reduce short-sighted and visually erroneous actions in long-horizon GUI agents.

PRO-CUA: Process-Reward Optimization for Computer Use Agents

cs.AI · 2026-05-27 · unverdicted · novelty 7.0

PRO-CUA trains CUAs via decoupled on-policy rollouts and PRM-guided step-level optimization to enable dense credit assignment without expert trajectories or golden answers.

StainFlow: Entity-Stain Tracking and Evidence Linking for Process Rewards in GUI Agents

cs.AI · 2026-06-05 · unverdicted · novelty 5.0

StainFlow proposes global entity-stain tracking and local stain-evidence linking modules to improve process rewards for GUI agents, reporting a 3.2% relative gain in online RL success and 1.8% in judgment accuracy on AndroidWorld and OGRBench.

Xiaomi-GUI-0 Technical Report

cs.AI · 2026-06-30 · unverdicted · novelty 4.0 · 2 refs

Xiaomi-GUI-0 reports 72.0% success on RealMobile and 78.9% on AndroidWorld via real-device closed-loop training with multi-source data and a three-stage RL pipeline.

citing papers explorer

A History-Aware Visually Grounded Critic for Computer Use Agents · cs.AI · 2026-06-09 · unverdicted · none · ref 43
PRO-CUA: Process-Reward Optimization for Computer Use Agents · cs.AI · 2026-05-27 · unverdicted · none · ref 6
StainFlow: Entity-Stain Tracking and Evidence Linking for Process Rewards in GUI Agents · cs.AI · 2026-06-05 · unverdicted · none · ref 37
Xiaomi-GUI-0 Technical Report · cs.AI · 2026-06-30 · unverdicted · none · ref 46 · 2 links
