GUI-PRA: Process Reward Agent for GUI Tasks

Tao Xiong, Xavier Hu, Yurun Chen, Yuhang Liu, Changqiao Wu, Pengzhi Gao, Wei Liu, Jian Luan, Shengyu Zhang · 2025 · arXiv 2509.23263

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

PRO-CUA: Process-Reward Optimization for Computer Use Agents

cs.AI · 2026-05-27 · unverdicted · novelty 7.0

PRO-CUA trains CUAs via decoupled on-policy rollouts and PRM-guided step-level optimization to enable dense credit assignment without expert trajectories or golden answers.

Xiaomi-GUI-0 Technical Report

cs.AI · 2026-06-30 · unverdicted · novelty 4.0

Xiaomi-GUI-0 reports 72.0% success on an in-house real-mobile benchmark and 78.9% on AndroidWorld after training a GUI agent in a real-device closed loop with an error-driven data flywheel and three-stage RL pipeline.

citing papers explorer

Showing 2 of 2 citing papers.

PRO-CUA: Process-Reward Optimization for Computer Use Agents
cs.AI · 2026-05-27 · unverdicted · none · ref 6
Xiaomi-GUI-0 Technical Report
cs.AI · 2026-06-30 · unverdicted · none · ref 50
