PhoneBuddy combines real-app and mock-app RL after shared SFT, raising real-phone task success from 36.67% to 45.33% and AndroidWorld from 60.3% to 83.2%.
Step: Success- rate-aware trajectory-efficient policy optimization
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 5verdicts
UNVERDICTED 5roles
background 2polarities
background 2representative citing papers
PhoneWorld is a pipeline that converts real mobile trajectories into scalable controllable environments, yielding large gains on four benchmarks when used to supplement training data.
ExpThink reduces average CoT response length by up to 77% while improving accuracy on math benchmarks via experience-guided reward shaping and difficulty-adaptive advantage in RL.
Xiaomi-GUI-0 reports 72.0% success on RealMobile and 78.9% on AndroidWorld via real-device closed-loop training with multi-source data and three-stage RL pipeline.
World models trained on delta text, full text, diffusion images, and renderable code achieve SoTA on two benchmarks and improve downstream GUI agent performance on three mobile datasets with modality-specific strengths.
citing papers explorer
-
Xiaomi-GUI-0 Technical Report
Xiaomi-GUI-0 reports 72.0% success on RealMobile and 78.9% on AndroidWorld via real-device closed-loop training with multi-source data and three-stage RL pipeline.
-
How Mobile World Model Guides GUI Agents?
World models trained on delta text, full text, diffusion images, and renderable code achieve SoTA on two benchmarks and improve downstream GUI agent performance on three mobile datasets with modality-specific strengths.