Pith Number

pith:OZAGJFTR

pith:2025:OZAGJFTRZKRQMIICMHPS45JVFT

Status: not attested · not anchored · not stored · refs resolved

UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning

Guanjing Xiong, Han Xiao, Hao Wang, Hongsheng Li, Liang Liu, Shuai Ren, Xi Yin, Yaxuan Guo, Yuxiang Chai, Zhengxi Lu

Rule-based RL on 136 GUI tasks raises a 3B multimodal model's action-prediction accuracy by up to 22%.

arxiv:2503.21620 v5 · 2025-03-27 · cs.AI

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{OZAGJFTRZKRQMIICMHPS45JVFT}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp

2. Internet Archive

3. Author claim: open

4. Citations: open

5. Replications: open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1: strongest claim

UI-R1-3B achieves significant improvements over the base model (Qwen2.5-VL-3B) on both in-domain and out-of-domain tasks, with average accuracy gains of 22.1% on ScreenSpot, 6.0% on ScreenSpot-Pro, and 12.7% on AndroidControl.

C2: weakest assumption

The rule-based action reward provides sufficient and unbiased supervision for policy optimization across diverse GUI tasks without post-hoc adjustments or hidden data selection.

C3: one-line summary

UI-R1 shows rule-based RL with GRPO on 136 GUI tasks improves a 3B MLLM's action prediction accuracy by 6-22% over its base model and matches larger SFT-trained models.

References

18 extracted · 18 resolved · 9 Pith anchors

[1] L1: Controlling how long a reasoning model thinks with reinforcement learning

[2] arXiv preprint arXiv:2407.17490

[3] VisRL: Intention-driven visual perception via reinforced reasoning · 2025

[4] Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents · arXiv:2410.05243

[5] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning · arXiv:2501.12948

Formal links

2 machine-checked theorem links

Cited by

30 papers in Pith

Grounded Reinforcement Learning for Visual Reasoning

ROSE: Rollout On Serving GPUs via Cooperative Elasticity for Agentic RL

PAGER: Bridging the Semantic-Execution Gap in Point-Precise Geometric GUI Control

CaptchaMind: Training CAPTCHA Solvers via Reinforcement Learning with Explicit Reasoning Supervision

SE-GA: Memory-Augmented Self-Evolution for GUI Agents

Receipt and verification

First computed	2026-05-17T23:38:48.089144Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

7640649671caa306210261df2e75352cead9fd9a1f5cd95a5f808466d5097937

Aliases

arxiv: 2503.21620 · arxiv_version: 2503.21620v5 · doi: 10.48550/arxiv.2503.21620 · pith_short_12: OZAGJFTRZKRQ · pith_short_16: OZAGJFTRZKRQMIIC · pith_short_8: OZAGJFTR
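
The aliases above appear to be related in a simple way: the short forms are prefixes of the full 26-character Pith Number, and that number in turn matches the first 26 characters of the base32 encoding of the canonical hash listed above. This is an observation from the values on this page, not a rule quoted from the Pith schema, so the following is only a minimal sketch under that assumption. The function names `pith_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "7640649671caa306210261df2e75352cead9fd9a1f5cd95a5f808466d5097937"
PITH_NUMBER = "OZAGJFTRZKRQMIICMHPS45JVFT"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the first `length` characters.

    Assumption: this mirrors how the identifier on this page relates to its
    canonical hash; it is inferred from the data, not from a specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict[str, str]:
    """Derive the 8-, 12- and 16-character aliases by prefix truncation."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    # Compare the derived identifier with the one printed on this page.
    print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    print(short_aliases(PITH_NUMBER))
```

If the assumption holds for other records, any alias can be checked against the canonical hash without contacting the resolver.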

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/OZAGJFTRZKRQMIICMHPS45JVFT \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7640649671caa306210261df2e75352cead9fd9a1f5cd95a5f808466d5097937

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "996e1401c55cb49fa859097baae52738828675a6956b602244bf691972e2c774",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-03-27T15:39:30Z",
    "title_canon_sha256": "abfd2d349b1b9ba15cf5fb12e0034980e0086d0d412dfc67f35fc9c4a60ab860"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2503.21620",
    "kind": "arxiv",
    "version": 5
  }
}
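
The verification command above can also be unpacked into a small offline script. The sketch below applies the same canonicalization (sorted keys, compact separators, UTF-8 without ASCII escaping) to the canonical record printed above and hashes it with SHA-256. It assumes the record shown on this page is exactly what the resolver returns under `canonical_record`; the helper name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

# Canonical record as printed on this page (assumed to match the resolver output).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "996e1401c55cb49fa859097baae52738828675a6956b602244bf691972e2c774",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-03-27T15:39:30Z",
    "title_canon_sha256": "abfd2d349b1b9ba15cf5fb12e0034980e0086d0d412dfc67f35fc9c4a60ab860"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2503.21620",
    "kind": "arxiv",
    "version": 5
  }
}
"""

# Canonical hash listed in the receipt section of this page.
EXPECTED = "7640649671caa306210261df2e75352cead9fd9a1f5cd95a5f808466d5097937"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(json.loads(RECORD_JSON))
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the input JSON, which is what lets a mirror recompute the same value.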