Pith Number

pith:RWN7GKM5

pith:2026:RWN7GKM5UGUN4YJWNKD4K5VZJF

not attested · not anchored · not stored · refs resolved

SE-GA: Memory-Augmented Self-Evolution for GUI Agents

Lanjun Wang, Shilong Jin, Zhuosheng Zhang

The SE-GA framework lets GUI agents self-evolve by retrieving memories at test time and retraining on the resulting data to reach higher success rates on multi-step tasks.

arxiv:2605.16883 v1 · 2026-05-16 · cs.LG

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{RWN7GKM5UGUN4YJWNKD4K5VZJF}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
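The merge algorithm itself is not described on this page. The following is a minimal sketch of one common way to make a merge deterministic, not Pith's actual algorithm: give every event a fixed position in a total order (timestamp first, then a content hash as tie-breaker) and fold the events over the canonical record. The event fields `at`, `field`, and `value`, the functions `event_key` and `merge_state`, and all sample values are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def event_key(event):
    """Sort key that gives each event a fixed position, whatever order it arrived in."""
    # Hypothetical event shape: {"at": ISO-8601 UTC string, "field": str, "value": any}.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return (event["at"], hashlib.sha256(payload).hexdigest())


def merge_state(record, events):
    """Fold events over a copy of the record in sorted order (last writer wins)."""
    state = dict(record)
    for event in sorted(events, key=event_key):
        state[event["field"]] = event["value"]
    return state


# Illustrative values only; these are not taken from the real bundle.
record = {"source": "arxiv:2605.16883", "citations": "open"}
events = [
    {"at": "2026-05-21T00:00:00Z", "field": "citations", "value": "closed"},
    {"at": "2026-05-20T00:00:00Z", "field": "replications", "value": "open"},
]

# Two mirrors that receive the same events in different orders reach the same state.
state_a = merge_state(record, events)
state_b = merge_state(record, list(reversed(events)))
print(state_a == state_b)
```

Sorting before folding is what removes any dependence on arrival order; a real implementation would also need to verify each event's signature before applying it.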

Claims

C1: strongest claim

SE-GA achieves state-of-the-art performance, reaching success rates of 89.0% on ScreenSpot and 75.8% on the challenging AndroidControl-High dataset, along with significant improvements on AndroidWorld.

C2: weakest assumption

The data collected by TTME during inference is of sufficient quality and diversity to stabilize and enhance the foundational policy through the MASE training pipeline without introducing harmful biases or catastrophic forgetting.

C3: one-line summary

SE-GA combines Test-Time Memory Extension for dynamic context retrieval with Memory-Augmented Self-Evolution training to reach 89.0% on ScreenSpot and 75.8% on AndroidControl-High.

References

54 extracted · 54 resolved · 23 Pith anchors

[1] Charles Beattie, Thomas Köppe, Edgar A 2018 · arXiv:1707.01495

[2] Our 3.5 models and computer use 2024

[3] Qwen2.5-VL Technical Report 2025 · arXiv:2502.13923

[4] Amex: Android multi-annotation expo dataset for mobile gui agents 2025 · doi:10.18653/v1/2025.findings-acl.110

[5] Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling 2024 · arXiv:2412.05271

Formal links

2 machine-checked theorem links

Receipt and verification

First computed	2026-05-20T00:03:28.132282Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

8d9bf3299da1a8de61366a87c576b9495d9a8a544022a02789be91016fe977cd

Aliases

arxiv: 2605.16883 · arxiv_version: 2605.16883v1 · doi: 10.48550/arxiv.2605.16883 · pith_short_12: RWN7GKM5UGUN · pith_short_16: RWN7GKM5UGUN4YJW · pith_short_8: RWN7GKM5
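The identifiers on this page appear to be related: the Base32 (RFC 4648) encoding of the canonical hash begins with the 26-character Pith Number, and the short aliases are prefixes of that identifier. The relationship holds for the values of this record; treating it as the general derivation rule is an assumption, since the page does not state one. A minimal Python sketch that checks it:

```python
import base64

# Values copied from this record.
canonical_hash = "8d9bf3299da1a8de61366a87c576b9495d9a8a544022a02789be91016fe977cd"
pith_number = "RWN7GKM5UGUN4YJWNKD4K5VZJF"

# Base32-encode the raw 32 hash bytes and keep the first 26 characters.
# The length 26 is read off the identifier shown here, not taken from a spec.
encoded = base64.b32encode(bytes.fromhex(canonical_hash)).decode("ascii")
derived = encoded[:26]
print(derived == pith_number)

# Short aliases as prefixes of the full identifier.
aliases = {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}
print(aliases)
```

If the assumption is right, a client holding only the canonical hash could reconstruct the identifier and every short alias without a lookup.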

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/RWN7GKM5UGUN4YJWNKD4K5VZJF \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8d9bf3299da1a8de61366a87c576b9495d9a8a544022a02789be91016fe977cd

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "92e4a4b010941537fa510bc1d347dccda9f84b9477ca68019c2669b6979fecd9",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-16T08:51:57Z",
    "title_canon_sha256": "fde63058e5b2c82fcb5a9970e45e6402dbd520321982627d335c5e54f6f532a0"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16883",
    "kind": "arxiv",
    "version": 1
  }
}
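
The verification command above can also be run offline against the JSON shown here. This sketch assumes that the record printed on this page is the complete canonical record and that canonicalization is exactly what the one-liner does (sorted keys, compact separators, UTF-8 without ASCII escaping). The helper name `canonical_sha256` is ours, not part of any Pith tooling.

```python
import hashlib
import json

# Hash stated on this page under "Canonical hash".
EXPECTED = "8d9bf3299da1a8de61366a87c576b9495d9a8a544022a02789be91016fe977cd"

# Canonical record copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "92e4a4b010941537fa510bc1d347dccda9f84b9477ca68019c2669b6979fecd9",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-16T08:51:57Z",
    "title_canon_sha256": "fde63058e5b2c82fcb5a9970e45e6402dbd520321982627d335c5e54f6f532a0"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16883",
    "kind": "arxiv",
    "version": 1
  }
}
"""


def canonical_sha256(record):
    """Serialize with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    data = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


digest = canonical_sha256(json.loads(RECORD_JSON))
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the result does not depend on the key order or indentation of the JSON as displayed; a mismatch would suggest the displayed record differs from the one that was signed, or that the canonicalization assumption is wrong.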