Pith Number

pith:XKO3DYKM

pith:2026:XKO3DYKMVIOKZM5HX6TSVZFV26

not attested · not anchored · not stored · refs resolved

What to Ignore, What to React: Visually Robust RL Fine-Tuning of VLA Models

Chuheng Zhang, Jiang Bian, Jingjing Fu, Jun Zhang, Ling Zhang, Li Zhao, Mingyu Liu, Rui Wang, Yuanfang Peng

PAIR-VLA adds invariance and sensitivity objectives over paired visual variants to improve RL fine-tuning of VLA models under visual shifts.

arxiv:2605.13105 v1 · 2026-05-13 · cs.RO

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{XKO3DYKMVIOKZM5HX6TSVZFV26}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp

2. Internet Archive

3. Author claim: open

4. Citations: open

5. Replications: open

✓ Portable graph bundle: live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1: strongest claim

Our method consistently improves over standard PPO, achieving average improvements of 16.62% on π0.5 and 9.10% on OpenVLA across diverse out-of-distribution visual shifts.

C2: weakest assumption

That paired visual variants (task-preserving and task-altering) can be reliably generated or labeled during training to provide accurate behavior-level supervision without introducing new biases.

C3: one-line summary

PAIR-VLA adds invariance and sensitivity objectives over paired visual variants during PPO fine-tuning of VLA models, yielding 9-16% average gains on ManiSkill3 under distractor, texture, pose, viewpoint, and lighting shifts.
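
The summary above does not give the form of the two objectives, so the following is only an illustrative sketch of how an invariance term and a sensitivity term might be attached to a PPO loss. Everything in it is an assumption made for illustration: the use of KL divergence over discrete action distributions, the hinge with a `margin`, the weights `lam_inv` and `lam_sens`, and the function names. It should not be read as the paper's actual formulation.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions (lists of probabilities)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def paired_loss(ppo_loss, pi_orig, pi_preserving, pi_altering,
                lam_inv=1.0, lam_sens=1.0, margin=0.5):
    """Hypothetical combined objective (not the paper's definition).

    pi_orig:       policy output on the original observation
    pi_preserving: policy output on a task-preserving visual variant
    pi_altering:   policy output on a task-altering visual variant
    """
    # Invariance: behavior should not change under a task-preserving variant.
    invariance = kl_divergence(pi_orig, pi_preserving)
    # Sensitivity: behavior should change by at least `margin` under a
    # task-altering variant; the hinge is zero once that is satisfied.
    sensitivity = max(0.0, margin - kl_divergence(pi_orig, pi_altering))
    return ppo_loss + lam_inv * invariance + lam_sens * sensitivity

# A policy that ignores a task-altering change is penalized by the full margin.
unresponsive = paired_loss(0.0, [0.5, 0.5], [0.5, 0.5], [0.5, 0.5])
# A policy that stays put on the preserving variant and reacts to the altering
# one adds no penalty on top of the PPO loss.
responsive = paired_loss(1.0, [0.9, 0.1], [0.9, 0.1], [0.1, 0.9])
```

In this toy setup the penalty separates the two failure modes named in the title: reacting to what should be ignored raises the invariance term, and ignoring what should be reacted to raises the sensitivity term.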

References

43 extracted · 43 resolved · 9 Pith anchors

[1] Open X-Embodiment: Robotic Learning Datasets and RT-X Models (Open X-Embodiment Collaboration) 2024

[2] DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset 2024 · arXiv:2403.12945

[3] RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control 2023

[4] Octo: An Open-Source Generalist Robot Policy 2024

[5] $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control 2024 · arXiv:2410.24164

Receipt and verification

First computed	2026-05-18T03:08:58.185087Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

ba9db1e14caa1cacb3a7bfa72ae4b5d7ba02910f260d89f95f7fc67d2f8b6f37

Aliases

arxiv: 2605.13105 · arxiv_version: 2605.13105v1 · doi: 10.48550/arxiv.2605.13105 · pith_short_12: XKO3DYKMVIOK · pith_short_16: XKO3DYKMVIOKZM5H · pith_short_8: XKO3DYKM
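
The identifiers on this record appear to be related to the canonical hash in a simple way: Base32-encoding the SHA-256 digest (RFC 4648 alphabet) and keeping the first 26 characters gives the Pith Number shown here, and the `pith_short_*` aliases are prefixes of it. This relationship is inferred from the values on this page, not taken from a published specification, so the sketch below is a consistency check rather than a description of how the builder works. The function name `pith_number_from_hash` is hypothetical.

```python
import base64

def pith_number_from_hash(canonical_hash_hex, length=26):
    """Base32-encode a hex digest and keep the leading `length` characters.

    Assumption: this mirrors the pattern observed on this record only.
    """
    raw = bytes.fromhex(canonical_hash_hex)
    return base64.b32encode(raw).decode("ascii")[:length]

canonical_hash = "ba9db1e14caa1cacb3a7bfa72ae4b5d7ba02910f260d89f95f7fc67d2f8b6f37"
pith = pith_number_from_hash(canonical_hash)

# Short aliases as prefixes of the full identifier.
aliases = {f"pith_short_{n}": pith[:n] for n in (8, 12, 16)}

print(pith)
print(aliases)
```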

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/XKO3DYKMVIOKZM5HX6TSVZFV26 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ba9db1e14caa1cacb3a7bfa72ae4b5d7ba02910f260d89f95f7fc67d2f8b6f37

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "2216604d31c8e863c8e4a104c63fcf7c24903ce67f6a7380cbd2b52faf0e1fc4",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2026-05-13T07:15:37Z",
    "title_canon_sha256": "974139390ca7ed35f9d6b6dc7187f4d9158adabc2396863129e1be71ce3dadfc"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13105",
    "kind": "arxiv",
    "version": 1
  }
}
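
The serialization used by the verification command (sorted keys, compact separators, UTF-8 without ASCII escaping) can also be applied offline to the record above. The sketch below does that in plain Python; the helper name `canonical_sha256` is hypothetical. Whether the digest matches the published canonical hash depends on the record printed here being exactly what the resolver returns, which this example does not check.

```python
import hashlib
import json

def canonical_sha256(record):
    """Hash a JSON-compatible object using sorted keys and compact separators."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "2216604d31c8e863c8e4a104c63fcf7c24903ce67f6a7380cbd2b52faf0e1fc4",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2026-05-13T07:15:37Z",
        "title_canon_sha256": "974139390ca7ed35f9d6b6dc7187f4d9158adabc2396863129e1be71ce3dadfc",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13105", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the canonical hash listed on this record
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields.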