Pith Number

pith:5T3JCLYX

pith:2026:5T3JCLYXK3UK6MSXLGSJEUWPXL
not attested · not anchored · not stored · refs pending

Hölder Policy Optimisation

Chenyang Le, Dingli Liang, Jiachen Zhu, Jianghao Lin, Jun Wang, Lingyu Yang, Weinan Zhang, Yihang Chen, Yuxiang Chen, Zhaokai Wang, Ziqin Gong

HölderPO resolves GRPO's aggregation trade-off by using a tunable Hölder mean with annealed parameter p to control gradient concentration and variance.

arxiv:2605.12058 v2 · 2026-05-12 · cs.LG · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{5T3JCLYXK3UK6MSXLGSJEUWPXL}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Our approach achieves a state-of-the-art average accuracy of 54.9% across multiple mathematical benchmarks, yielding a substantial 7.2% relative gain over standard GRPO, and secures an exceptional 93.8% success rate on ALFWorld.

C2 · weakest assumption

That modulating the single scalar p via annealing will reliably resolve the concentration-stability trade-off across different model sizes, tasks, and sampling budgets without introducing new failure modes not captured by the reported experiments.

C3 · one line summary

HölderPO unifies token aggregation in GRPO via the Hölder mean with dynamic p annealing, reporting 54.9% average math-benchmark accuracy and 93.8% ALFWorld success.
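
The claims above turn on the Hölder (power) mean and an annealed exponent p. As a minimal sketch of that building block only, the Python below computes the standard power mean and a linear schedule for p. This record does not state which per-token quantities HölderPO aggregates or how it anneals p, so the function names `holder_mean` and `linear_anneal`, the linear schedule, and the sample values are all illustrative assumptions, not the authors' implementation.

```python
import math


def holder_mean(values, p):
    """Power (Hölder) mean of strictly positive values with exponent p.

    p = 1 is the arithmetic mean, p -> 0 the geometric mean,
    p -> +inf the maximum and p -> -inf the minimum.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    n = len(values)
    if p == 0:
        # Limit case: geometric mean, computed in log space.
        return math.exp(sum(math.log(v) for v in values) / n)
    if math.isinf(p):
        return max(values) if p > 0 else min(values)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def linear_anneal(step, total_steps, p_start, p_end):
    """Hypothetical schedule: move p linearly from p_start to p_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return p_start + frac * (p_end - p_start)


if __name__ == "__main__":
    token_scores = [0.2, 0.5, 0.9, 1.4]  # made-up positive per-token values
    for step in (0, 50, 100):
        p = linear_anneal(step, 100, p_start=4.0, p_end=1.0)
        print(step, p, holder_mean(token_scores, p))
```

Larger p weights the aggregate towards the largest entries and smaller p towards the smallest, which is the sense in which a single scalar can trade concentration against stability. The power mean is non-decreasing in p for fixed positive inputs.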

Receipt and verification
First computed 2026-05-22T01:04:06.183005Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

ecf6912f1756e8af325759a49252cfbaef349a6d5a9e0e4e2b22fa8b61802a22

Aliases

arxiv: 2605.12058 · arxiv_version: 2605.12058v2 · doi: 10.48550/arxiv.2605.12058 · pith_short_12: 5T3JCLYXK3UK · pith_short_16: 5T3JCLYXK3UK6MSX · pith_short_8: 5T3JCLYX
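
In the aliases listed here, each `pith_short_N` value is the first N characters of the full Pith Number. A small sketch of that observation follows. The helper name `short_alias` is hypothetical, and the prefix rule is inferred from this one record rather than from a published specification.

```python
PITH_NUMBER = "5T3JCLYXK3UK6MSXLGSJEUWPXL"


def short_alias(pith_number, length):
    """Return the leading `length` characters of a Pith Number (assumed rule)."""
    if not 0 < length <= len(pith_number):
        raise ValueError("length must be between 1 and the full identifier length")
    return pith_number[:length]


if __name__ == "__main__":
    for n in (8, 12, 16):
        print(f"pith_short_{n}: {short_alias(PITH_NUMBER, n)}")
```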
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/5T3JCLYXK3UK6MSXLGSJEUWPXL \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ecf6912f1756e8af325759a49252cfbaef349a6d5a9e0e4e2b22fa8b61802a22
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f74a957ad172eea0f389658e2d4e70c908f04e77b1b0ae047729822bbb14d558",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-12T12:45:03Z",
    "title_canon_sha256": "69378e129fc2f4f34aa8ffab9c29b8251f76273548881c5cac738d2b94b74a9a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12058",
    "kind": "arxiv",
    "version": 2
  }
}
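
For an offline check, the same canonicalisation used in the curl one-liner (sorted keys, compact separators, UTF-8, SHA-256) can be applied to the record shown above. This is a sketch: the function name `canonical_sha256` is an assumption, and the digest only matches the Canonical hash if the JSON displayed here is identical to the served `canonical_record`.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record as compact JSON with sorted keys, encoded as UTF-8."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "f74a957ad172eea0f389658e2d4e70c908f04e77b1b0ae047729822bbb14d558",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-12T12:45:03Z",
        "title_canon_sha256": "69378e129fc2f4f34aa8ffab9c29b8251f76273548881c5cac738d2b94b74a9a",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12058", "kind": "arxiv", "version": 2},
}

if __name__ == "__main__":
    # Compare the printed digest with the Canonical hash listed on this page.
    print(canonical_sha256(RECORD))
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields are written.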