pith:CHS6VQH4
Breaking $\textit{Winner-Takes-All}$: Cooperative Policy Optimization Improves Diverse LLM Reasoning
GCPO replaces individual rollout competition with team-level credit assignment based on contributions to collective solution coverage.
arxiv:2605.11461 v2 · 2026-05-12 · cs.AI · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{CHS6VQH4UVNWF746MQKITXQQC2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Experiments across multiple reasoning benchmarks demonstrate that GCPO significantly improves both reasoning accuracy and solution diversity over existing approaches.
The method rests on the assumption that the determinant volume over reward-weighted semantic embeddings provides a reliable, unbiased measure of non-redundant solution coverage, and that its marginal contributions can be computed and redistributed without introducing new optimization pathologies or sensitivity to embedding choice.
GCPO shifts RLVR from rollout competition to team cooperation by assigning advantages via marginal contributions to a determinant-based coverage volume over semantic embeddings, yielding higher accuracy and solution diversity on reasoning benchmarks.
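The credit-assignment idea in the claims above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the `ridge` regularizer, the use of a log-determinant, and the leave-one-out definition of marginal contribution are all assumptions chosen for illustration. Each rollout's embedding is scaled by its reward, the group's coverage is taken as the log-determinant of the regularized Gram matrix, and a rollout's contribution is the drop in coverage when it is removed.

```python
import math


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(s)
                total += 2.0 * math.log(lower[i][j])
            else:
                lower[i][j] = s / lower[j][j]
    return total


def coverage_volume(embeddings, rewards, ridge=1.0):
    """Hypothetical coverage score: log det(ridge * I + V V^T),
    where row i of V is rewards[i] * embeddings[i] (an assumed weighting)."""
    vectors = [[r * x for x in e] for e, r in zip(embeddings, rewards)]
    n = len(vectors)
    gram = [
        [
            sum(a * b for a, b in zip(vectors[i], vectors[j]))
            + (ridge if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return log_det(gram)


def marginal_contributions(embeddings, rewards, ridge=1.0):
    """Leave-one-out contribution of each rollout to the group's coverage."""
    full = coverage_volume(embeddings, rewards, ridge)
    contributions = []
    for i in range(len(embeddings)):
        rest_e = embeddings[:i] + embeddings[i + 1:]
        rest_r = rewards[:i] + rewards[i + 1:]
        contributions.append(full - coverage_volume(rest_e, rest_r, ridge))
    return contributions


# Toy group: rollouts 0 and 1 share an embedding, rollout 2 is distinct.
toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_rewards = [1.0, 1.0, 1.0]
scores = marginal_contributions(toy_embeddings, toy_rewards)
print(scores)
```

In this toy group the two redundant rollouts receive a smaller contribution than the distinct one, and a rollout with zero reward contributes nothing under this weighting. How GCPO actually normalizes these contributions into advantages is not stated on this page.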
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:04:35.997203Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
11e5eac0fca55b62ff9e641489de10169af9b1dc277c8d906bbd50b85a04bcae
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/CHS6VQH4UVNWF746MQKITXQQC2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 11e5eac0fca55b62ff9e641489de10169af9b1dc277c8d906bbd50b85a04bcae
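The hashing step of the command above can also be written as a short standalone Python script, shown here as a sketch. It applies the same serialization as the one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) to a local copy of the canonical record shown on this page. The helper name `canonical_sha256` is hypothetical, and whether the digest matches the published hash depends on the local copy being identical to the record the server returns.

```python
import hashlib
import json


def canonical_sha256(record):
    """SHA-256 hex digest of a record serialized the same way as the
    one-liner: sorted keys, compact separators, non-ASCII kept as-is."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()


# Local copy of the canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a0d9d3e90b3fd7695e1ed94d53f540403a43942bc2635c99228a2160b908a606",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-12T03:20:24Z",
        "title_canon_sha256": "a553f839b835ffb46dcb0fb8d8eb8c3995e64ab782e4b0ea08b0881da51eb5f3",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.11461", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare by hand against the canonical hash listed above
```

Because keys are sorted before hashing, the order in which the record's fields are written locally does not affect the digest.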
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a0d9d3e90b3fd7695e1ed94d53f540403a43942bc2635c99228a2160b908a606",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-12T03:20:24Z",
    "title_canon_sha256": "a553f839b835ffb46dcb0fb8d8eb8c3995e64ab782e4b0ea08b0881da51eb5f3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.11461",
    "kind": "arxiv",
    "version": 2
  }
}