pith:TVTKIJ5S
High entropy leads to symmetry-equivariant policies in Dec-POMDPs
Sufficiently high entropy regularization in any Dec-POMDP makes policy gradient flow with tabular softmax converge to the same symmetry-equivariant joint policy from every initialization.
arxiv:2511.22581 v5 · 2025-11-27 · cs.LG · cs.MA
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{TVTKIJ5SFEPIQA27ODCVPOUP2Y}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We prove that in any Dec-POMDP, sufficiently high entropy regularization ensures that the policy gradient flow with tabular softmax parametrization always converges, for any initialization, to the same joint policy, and that this joint policy is equivariant w.r.t. all symmetries of the Dec-POMDP.
The result rests on the assumption that entropy regularization is 'sufficiently high' to force convergence to the unique equivariant policy under tabular softmax parametrization in arbitrary Dec-POMDPs.
High entropy regularization guarantees convergence to symmetry-equivariant policies in Dec-POMDPs, making cross-play returns match self-play returns.
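To make the claim concrete, here is a minimal sketch on a toy problem, not the paper's construction: a one-step, two-agent coordination game treated as a trivial Dec-POMDP, with an Euler discretization standing in for the gradient flow. The game, the function names (`softmax`, `grad_logits`, `run_flow`), the step size, and the two `alpha` values are all illustrative assumptions; in particular, `alpha=2.0` is simply large enough for this toy game and is not the paper's bound.

```python
import math

# Toy assumption: both agents get reward 1 if they pick the same action.
# Swapping action labels 0 <-> 1 for both agents is a symmetry of this game.
# The matrix is symmetric, so the same lookup serves either agent.
REWARD = [[1.0, 0.0], [0.0, 1.0]]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_logits(own_logits, other_policy, alpha):
    """Gradient of (expected reward + alpha * entropy) w.r.t. one agent's logits."""
    pi = softmax(own_logits)
    # Expected reward of each own action against the other agent's policy.
    q = [sum(other_policy[b] * REWARD[a][b] for b in range(2)) for a in range(2)]
    # Entropy-regularized value per action; constant terms cancel in the baseline.
    soft = [q[a] - alpha * math.log(pi[a]) for a in range(2)]
    baseline = sum(pi[a] * soft[a] for a in range(2))
    return [pi[a] * (soft[a] - baseline) for a in range(2)]


def run_flow(init1, init2, alpha, step=0.1, iters=5000):
    """Euler steps approximating the gradient flow; returns both policies."""
    th1, th2 = list(init1), list(init2)
    for _ in range(iters):
        p1, p2 = softmax(th1), softmax(th2)
        g1 = grad_logits(th1, p2, alpha)
        g2 = grad_logits(th2, p1, alpha)
        th1 = [t + step * g for t, g in zip(th1, g1)]
        th2 = [t + step * g for t, g in zip(th2, g2)]
    return softmax(th1), softmax(th2)


# High regularization: two very different initializations.
high_a = run_flow([3.0, 0.0], [0.0, 3.0], alpha=2.0)
high_b = run_flow([0.0, 3.0], [0.0, 3.0], alpha=2.0)
# Low regularization: initializations biased toward different conventions.
low_a = run_flow([1.0, 0.0], [1.0, 0.0], alpha=0.05)
low_b = run_flow([0.0, 1.0], [0.0, 1.0], alpha=0.05)

print("high alpha:", high_a, high_b)
print("low alpha: ", low_a, low_b)
```

In this toy game the high-`alpha` runs should both end at the uniform policy, which is unchanged by the action-relabeling symmetry, while the low-`alpha` runs settle on different conventions depending on where they start. That second case is the situation in which independently trained agents can fail to coordinate in cross-play.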
Receipt and verification
| First computed | 2026-06-08T01:03:51.786624Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
9d66a427b2291e88035f70c557ba8fd62f3c5c1308ccb732e2ff94724e1b4d49
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/TVTKIJ5SFEPIQA27ODCVPOUP2Y \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9d66a427b2291e88035f70c557ba8fd62f3c5c1308ccb732e2ff94724e1b4d49
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "819c6b1c9354a7f25db8dc7ab33bd055709d415f1534be8346080dd82694cc12",
    "cross_cats_sorted": [
      "cs.MA"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-11-27T16:13:27Z",
    "title_canon_sha256": "c2a96431595a94e9af6a69640f799e7d9b707b5ec410a680d7dc97b49647592e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2511.22581",
    "kind": "arxiv",
    "version": 5
  }
}
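The hashing step of the verification command above can also be run offline against the record as printed here. The following is a minimal sketch that mirrors the serialization used in that command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes); the helper name `canonical_sha256` is hypothetical and not part of any Pith tooling.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record using the same canonical JSON form as the shell one-liner."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# The canonical record, copied from the JSON shown above.
record = {
    "metadata": {
        "abstract_canon_sha256": "819c6b1c9354a7f25db8dc7ab33bd055709d415f1534be8346080dd82694cc12",
        "cross_cats_sorted": ["cs.MA"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2025-11-27T16:13:27Z",
        "title_canon_sha256": "c2a96431595a94e9af6a69640f799e7d9b707b5ec410a680d7dc97b49647592e",
    },
    "schema_version": "1.0",
    "source": {"id": "2511.22581", "kind": "arxiv", "version": 5},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye against the canonical hash listed on this page
```

Because keys are sorted and whitespace is stripped before hashing, the indentation and key order of the JSON as displayed should not affect the digest.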