pith:DM6LXZ55
Matrix-Space Reinforcement Learning for Reusing Local Transition Geometry
Positive semidefinite matrix descriptors of trajectory segments let reinforcement learning agents reuse local transition geometry across tasks.
arxiv:2605.14304 v1 · 2026-05-14 · cs.LG · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{DM6LXZ552J4LVENK4XZ3IUT3PX}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We prove that the descriptor is well defined up to coordinate gauge, complete for the induced low-order additive signal class, additive under valid segment composition, and minimally sufficient among admissible additive descriptors. We further show that conditioning value functions on the trajectory-segment matrix yields a first-order smooth approximation of action values, enabling source-learned matrix-to-value mappings to bootstrap learning in new tasks. Empirically, MSRL achieves the best average finite-budget target AUC of 0.73.
Key assumption: the positive semidefinite matrix descriptors, which aggregate first- and second-order statistics of lifted one-step transitions, actually expose shared hidden structure that supports valid algebraic composition and useful transfer across tasks.
MSRL represents trajectory segments as PSD matrices, proves that these descriptors compose additively, and uses them to bootstrap value functions for better transfer, reaching an average target AUC of 0.73 versus 0.57-0.65 for baselines.
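The claims above describe the descriptor only in words, so a small illustration may help. The following is a minimal sketch, not the paper's construction: it assumes the "lift" of a transition is the vector [1, state, action, next state] and that a segment's descriptor is the sum of outer products of those lifted vectors. Under that assumption the leading row holds first-order sums, the remaining block holds second-order sums, the matrix is positive semidefinite, and concatenating segments adds their descriptors. The names `lift`, `descriptor`, `add` and `quad_form` are hypothetical, and the paper's conditions for a "valid" composition are not modeled here.

```python
# Hypothetical sketch: the lift and every name here are assumptions,
# not definitions taken from the paper.

def lift(state, action, next_state):
    """Lift one transition to a vector with a leading constant 1 (assumed lift)."""
    return [1.0, *state, *action, *next_state]

def descriptor(segment):
    """Sum of outer products phi * phi^T over the transitions of a segment."""
    dim = len(lift(*segment[0]))
    M = [[0.0] * dim for _ in range(dim)]
    for transition in segment:
        phi = lift(*transition)
        for i in range(dim):
            for j in range(dim):
                M[i][j] += phi[i] * phi[j]
    return M

def add(A, B):
    """Entrywise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def quad_form(M, x):
    """Quadratic form x^T M x; nonnegative for every x when M is PSD."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

# Two toy segments of (state, action, next_state) tuples with 1-D components.
seg_a = [((0.0,), (1.0,), (0.5,)), ((0.5,), (-1.0,), (0.25,))]
seg_b = [((0.25,), (1.0,), (0.75,))]

M_a, M_b = descriptor(seg_a), descriptor(seg_b)
M_ab = descriptor(seg_a + seg_b)   # descriptor of the concatenated segment

print(M_ab == add(M_a, M_b))                       # additivity under concatenation
print(M_ab[0][0])                                  # top-left entry counts transitions
print(quad_form(M_ab, [1.0, -2.0, 0.5, 3.0]) >= 0)  # one PSD spot check
```

The toy values are dyadic fractions, so the additivity comparison is exact in floating point; with arbitrary data a tolerance-based comparison would be the safer choice.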
Receipt and verification
| First computed | 2026-05-17T23:39:10.064483Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
1b3cbbe7bdd278ba91aae5f3b4527b7dc709cbc0b7f29ee7b798669a79dc29be
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/DM6LXZ552J4LVENK4XZ3IUT3PX \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1b3cbbe7bdd278ba91aae5f3b4527b7dc709cbc0b7f29ee7b798669a79dc29be
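The Python stage of the one-liner above can be unpacked into a small function, which makes the canonicalization rule easier to see: keys are sorted, separators are compact, non-ASCII characters are left unescaped, and the UTF-8 bytes are hashed with SHA-256. This is a sketch of that same serialization applied to a made-up toy record; the function names `canonical_bytes` and `canonical_sha256` are hypothetical, and nothing here confirms that a given served record reproduces the canonical hash.

```python
import hashlib
import json

def canonical_bytes(record):
    """Serialize with sorted keys, compact separators and unescaped UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode()  # UTF-8 by default

def canonical_sha256(record):
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()

# Toy records (illustrative only): same content, different key order.
toy = {"b": 1, "a": {"d": [2, 3], "c": "é"}}
reordered = {"a": {"c": "é", "d": [2, 3]}, "b": 1}

print(canonical_bytes(toy).decode())
print(canonical_sha256(toy) == canonical_sha256(reordered))
```

Because key order and whitespace are normalized before hashing, two parties who parse the same JSON record should arrive at the same digest regardless of how the record was pretty-printed.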
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f9eee20125e076820ee708c6ede9be132a01769f7dc8b5e52a5020269aa707b8",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T03:12:29Z",
    "title_canon_sha256": "9ddb57cf15e2a03801f6e7d72fc85a4e06a4675eca41fe42b7bf00f536a27170"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14304",
    "kind": "arxiv",
    "version": 1
  }
}