pith:THM2OTVT
Collaborative Yet Personalized Policy Training: Single-Timescale Federated Actor-Critic
Agents share a linear subspace for collaboration while keeping personalized policies, yielding finite-time convergence rates that scale linearly with the number of agents under single-timescale Markovian updates.
arxiv:2605.14423 v1 · 2026-05-14 · cs.LG · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{THM2OTVTUV4C33HYFVOBJMYKZU}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Under canonical single-timescale updates with Markovian sampling, we establish finite-time convergence via a novel joint linear approximation framework. Specifically, we show that the critic error converges to zero at the rate of Õ(1/((1−γ)^4 √(TK))), and the policy gradient norm converges to zero at the rate of Õ(1/((1−γ)^6 √(TK))), ... These results demonstrate linear speedup with respect to the number of agents K, despite heterogeneous Markovian trajectories under distinct transition kernels and coupled learning dynamics.
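As a rough illustration of the linear-speedup claim, the sketch below evaluates the quoted rates with every constant and logarithmic factor set to 1. That simplification is an assumption made here (the claim states only the order), and the helper names `rate` and `iterations_for` are hypothetical.

```python
import math

def rate(T, K, gamma, power):
    """Order of the bound 1 / ((1 - gamma)^power * sqrt(T * K)).

    power=4 corresponds to the critic error, power=6 to the policy
    gradient norm. Constants and log factors are assumed to be 1.
    """
    return 1.0 / ((1.0 - gamma) ** power * math.sqrt(T * K))

def iterations_for(eps, K, gamma, power):
    """Smallest T such that rate(T, K, gamma, power) <= eps (same assumption)."""
    return math.ceil(1.0 / (K * (eps * (1.0 - gamma) ** power) ** 2))

# Doubling the number of agents K halves the iterations needed for a fixed
# target accuracy, which is what "linear speedup in K" means here.
for K in (1, 2, 4):
    print(K, iterations_for(eps=0.5, K=K, gamma=0.5, power=4))
```

The exponent on (1−γ) only changes the constant in front; the dependence on T and K is the same 1/√(TK) for both the critic and the policy gradient.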
The analysis assumes that a single common linear subspace is expressive enough to capture the shared structure across all agents' heterogeneous environments, with the remaining personalization handled by local heads. It also assumes that the perturbation analysis for projected subspace updates and the conditional mixing arguments for heterogeneous Markovian noise remain valid under the coupled policy-critic dynamics.
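A minimal plain-Python sketch of the shared-subspace structure described above: a shared matrix `B` spans the common subspace and each agent keeps its own head `w_k`. The linear score `phi^T B w_k`, the averaging rule in `aggregate_subspace`, and the use of Gram-Schmidt as the projection step are all illustrative assumptions, not details taken from the paper.

```python
def orthonormalize(B):
    """Gram-Schmidt on the columns of B (d rows, r columns).

    Stands in for a projection back onto orthonormal bases; assumes the
    columns are linearly independent (otherwise a norm would be zero).
    """
    d, r = len(B), len(B[0])
    cols = []
    for j in range(r):
        v = [B[i][j] for i in range(d)]
        for q in cols:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5
        cols.append([a / norm for a in v])
    return [[cols[j][i] for j in range(r)] for i in range(d)]

def local_score(B, w_k, phi):
    """Agent k's linear output phi^T B w_k: shared B, personalized head w_k."""
    r = len(B[0])
    z = [sum(B[i][j] * phi[i] for i in range(len(B))) for j in range(r)]
    return sum(a * b for a, b in zip(z, w_k))

def aggregate_subspace(local_Bs):
    """Hypothetical server step: average the agents' subspace iterates,
    then re-orthonormalize. Heads w_k never leave the agents."""
    K = len(local_Bs)
    d, r = len(local_Bs[0]), len(local_Bs[0][0])
    avg = [[sum(Bk[i][j] for Bk in local_Bs) / K for j in range(r)]
           for i in range(d)]
    return orthonormalize(avg)

# Two agents share B but keep different heads, so their outputs differ.
B = orthonormalize([[1, 1], [0, 1], [0, 0]])
phi = [1.0, 1.0, 5.0]
print(local_score(B, [2.0, 3.0], phi), local_score(B, [1.0, -1.0], phi))
```

Only `B` is communicated in this sketch, which is one way to read "collaborative yet personalized": the subspace is learned jointly, the heads stay local.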
A federated actor-critic framework lets agents share a linear subspace representation for policies while maintaining personalized local actors and critics, achieving critic-error and policy-gradient convergence rates of order 1/√(TK), with linear speedup in the number of agents K, under environment
Receipt and verification
| First computed | 2026-05-17T23:39:07.224863Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
99d9a74eb3a5782decf82d5c14b30acd273b770fff7075175d2fc7f17f6d2c35
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/THM2OTVTUV4C33HYFVOBJMYKZU \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 99d9a74eb3a5782decf82d5c14b30acd273b770fff7075175d2fc7f17f6d2c35
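The same check can be written as a small Python function, which may be easier to reuse than the one-liner. This is a sketch that mirrors the serialization options in the command above (sorted keys, compact separators, UTF-8 without ASCII escaping); the function name `canonical_sha256` is hypothetical, and fetching the record over HTTP is left out.

```python
import hashlib
import json

def canonical_sha256(record):
    """Hash a JSON-compatible object using the same canonical form as the
    shell one-liner: sorted keys, no whitespace, non-ASCII kept as UTF-8."""
    canon = json.dumps(record, sort_keys=True, separators=(",", ":"),
                       ensure_ascii=False).encode()
    return hashlib.sha256(canon).hexdigest()

# Key order in the input does not affect the digest.
a = canonical_sha256({"source": {"id": "2605.14423", "kind": "arxiv"}})
b = canonical_sha256({"source": {"kind": "arxiv", "id": "2605.14423"}})
print(a == b)
```

To verify a record, pass the parsed `canonical_record` object and compare the result with the canonical hash listed on this page.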
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "7f3d656ea630b8a1489a217c6e70ae74211d324fe11b756ff46e72939594c50c",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T06:10:31Z",
    "title_canon_sha256": "ea38118b998e53d9d07f82342507701f103b0b2d618b67e6ca3392c021006205"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14423",
    "kind": "arxiv",
    "version": 1
  }
}