pith:LLYFYPQ7
Proximal Action Replacement for Behavior Cloning Actor-Critic in Offline Reinforcement Learning
Proximal action replacement overcomes the imitation ceiling in BC-regularized actor-critic by substituting suboptimal dataset actions with value-guided improvements.
arxiv:2602.07441 v2 · 2026-02-07 · cs.LG · cs.AI
Record completeness
Claims
PAR consistently improves performance across offline RL benchmarks and approaches state-of-the-art results when simply combined with basic TD3+BC.
When dataset actions are suboptimal, actions generated by the stable target policy, guided by local ascent of the action-value function and bounded by value uncertainty, can be substituted for them without destabilizing training or introducing new bias.
Proximal action replacement breaks the imitation ceiling in BC-regularized offline RL actor-critic by substituting suboptimal dataset actions with value-guided improvements from a stable target policy.
Receipt and verification
| First computed | 2026-05-17T23:39:00.026294Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
5af05c3e1f52817167a217493e018987643dfe7037bdd563807d036431926018
Aliases
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LLYFYPQ7KKAXCZ5CC5ET4AMJQ5 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5af05c3e1f52817167a217493e018987643dfe7037bdd563807d036431926018
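The same check can be sketched as a small Python module, which may be easier to reuse than the shell one-liner. It mirrors the serialization settings used above (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 encoding). The function names `canonical_sha256` and `verify` are illustrative and not part of any Pith API.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record canonically and return its SHA-256 hex digest.

    Uses the same json.dumps settings as the one-liner above:
    sorted keys, no whitespace in separators, ensure_ascii=False.
    """
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(canonical).hexdigest()


def verify(record: dict, expected_hex: str) -> bool:
    """Return True if the record's canonical hash matches expected_hex."""
    return canonical_sha256(record) == expected_hex.lower()
```

To use it, parse the `canonical_record` field of the fetched response with `json.loads` and pass the resulting dict to `verify` together with the canonical hash listed on this page. Because keys are sorted before hashing, the key order in the fetched JSON does not affect the result.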
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "dd379b69b869ef4149d7ec98a96f6edfb06f271348c99998106e1a120b549e75",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-02-07T08:44:27Z",
    "title_canon_sha256": "11ceda1decf65b89cb4513c31144ecd8ad1132fdbf83a6e4a8002b631649229e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2602.07441",
    "kind": "arxiv",
    "version": 2
  }
}