pith:SOUJZI63
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
Optimizing in a diffusion policy's latent noise space enables sample-efficient autonomous robotic adaptation without altering model weights.
arxiv:2506.15799 v2 · 2025-06-18 · cs.RO · cs.LG
Claims
We show that DSRL is highly sample efficient, requires only black-box access to the BC policy, and enables effective real-world autonomous policy improvement.
The approach assumes that optimizing actions via RL in the diffusion model's latent noise space produces meaningful policy improvements without access to model gradients or internal weights, and that this optimization remains stable across real-world robotic tasks.
DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
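To make the idea of steering through latent noise concrete, here is a minimal toy sketch. It is not the paper's algorithm: `frozen_policy`, `reward`, and `steer_noise` are hypothetical names, the "policy" is a fixed scalar function standing in for a pretrained diffusion policy, and a simple cross-entropy search over one fixed state is swapped in for the RL method. The only point it illustrates is that the optimizer calls the policy as a black box and changes the noise input, never the policy itself.

```python
import math
import random
import statistics


def frozen_policy(state: float, noise: float) -> float:
    """Hypothetical stand-in for a pretrained policy: a fixed map from
    (state, latent noise) to an action. The optimizer never edits it."""
    return math.tanh(0.5 * state + 1.5 * noise)


def reward(action: float, target: float = -0.3) -> float:
    """Toy task reward (an assumption): highest when the action hits the target."""
    return -(action - target) ** 2


def steer_noise(state: float, iters: int = 40, pop: int = 16,
                n_elite: int = 4, seed: int = 0) -> float:
    """Cross-entropy search over the noise input, using only policy outputs."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # start from the standard normal noise prior
    for _ in range(iters):
        samples = [rng.gauss(mean, std) for _ in range(pop)]
        # Rank candidate noises by the reward of the action they produce.
        samples.sort(key=lambda w: reward(frozen_policy(state, w)), reverse=True)
        elite = samples[:n_elite]
        mean = statistics.fmean(elite)
        std = max(statistics.pstdev(elite), 0.05)  # floor keeps exploration alive
    return mean


state = 1.0
w = steer_noise(state)
print("unsteered reward:", reward(frozen_policy(state, 0.0)))
print("steered reward:  ", reward(frozen_policy(state, w)))
```

A state-conditioned noise policy trained with RL, as the claims describe, would replace the per-state search here; the sketch keeps only the black-box interface.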
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:12.935251Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
93a89ca3db12c6aa25981bd7624772d299dd180391dde7212029dd62572bc52f
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/SOUJZI63CLDKUJMYDPLWER3S2K \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 93a89ca3db12c6aa25981bd7624772d299dd180391dde7212029dd62572bc52f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "663649781b6ec4cce2ed62242e79588a55edf61a4d2087d98728585e10f0b98a",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2025-06-18T18:35:57Z",
    "title_canon_sha256": "123eb9f732b785ef2fd3accc2cdd00a4ee02ede57d79f226477b9bc4cf77dffd"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2506.15799",
    "kind": "arxiv",
    "version": 2
  }
}
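The same check can be sketched offline in Python, without curl or jq, by hashing the record shown above. This mirrors the serialization used in the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8). The names `canonical_sha256`, `record`, and `EXPECTED` are illustrative, and the script assumes the record printed on this page is identical to the `canonical_record` the API returns, so it reports the comparison rather than asserting it.

```python
import hashlib
import json

# Canonical hash listed on this page.
EXPECTED = "93a89ca3db12c6aa25981bd7624772d299dd180391dde7212029dd62572bc52f"

# Record copied from the "Canonical record JSON" section (assumed to match
# what the API serves under .canonical_record).
record = {
    "metadata": {
        "abstract_canon_sha256": "663649781b6ec4cce2ed62242e79588a55edf61a4d2087d98728585e10f0b98a",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2025-06-18T18:35:57Z",
        "title_canon_sha256": "123eb9f732b785ef2fd3accc2cdd00a4ee02ede57d79f226477b9bc4cf77dffd",
    },
    "schema_version": "1.0",
    "source": {"id": "2506.15799", "kind": "arxiv", "version": 2},
}


def canonical_sha256(obj) -> str:
    """SHA-256 of the compact, key-sorted JSON encoding of obj."""
    encoded = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches listed hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order or indentation in which the record is displayed.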