Pith Number

pith:5UEZA2UH

pith:2026:5UEZA2UHFCXB5IND22XSI5Q56F
not attested · not anchored · not stored · refs resolved

Residual Reinforcement Learning for Robot Teleoperation under Stochastic Delays

Kaize Deng, Zewen Yang

An LSTM state estimator paired with residual RL produces stable robot teleoperation under stochastic delays.

arxiv:2605.15480 v1 · 2026-05-14 · cs.RO · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{5UEZA2UHFCXB5IND22XSI5Q56F}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Experimental validation on Franka Panda robots demonstrates that our approach significantly outperforms the state-of-the-art baselines, ensuring robust and stable teleoperation even under high-variance stochastic delays.

C2 weakest assumption

The LSTM can reliably reconstruct smooth, continuous state estimates from delayed and discontinuous observations in a way that does not introduce errors large enough to destabilize the residual RL policy or degrade overall control performance.

C3 one line summary

An LSTM state estimator paired with a residual RL policy enables robust robot teleoperation under stochastic delays by reconstructing continuous states and learning compensatory torques, outperforming baselines on Franka Panda robots.
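
The two ideas in this summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hold-last-sample receiver stands in for the raw delayed observation stream that the paper's LSTM is said to smooth, and `delayed_stream`, `residual_torque`, the PD gains, and the clip limit are all hypothetical names and values chosen for illustration. The first function shows why stochastic delays make the received signal stale and discontinuous; the second shows the usual residual pattern, where a learned correction is added to a base controller's torque.

```python
import random


def delayed_stream(samples, max_delay, rng):
    """Send one sample per step with a random delay in [0, max_delay] steps.

    Returns what the receiver holds at each step: the most recently *sent*
    sample among those that have arrived, or None if nothing has arrived yet.
    """
    arrivals = {}  # arrival step -> list of (sent step, value)
    for t, x in enumerate(samples):
        arrivals.setdefault(t + rng.randint(0, max_delay), []).append((t, x))

    seen, newest = [], None
    for t in range(len(samples)):
        for sent, x in arrivals.get(t, []):
            # Out-of-order packets older than the held sample are dropped.
            if newest is None or sent > newest[0]:
                newest = (sent, x)
        seen.append(None if newest is None else newest[1])
    return seen


def residual_torque(q_est, q_ref, dq_est, residual, kp=20.0, kd=2.0, limit=5.0):
    """Base PD torque plus a clipped residual (gains and limit are assumed)."""
    base = kp * (q_ref - q_est) - kd * dq_est
    clipped = max(-limit, min(limit, residual))
    return base + clipped


# Stale values repeat and then jump: the discontinuity an estimator must smooth.
print(delayed_stream([0.0, 0.1, 0.2, 0.3, 0.4, 0.5], max_delay=2, rng=random.Random(0)))
# The residual term would come from the learned policy; here it is a constant.
print(residual_torque(q_est=0.0, q_ref=1.0, dq_est=0.0, residual=0.3))
```

Clipping the residual is one common way to bound how far a learned correction can push the base controller; whether the paper does this is not stated in this record.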

References

14 extracted · 14 resolved · 0 Pith anchors

[1] Barde, P., Roy, J., de La Saulece, É., Calauzènes, C., and Moinard, V. (2020). At human speed: Deep reinforcement learning with action delay. In International Conference on Learning Representations 2020
[2] Choi, P.J., Oskouian, R.J., and Tubbs, R.S. (2018). Telesurgery: Past, Present, and Future. Cureus, 10(5) 2018
[3] Huang, B., Gong, Y., Yang, Z., Ren, T., and Figueredo, L. (2026). Contact-Safe Reinforcement Learning with ProMP Reparameterization and Energy Awareness 2026
[4] Huang, J., Chen, J., and Sun, C. (2019). Reinforcement learning in robotic teleoperation with time delay: A survey. Annual Reviews in Control, 48, 189–203. doi: 10.1016/j.arcontrol.2019.06.005 2019 · doi:10.1016/j.arcontrol.2019.06.005
[5] Residual Reinforcement Learning for Robot Control 2019 · doi:10.1109/icra.2019.8794127

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:01:00.710865Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

ed09906a8728ae1ea1a3d6af24761df1698479b95e2a6edfb02cb17222fc0d15

Aliases

arxiv: 2605.15480 · arxiv_version: 2605.15480v1 · doi: 10.48550/arxiv.2605.15480 · pith_short_12: 5UEZA2UHFCXB · pith_short_16: 5UEZA2UHFCXB5IND · pith_short_8: 5UEZA2UH
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/5UEZA2UHFCXB5IND22XSI5Q56F \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ed09906a8728ae1ea1a3d6af24761df1698479b95e2a6edfb02cb17222fc0d15
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "4c7c11da28e3709e7c21c8d78e8c32f40af3091d5278f9da7f3cf61d0ad606f3",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2026-05-14T23:45:59Z",
    "title_canon_sha256": "31cef72b916e8610ed3473378c8a259332185108d2e9fea2c7dd336baecb30c1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15480",
    "kind": "arxiv",
    "version": 1
  }
}
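
The record above can also be hashed offline. The following Python sketch applies the same serialization as the one-liner in the verification command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8, SHA-256) to a local copy of the JSON. The function name `canonical_hash` is hypothetical, and whether the printed digest equals the canonical hash listed earlier depends on the served record matching this copy exactly, which has not been checked here.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys and no extra whitespace."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Local copy of the canonical record shown above.
record = {
    "metadata": {
        "abstract_canon_sha256": "4c7c11da28e3709e7c21c8d78e8c32f40af3091d5278f9da7f3cf61d0ad606f3",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2026-05-14T23:45:59Z",
        "title_canon_sha256": "31cef72b916e8610ed3473378c8a259332185108d2e9fea2c7dd336baecb30c1",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15480", "kind": "arxiv", "version": 1},
}

print(canonical_hash(record))
```

Because keys are sorted before hashing, the digest does not depend on the key order in the source file; any change to a value, including whitespace inside a string, produces a different digest.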