arxiv: 2606.05660 · v1 · pith:BWPZF3LInew · submitted 2026-06-04 · 💻 cs.RO · cs.AI

Safe Embodied AI for Long-horizon Tasks: A Cross-layer Analysis of Robotic Manipulation

Pith reviewed 2026-06-28 01:44 UTC · model grok-4.3

classification 💻 cs.RO cs.AI
keywords embodied AI safety · long-horizon robotic manipulation · planning-time safety · policy-time safety · execution-time safety · cross-layer analysis · safety benchmarks

The pith

A survey organizes safety literature for long-horizon robotic manipulation by planning-time, policy-time, and execution-time intervention loci and weighs the evidence supporting each approach.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper structures existing work on safe embodied AI around three intervention points in the control loop and evaluates whether each area rests on formal proofs, statistical results, or informal heuristics. It treats long-horizon manipulation as a revealing test case because semantic errors, subtask drift, and physical contact risks can compound inside one closed system. The review separates backbone capability papers from direct safety mechanisms and benchmark studies, then flags where evidence is thin and where it is stronger.

Core claim

Safety research for long-horizon robotic manipulation can be organized by intervention locus into planning-time, policy-time, and execution-time methods; each locus supplies distinct kinds of evidence (formal guarantees, statistical support, or empirical heuristics), and current coverage is uneven with notable gaps in policy-time safety, formal support for contact-rich tasks, and manipulation-specific benchmarks.

What carries the argument

Intervention-locus taxonomy that partitions safety mechanisms into planning-time, policy-time, and execution-time categories and classifies supporting evidence by strength.
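As a rough illustration of how such a taxonomy could be represented, the sketch below models the two axes named above (intervention locus and evidence strength) as enumerations and counts how many entries fall into each cell. The class names, the `coverage` and `gaps` helpers, and the placeholder entries are all assumptions made for this example; none of them come from the paper, and the placeholder entries do not correspond to real works.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Locus(Enum):
    """Where in the control loop a safety mechanism intervenes."""
    PLANNING = "planning-time"
    POLICY = "policy-time"
    EXECUTION = "execution-time"


class Evidence(Enum):
    """Kind of support a safety claim rests on."""
    FORMAL = "formal guarantee"
    STATISTICAL = "statistical support"
    HEURISTIC = "empirical heuristic"


@dataclass(frozen=True)
class Entry:
    """One surveyed work, tagged on both axes (hypothetical schema)."""
    label: str
    locus: Locus
    evidence: Evidence


def coverage(entries):
    """Count entries in every (locus, evidence) cell, including empty ones."""
    counts = Counter((e.locus, e.evidence) for e in entries)
    return {(loc, ev): counts.get((loc, ev), 0)
            for loc in Locus for ev in Evidence}


def gaps(entries):
    """Return the cells that no entry covers."""
    return [cell for cell, n in coverage(entries).items() if n == 0]


# Placeholder entries for illustration only; not taken from the survey.
entries = [
    Entry("placeholder-A", Locus.PLANNING, Evidence.FORMAL),
    Entry("placeholder-B", Locus.EXECUTION, Evidence.HEURISTIC),
]

for locus, evidence in gaps(entries):
    print(f"no entries: {locus.value} / {evidence.value}")
```

Treating the taxonomy as a grid makes uneven coverage visible as empty or sparse cells, which is the kind of gap analysis the survey describes in prose.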

If this is right

  • Planning-time methods currently rest on more formal guarantees than the other two loci.
  • Policy-time safety remains the least supported by direct evidence.
  • Contact-rich long-horizon tasks lack formal safety arguments.
  • Uncertainty-triggered interventions are still immature.
  • Few benchmarks are tailored to manipulation safety.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Cross-layer assurance would require coordinated guarantees that span all three loci rather than isolated fixes at one layer.
  • Evaluation design could shift from task-success metrics toward explicit measurement of error accumulation across subtasks.
  • Safer real-world deployment would benefit from benchmarks that include contact-rich failure modes and recovery from semantic misgrounding.

Load-bearing premise

Long-horizon robotic manipulation is representative enough that patterns observed in its safety literature generalize to other embodied AI settings.

What would settle it

A new comprehensive benchmark that shows policy-time safety methods supply stronger formal or statistical evidence than planning-time or execution-time methods across multiple long-horizon manipulation tasks.

Original abstract

Embodied AI systems are increasingly expected to reason and act over extended horizons in physical environments. This growing capability brings safety to the foreground, because failures in the physical world can harm people, damage objects, and disrupt workplaces. Although safe embodied AI has attracted substantial attention, the literature remains fragmented across planning, policy design, and runtime execution. Long-horizon robotic manipulation is a particularly revealing anchor domain for this problem because semantic misgrounding, subtask-level error propagation, execution drift, and contact-rich physical risk can accumulate within the same closed-loop system. This survey therefore provides a structured review of safety in long-horizon robotic manipulation from an embodied AI perspective. We organize the literature by intervention locus, covering planning-time, policy-time, and execution-time safety, and we analyze the strength of the evidence that each line of work provides, distinguishing formal guarantees, statistical support, and empirical safety heuristics. This framework clarifies the distinct roles of backbone capability papers, direct safety mechanisms, and benchmark or evaluation studies, while exposing where current safety claims are well supported and where they remain indirect. We identify persistent gaps, including limited evidence for policy-time safety, weak formal support for contact-rich long-horizon manipulation, immature uncertainty-triggered intervention, and a shortage of manipulation-specific safety benchmarks. We conclude by outlining research directions for cross-layer assurance, evaluation design, and safer deployment of long-horizon robotic agents in real-world settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a survey that provides a structured review of safety in long-horizon robotic manipulation from an embodied AI perspective. It organizes the literature by intervention locus into planning-time, policy-time, and execution-time safety; distinguishes formal guarantees, statistical support, and empirical heuristics; and clarifies the roles of backbone capability papers, direct safety mechanisms, and benchmarks. It identifies gaps such as limited policy-time evidence, weak formal support for contact-rich manipulation, immature uncertainty-triggered interventions, and a shortage of manipulation-specific benchmarks, and it outlines directions for cross-layer assurance and evaluation design.

Significance. If the organization and evidence-strength distinctions hold, the survey could serve as a useful reference framework for the field by exposing where safety claims rest on strong versus indirect support and by highlighting persistent gaps in long-horizon embodied manipulation, thereby guiding more targeted research on cross-layer safety.

minor comments (2)
  1. Abstract and introduction use 'cross-layer analysis' and 'intervention locus' interchangeably; a brief explicit mapping between these terms in §1 would improve clarity.
  2. The claim of 'persistent gaps' (limited policy-time evidence, etc.) would benefit from one or two concrete citation examples per gap to allow readers to verify the assessment without consulting the full reference list.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our survey and the recommendation for minor revision. We are pleased that the organization by intervention locus (planning-time, policy-time, execution-time) and the distinctions among formal guarantees, statistical support, and empirical heuristics are recognized as potentially useful for exposing gaps in long-horizon embodied manipulation safety.

Circularity Check

0 steps flagged

No significant circularity

This is a survey paper whose central contribution is to organize and assess existing literature on safety in long-horizon robotic manipulation by intervention locus (planning-time, policy-time, execution-time). No mathematical derivations, equations, fitted parameters, or predictions appear in the manuscript. The organization and evidence assessment are delivered directly by the paper's own structure and content, and no claim reduces to a self-citation chain or to a definitional input. The work therefore raises no circularity concern.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a literature survey that does not introduce new parameters, axioms, or entities; it reviews prior work.

pith-pipeline@v0.9.1-grok · 5808 in / 947 out tokens · 33985 ms · 2026-06-28T01:44:35.200552+00:00 · methodology

