pith. machine review for the scientific record.

arxiv: 2604.20193 · v1 · submitted 2026-04-22 · 💻 cs.RO

Recognition: unknown

LLM-Guided Safety Agent for Edge Robotics with an ISO-Compliant Perception-Compute-Control Architecture

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 00:48 UTC · model grok-4.3

classification 💻 cs.RO
keywords LLM-guided safety · edge robotics · ISO 13849 · functional safety · dual-modular redundancy · human-robot interaction · safety predicates · perception-compute-control

The pith

An LLM safety agent with a redundant architecture enables ISO 13849-compliant edge robotics on cost-effective hardware.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to resolve the tension between probabilistic AI perception in robots and the need for deterministic safety in industrial applications. By using large language models to convert natural-language safety regulations into executable predicates, the system deploys these in a low-latency architecture with dual redundancy. This setup runs on affordable edge hardware while maintaining fault tolerance through parallel independent paths for perception, computation, and control. Testing in human-robot scenarios indicates it can satisfy ISO 13849 Category 3 and Performance Level d requirements. If correct, this opens the door to deploying advanced AI robots safely in factories without prohibitive costs.
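A minimal sketch of what such a translated predicate could look like, assuming a Python runtime; the rule text, thresholds, and all names below are hypothetical illustrations, not taken from the paper:

```python
# Illustrative only: the kind of executable predicate an LLM might emit
# from a natural-language rule such as "stop the robot if a person is
# within 0.5 m while it moves faster than 0.25 m/s".
from dataclasses import dataclass

@dataclass(frozen=True)
class WorldState:
    min_human_distance_m: float  # closest detected person, from perception
    tcp_speed_mps: float         # tool-center-point speed, from control

MIN_SEPARATION_M = 0.5   # hypothetical limit
MAX_SPEED_MPS = 0.25     # hypothetical limit

def separation_predicate(state: WorldState) -> bool:
    """True iff the state satisfies the safety rule (safe to continue)."""
    if state.min_human_distance_m < MIN_SEPARATION_M:
        return state.tcp_speed_mps <= MAX_SPEED_MPS
    return True
```

A runtime of this shape would evaluate every active predicate each control cycle and command a safe stop as soon as any predicate returns False.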

Core claim

We present an LLM-guided safety agent for edge robotics, built on an ISO-compliant low-latency perception-compute-control architecture. Our method translates natural-language safety regulations into executable predicates and deploys them through a redundant heterogeneous edge runtime. For fault-tolerant closed-loop execution under edge constraints, we adopt a symmetric dual-modular redundancy design with parallel independent execution for low-latency perception, computation, and control. We prototype the system on a dual-RK3588 platform and evaluate it in representative human-robot interaction scenarios. The results demonstrate a practical edge implementation path toward ISO 13849 Category 3 and PL d using cost-effective hardware.

What carries the argument

The LLM-guided translation of safety regulations into executable predicates within a symmetric dual-modular redundant perception-compute-control architecture.

If this is right

  • Achieves practical compliance with ISO 13849 Category 3 and PL d using cost-effective hardware.
  • Enables low-latency fault-tolerant closed-loop control in human-robot interaction.
  • Supports deterministic safety predicates despite probabilistic perception.
  • Deploys via parallel independent execution paths on redundant heterogeneous edge runtimes.
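The redundancy claim rests on two independent channels cross-checking each other. A minimal sketch of the one-out-of-two (1oo2) fail-safe vote such dual-modular designs generally imply, with all names hypothetical (the paper's exact voting logic is not specified in the text above):

```python
# Sketch of fail-safe two-channel voting in a dual-modular redundancy
# design: each channel independently evaluates the safety predicates,
# and any disagreement is treated as a detected fault.

def vote_1oo2(channel_a_safe: bool, channel_b_safe: bool) -> bool:
    """Permit motion only when both independent channels agree it is safe.

    A single dangerous channel failure (one channel wrongly reporting
    'safe') is masked, because the other channel still forces a stop.
    """
    return channel_a_safe and channel_b_safe

def detected_fault(channel_a_safe: bool, channel_b_safe: bool) -> bool:
    """Cross-monitoring: a channel mismatch is flagged for diagnostics."""
    return channel_a_safe != channel_b_safe
```

This is the property that makes Category 3 plausible: no single fault in one channel can, by itself, lead to the loss of the safety function.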

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The predicate-translation method could extend to other regulatory domains such as automotive or medical device safety standards.
  • The dual-redundancy design on commodity hardware might lower barriers for smaller teams building safety-critical systems.
  • Longer-term testing could examine how well the architecture handles regulatory updates without full system redesign.
  • The approach invites comparison with purely rule-based or formal-verification methods for the same safety goals.

Load-bearing premise

An LLM can reliably translate natural-language safety regulations into executable predicates that enforce deterministic behavior despite the probabilistic nature of AI perception.

What would settle it

A test showing that the generated predicates allow unsafe robot actions in a human-interaction scenario or that the redundant system fails to meet the response-time criteria for Performance Level d would disprove the central claim.
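Sketched as a test harness, with every name, threshold, and stand-in function hypothetical, that disproof would look roughly like:

```python
# Hypothetical falsification trial: feed the deployed predicate set a
# labeled unsafe scenario and time the safety reaction. Either an
# allowed unsafe action or a missed reaction deadline would falsify
# the central claim.
import time

RESPONSE_BUDGET_S = 0.050  # hypothetical reaction-time budget for PL d

def safe_to_move(min_human_distance_m: float) -> bool:
    # stand-in for the LLM-generated predicate set under test
    return min_human_distance_m >= 0.5

def falsification_trial(unsafe_distance_m: float) -> dict:
    t0 = time.perf_counter()
    allowed = safe_to_move(unsafe_distance_m)
    latency_s = time.perf_counter() - t0
    return {
        "unsafe_action_allowed": allowed,                 # must be False
        "deadline_missed": latency_s > RESPONSE_BUDGET_S,  # must be False
    }
```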

Figures

Figures reproduced from arXiv: 2604.20193 by Anyang Liang, Chen Qian, Huayu Zhang, Lu Cheng, Ruofan Zhang, Sheng Yin, Xiaoyun Yuan, Xu Huang, Yin Zhou, Yuan Cheng, Yuefeng Song.

Figure 1: Real-world deployment of the proposed LLM … [figure omitted]
Figure 2: The hierarchical framework of the safety agent. Step 1: LLM-based safety formalization for constraint extraction … [figure omitted]
Figure 3: Hardware architecture of the integrated PPC system. The design features a symmetric dual-node redundancy centered … [figure omitted]
Figure 4: Visual representation of the three experimental scenarios utilized in the performance evaluation, with corresponding … [figure omitted]
read the original abstract

Ensuring functional safety in human-robot interaction is challenging because AI perception is inherently probabilistic, whereas industrial standards require deterministic behavior. We present an LLM-guided safety agent for edge robotics, built on an ISO-compliant low-latency perception-compute-control architecture. Our method translates natural-language safety regulations into executable predicates and deploys them through a redundant heterogeneous edge runtime. For fault-tolerant closed-loop execution under edge constraints, we adopt a symmetric dual-modular redundancy design with parallel independent execution for low-latency perception, computation, and control. We prototype the system on a dual-RK3588 platform and evaluate it in representative human-robot interaction scenarios. The results demonstrate a practical edge implementation path toward ISO 13849 Category 3 and PL d using cost-effective hardware, supporting practical deployment of safety-critical embodied AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper presents an LLM-guided safety agent for edge robotics built on an ISO-compliant low-latency perception-compute-control architecture. It translates natural-language safety regulations into executable predicates, deploys them via a symmetric dual-modular redundancy design with parallel independent execution on heterogeneous edge hardware, prototypes the system on a dual-RK3588 platform, and evaluates it in representative human-robot interaction scenarios. The central claim is that these results demonstrate a practical edge implementation path toward ISO 13849 Category 3 and PL d using cost-effective hardware.

Significance. If the quantitative safety analysis were supplied, the work would offer a concrete engineering contribution to reconciling probabilistic AI perception with deterministic industrial safety standards, supporting practical deployment of safety-critical embodied AI on affordable platforms.

major comments (1)
  1. [Prototype results section] The evaluation reports only scenario-based testing of the perception-compute-control loop and supplies no calculated MTTFd, DCavg, or PFHd values for the symmetric dual-modular redundancy, no fault-injection results showing dangerous failure rates, and no explicit mapping of the LLM-derived predicates to the safety requirements of ISO 13849 Category 3 / PL d. Without at least one of these, the central claim that the prototype demonstrates a standards-compliant implementation path remains unsupported.
minor comments (1)
  1. [Abstract] The sentence asserting that 'evaluation results demonstrate the implementation path' would be strengthened by an explicit cross-reference to the quantitative metrics or tables that would appear after revision.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback emphasizing the importance of quantitative safety metrics. We address the concern regarding the prototype results section while clarifying the scope and contributions of our work.

read point-by-point responses
  1. Referee: [Prototype results section] The evaluation reports only scenario-based testing of the perception-compute-control loop and supplies no calculated MTTFd, DCavg, or PFHd values for the symmetric dual-modular redundancy, no fault-injection results showing dangerous failure rates, and no explicit mapping of the LLM-derived predicates to the safety requirements of ISO 13849 Category 3 / PL d. Without at least one of these, the central claim that the prototype demonstrates a standards-compliant implementation path remains unsupported.

    Authors: We agree that the current prototype results section presents only scenario-based testing of the perception-compute-control loop and does not include calculated MTTFd, DCavg, or PFHd values, fault-injection results, or an explicit mapping of the LLM-derived predicates to ISO 13849 Category 3 / PL d requirements. The manuscript's primary contribution is the LLM-guided translation of natural-language safety regulations into executable predicates deployed via symmetric dual-modular redundancy on heterogeneous edge hardware, with the dual-RK3588 prototype demonstrating low-latency closed-loop operation in representative human-robot interaction scenarios. This architecture is explicitly designed to support the diagnostic coverage and redundancy needed for Category 3, providing a practical implementation path on cost-effective platforms rather than claiming full certification. To strengthen the central claim, we will revise the prototype results section to add an explicit mapping of the safety predicates to the relevant ISO 13849 safety requirements and include estimated PFHd values based on published reliability data for the RK3588 SoCs under the dual-modular configuration. Full fault-injection campaigns to measure dangerous failure rates are outside the scope of this prototype demonstration but are compatible with the architecture; we will explicitly note this as a limitation and direction for future work. These changes will better substantiate the standards-compliant path without overstating the current evaluation.

    revision: partial

Circularity Check

0 steps flagged

No significant circularity; engineering prototype with independent scenario evaluation

full rationale

The paper describes an LLM-guided safety agent implemented on a dual-RK3588 edge platform using symmetric dual-modular redundancy. Its central claim rests on scenario-based testing of the perception-compute-control loop rather than any derivation, fitted parameters, or self-referential definitions. No equations, ansatzes, uniqueness theorems, or self-citations appear in the provided text as load-bearing steps that reduce the result to its inputs by construction. The work is framed as a practical implementation path and remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on the unverified effectiveness of LLM translation for safety predicates and the assumption that dual-modular redundancy delivers the required fault tolerance and determinism on edge hardware; no independent evidence or falsifiable predictions are supplied in the abstract.

axioms (2)
  • domain assumption LLM can translate natural-language safety regulations into correct executable predicates that ensure deterministic safety behavior
    Invoked as the core method for bridging probabilistic AI with deterministic standards.
  • domain assumption Symmetric dual-modular redundancy with parallel independent execution guarantees fault-tolerant closed-loop control under edge constraints
    Adopted explicitly for the low-latency perception-compute-control architecture.

pith-pipeline@v0.9.0 · 5465 in / 1324 out tokens · 52147 ms · 2026-05-10T00:48:10.449476+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

28 extracted references · 3 canonical work pages · 1 internal anchor

  1. [1]

    Andra Acsintoae, Andrei Florescu, Mariana-Iuliana Georgescu, Tudor Mare, Paul Sumedrea, Radu Tudor Ionescu, Fahad Shahbaz Khan, and Mubarak Shah. 2022. Ubnormal: New benchmark for supervised open-set video anomaly detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 20143–20153

  2. [2]

    Lina María Amaya-Mejía, Nicolás Duque-Suárez, Daniel Jaramillo-Ramírez, and Carol Martinez. 2022. Vision-based safety system for barrierless human-robot collaboration. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 7331–7336

  3. [3]

    Algirdas Avizienis, J-C Laprie, Brian Randell, and Carl Landwehr. 2004. Basic concepts and taxonomy of dependable and secure computing. IEEE Transactions on Dependable and Secure Computing 1, 1 (2004), 11–33

  4. [4]

    Lukas Brunke, Yanni Zhang, Ralf Römer, Jack Naimer, Nikola Staykov, Siqi Zhou, and Angela P Schoellig. 2025. Semantically safe robot manipulation: From semantic scene understanding to motion safeguards. IEEE Robotics and Automation Letters (2025)

  5. [5]

    Yong Shean Chong and Yong Haur Tay. 2017. Abnormal event detection in videos using spatiotemporal autoencoder. In International Symposium on Neural Networks. Springer, 189–196

  6. [6]

    Gautam Varma Datla, Anudeep Vurity, Tejaswani Dash, Tazeem Ahmad, Mohd Adnan, and Saima Rafi. 2025. Executable Governance for AI: Translating Policies into Rules Using LLMs. arXiv preprint arXiv:2512.04408 (2025)

  7. [7]

    Mandeep Dhanda, Benedict Alexander Rogers, Stephanie Hall, Elies Dekoninck, and Vimal Dhokia. 2025. Reviewing human-robot collaboration in manufacturing: Opportunities and challenges in the context of industry 5.0. Robotics and Computer-Integrated Manufacturing 93 (2025), 102937

  8. [8–9]

    Madeline Endres, Sarah Fakhoury, Saikat Chakraborty, and Shuvendu K Lahiri. 2024. Can large language models transform natural language intent into formal method postconditions? Proceedings of the ACM on Software Engineering 1, FSE (2024), 1889–1912

  10. [10]

    Vincenzo Gervasi, Alessio Ferrari, Didar Zowghi, and Paola Spoletini. 2019. Ambiguity in requirements engineering: Towards a unifying framework. In From Software Engineering to Formal Methods and Tools, and Back: Essays Dedicated to Stefania Gnesi on the Occasion of Her 65th Birthday. Springer, 191–210

  11. [11]

    Niklas Grambow, Lisa-Marie Fenner, Felipe Kempkes, Philip Hotz, Dingyuan Wan, Jörg Krüger, and Kevin Haninger. 2026. Anomaly detection for generic failure monitoring in robotic assembly, screwing and manipulation. IEEE Robotics and Automation Letters (2026)

  12. [12]

    International Organization for Standardization. 2023. ISO 13849-1:2023 Safety of Machinery — Safety-related Parts of Control Systems — Part 1: General Principles for Design. Standard ISO 13849-1:2023. International Organization for Standardization

  13. [13]

    Jiewu Leng, Weinan Sha, Baicun Wang, Pai Zheng, Cunbo Zhuang, Qiang Liu, Thorsten Wuest, Dimitris Mourtzis, and Lihui Wang. 2022. Industry 5.0: Prospect and retrospect. Journal of Manufacturing Systems 65 (2022), 279–295

  14. [14]

    Jinfan Liu, Yichao Yan, Junjie Li, Weiming Zhao, Pengzhi Chu, Xingdong Sheng, Yunhui Liu, and Xiaokang Yang. 2024. Ipad: Industrial process anomaly detection dataset. IEEE Transactions on Circuits and Systems for Video Technology 35, 1 (2024), 380–393

  15. [15]

    Weixin Luo, Wen Liu, and Shenghua Gao. 2017. A revisit of sparse coding based anomaly detection in stacked rnn framework. In Proceedings of the IEEE International Conference on Computer Vision. 341–349

  16. [16]

    Sotiris Makris. 2020. Dynamic Safety Zones in Human Robot Collaboration. In Cooperating Robots for Flexible Manufacturing. Springer, 271–287

  17. [17]

    Baoluo Meng, Robert Lorch, Kit Siu, Michael Durling, Sarat Chandra Varanasi, Saswata Paul, and Abha Moitra. 2026. Transforming Natural Language Requirements to Formalism Using LLMs. Systems Engineering 29, 2 (2026), 195–204

  18. [18]

    Romero Morais, Vuong Le, Truyen Tran, Budhaditya Saha, Moussa Mansour, and Svetha Venkatesh. 2019. Learning regularity in skeleton trajectories for anomaly detection in videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 11996–12004

  19. [19]

    Ike Obi, Vishnunandan LN Venkatesh, Weizheng Wang, Ruiqi Wang, Dayoon Suh, Temitope I Amosa, Wonse Jo, and Byung-Cheol Min. 2026. Pre-Execution Safety Gate & Task Safety Contracts for LLM-Controlled Robot Systems. arXiv preprint arXiv:2604.05427 (2026)

  20. [20–21]

    Fenglian Pan, Yinwei Zhang, Jian Liu, Larry Head, Maria Elli, and Ignacio Alvarez. 2024. Reliability modeling for perception systems in autonomous vehicles: A recursive event-triggering point process approach. Transportation Research Part C: Emerging Technologies 169 (2024), 104868

  22. [22]

    David Podgorelec, Suzana Uran, Andrej Nerat, Božidar Bratina, Sašo Pečnik, Marjan Dimec, Franc Žaberl, Borut Žalik, and Riko Šafarič. 2023. LiDAR-based maintenance of a safe distance between a human and a robot arm. Sensors 23, 9 (2023), 4305

  23. [23]

    Martin J Rosenstrauch, Tessa J Pannen, and Jörg Krüger. 2018. Human robot collaboration: using kinect v2 for iso/ts 15066 speed and separation monitoring. Procedia CIRP 76 (2018), 183–186

  24. [24]

    Ali Suvizi and Guru Venkataramani. 2025. Auto-Healer: Self-Healing Hardware for Perception Stage Faults in Autonomous Driving Systems. In Proceedings of the 39th ACM International Conference on Supercomputing. 1064–1078

  25. [25]

    Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, et al. 2008. The worst-case execution-time problem—overview of methods and survey of tools. ACM Transactions on Embedded Computing Systems (TECS) 7, 3 (2008), 1–53

  26. [26]

    Feiyu Wu, Xu Zheng, Yue Qu, Zhuocheng Wang, Zicheng Feng, and Hui Li. 2026. Grounding Generative Planners in Verifiable Logic: A Hybrid Architecture for Trustworthy Embodied AI. In The Fourteenth International Conference on Learning Representations. https://openreview.net/forum?id=wb05ver1k8

  27. [27]

    Zhiyi Xue, Xiaohong Chen, and Min Zhang. 2026. Explicating Tacit Regulatory Knowledge from LLMs to Auto-Formalize Requirements for Compliance Test Case Generation. arXiv preprint arXiv:2601.09762 (2026)

  28. [28]

    Ziyi Yang, Shreyas S Raman, Ankit Shah, and Stefanie Tellex. 2024. Plug in the safety chip: Enforcing constraints for llm-driven robot agents. In 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 14435–14442