Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution

Cong Yang; John See; Simin Luan; Xue Qin; Zhijun Li

arxiv: 2604.07833 · v3 · pith:RMQ3FXPQnew · submitted 2026-04-09 · 💻 cs.RO

Pith reviewed 2026-05-22 10:37 UTC · model grok-4.3

classification 💻 cs.RO

keywords embodied agents · runtime governance · policy-constrained execution · execution monitoring · robot safety · agent oversight · rollback mechanisms

The pith

Embodied agents gain reliable safety when an external runtime layer handles policy checks and recoveries instead of embedding them in the agent itself.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Embodied agents now execute actions in physical and tool environments rather than only reasoning. The paper establishes that safety and compliance become harder to manage when control logic lives inside the agent's loop, so it moves oversight to a separate runtime governance layer that performs policy checking, capability admission, execution monitoring, rollback, and human overrides. This separation is meant to make governance standardizable, auditable, and adaptable across different agents and policies. A reader would care because agents with real execution power can cause harm or violate rules if runtime drift or unauthorized actions go unchecked. The authors support the approach through randomized simulations showing strong interception and recovery rates that beat baselines.

Core claim

The paper proposes formalizing a control boundary among the embodied agent, Embodied Capability Modules, and an external runtime governance layer, then demonstrates through 1000 randomized trials that this layer intercepts 96.2 percent of unauthorized actions, reduces unsafe continuation from 100 percent to 22.2 percent under drift, and achieves 91.4 percent recovery success with full policy compliance.

What carries the argument

The external runtime governance layer that performs policy checking, capability admission, execution monitoring, rollback handling, and human override while keeping agent cognition separate.

If this is right

Policy violations can be caught before execution without rewriting agent internals.
Systems maintain higher compliance even when the agent's own state drifts during operation.
Recovery mechanisms restore safe execution while preserving full policy adherence.
Governance becomes easier to audit and update independently of the underlying agent.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same external layer could be tested on real hardware to check whether added monitoring increases latency beyond acceptable limits for time-critical tasks.
Similar separation might help non-robotic autonomous systems such as software agents that control cloud resources or financial transactions.
Future experiments could introduce adaptive attackers that try to exploit the governance interface itself.

Load-bearing premise

The randomized simulation trials accurately capture the dynamics, failure modes, and policy interactions of real embodied agents operating in physical environments.

What would settle it

Running the same governance layer on physical robots and measuring interception rates plus recovery success during actual runtime drift and policy violations.
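
A hardware study of this kind would report rates with uncertainty attached. The following is a hedged sketch of how interception or recovery rates could be summarized from per-trial outcomes; the helper names `rate` and `wilson_interval` are hypothetical, the Wilson score interval is a choice made here rather than a method taken from the paper, and the source does not state how many trials sit behind each reported percentage.

```python
import math
from typing import Iterable, Tuple


def rate(outcomes: Iterable[bool]) -> Tuple[int, int, float]:
    """Return (successes, trials, proportion) for a sequence of trial outcomes."""
    results = list(outcomes)
    n = len(results)
    k = sum(results)
    return k, n, (k / n if n else float("nan"))


def wilson_interval(k: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        raise ValueError("need at least one trial")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Example with made-up outcomes: True means the unauthorized action was intercepted.
k, n, p = rate([True, True, False, True])
low, high = wilson_interval(k, n)
```

Reporting the interval alongside the point estimate makes it easier to compare simulated and physical runs that use different trial counts.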

Figures

Figures reproduced from arXiv: 2604.07833 by Cong Yang, John See, Simin Luan, Xue Qin, Zhijun Li.

**Figure 1.** Core idea. (a) In ungoverned execution, the embodied agent directly invokes capabilities without policy checking, runtime monitoring, or recovery support. (b) Our framework interposes a Runtime Governance Layer between agent cognition and physical execution: every capability invocation passes through admission control, policy checking, and execution monitoring, with recovery and human override available a…

**Figure 2.** Runtime governance framework architecture. Stratum 3 (Runtime Governance Layer) mediates between agent cognition and embodied execution through six coordinated components. Solid arrows indicate the forward execution path; dashed arrows indicate feedback, telemetry, and intervention flows.

**Figure 3.** Policy-constrained execution pipeline. The seven stages form a governed lifecycle: agent proposals pass through admission and policy evaluation before monitored execution. Runtime observation may trigger intervention, recovery, or escalation. Dashed arrows indicate feedback, modification, rejection, and replanning paths.

Original abstract

Embodied agents are evolving from passive reasoning systems into active executors that interact with tools, robots, and physical environments. Once granted execution authority, the central challenge becomes how to keep actions governable at runtime. Existing approaches embed safety and recovery logic inside the agent loop, making execution control difficult to standardize, audit, and adapt. This paper argues that embodied intelligence requires not only stronger agents, but stronger runtime governance. We propose a framework for policy-constrained execution that separates agent cognition from execution oversight. Governance is externalized into a dedicated runtime layer performing policy checking, capability admission, execution monitoring, rollback handling, and human override. We formalize the control boundary among the embodied agent, Embodied Capability Modules (ECMs), and runtime governance layer, and validate through 1000 randomized simulation trials across three governance dimensions. Results show 96.2% interception of unauthorized actions, reduction of unsafe continuation from 100% to 22.2% under runtime drift, and 91.4% recovery success with full policy compliance, substantially outperforming all baselines (p<0.001). By reframing runtime governance as a first-class systems problem, this paper positions policy-constrained execution as a key design principle for embodied agent systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper externalizes runtime governance for embodied agents with a clean boundary to ECMs and reports solid simulation metrics, but stays entirely in sim without real-robot grounding.

The main takeaway is that this work treats governance as a separate runtime layer rather than something baked into the agent. They formalize the split between the agent, these Embodied Capability Modules, and the oversight component that handles policy checks, monitoring, and rollback. That separation is the concrete step forward, and the 1000-trial results give numbers to evaluate: 96% interception of bad actions, unsafe continuation down to 22%, and 91% recovery with compliance, all beating baselines at p<0.001.

Referee Report

1 major / 2 minor

Summary. The paper proposes a framework for policy-constrained execution in embodied agents by externalizing governance into a dedicated runtime layer that performs policy checking, capability admission, execution monitoring, rollback handling, and human override. It formalizes the control boundary among the embodied agent, Embodied Capability Modules (ECMs), and the runtime governance layer, and validates the approach through 1000 randomized simulation trials across three governance dimensions. The results report 96.2% interception of unauthorized actions, reduction of unsafe continuation from 100% to 22.2% under runtime drift, and 91.4% recovery success with full policy compliance, substantially outperforming baselines (p<0.001).

Significance. If the simulation faithfully captures physical dynamics, the work could meaningfully advance safety engineering for embodied systems by reframing governance as an external, auditable systems layer rather than an internal agent concern. The quantitative evaluation with 1000 trials and reported statistical significance (p<0.001) is a clear strength that provides concrete, falsifiable metrics for interception, drift recovery, and compliance.

major comments (1)

[Validation section] The section on validation through 1000 randomized simulation trials: the central claims of 96.2% interception, 22.2% unsafe continuation under drift, and 91.4% recovery rest entirely on these trials, yet the manuscript provides no specification of simulator dynamics, noise models, timing constraints, or how the three governance dimensions are implemented and parameterized. Without these details it is not possible to determine whether the reported metrics demonstrate robustness for physical embodied agents or depend on unstated simulation assumptions.

minor comments (2)

[Abstract] The abstract describes the approach as 'substantially outperforming all baselines' but does not name or briefly characterize those baselines; adding one sentence identifying them would improve readability.
[Framework Formalization] The formalization of the agent/ECM/governance boundary would benefit from a diagram or pseudocode snippet illustrating the exact interface points for policy checking and rollback.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and recommendation for major revision. We address the major comment regarding the validation section below and describe the planned changes to the manuscript.

Referee: [Validation section] The section on validation through 1000 randomized simulation trials: the central claims of 96.2% interception, 22.2% unsafe continuation under drift, and 91.4% recovery rest entirely on these trials, yet the manuscript provides no specification of simulator dynamics, noise models, timing constraints, or how the three governance dimensions are implemented and parameterized. Without these details it is not possible to determine whether the reported metrics demonstrate robustness for physical embodied agents or depend on unstated simulation assumptions.

Authors: We agree that the manuscript currently lacks explicit specifications of the simulator dynamics, noise models, timing constraints, and the implementation and parameterization of the three governance dimensions. This limits readers' ability to assess the generalizability of the reported metrics to physical embodied agents. In the revised manuscript we will expand the Validation section with a detailed description of the simulation platform and its physics model, the applied noise models for sensors and actuators, the timing constraints for monitoring and rollback, and the concrete parameterization of each governance dimension (including policy definitions, capability admission rules, and monitoring thresholds). We will also add pseudocode and a table summarizing the experimental parameters to support reproducibility.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results from independent randomized simulations

The paper proposes an external runtime governance framework, formalizes agent/ECM/governance boundaries, and reports performance metrics (96.2% interception, 22.2% unsafe continuation, 91.4% recovery) from 1000 randomized simulation trials across three governance dimensions. These outcomes are generated by external simulation rather than by construction from fitted parameters, self-definitions, or self-citation chains. No equations, uniqueness theorems, or ansatzes are invoked that reduce the central claims to inputs. Design choices for the governance dimensions introduce only mild potential dependency, consistent with a score of 2 and the default expectation that most papers are not circular.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The framework rests on the assumption that runtime enforcement of a clear control boundary is feasible and that simulation parameters can be chosen to represent realistic governance scenarios; ECMs are introduced as a new modular construct without independent external validation.

free parameters (1)

Governance policy thresholds and drift parameters
Specific numerical values for policy checks and runtime drift scenarios are required to produce the reported percentages and are not derived from first principles.
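
To make the role of these free parameters concrete, here is a minimal sketch of a threshold-based drift check. It is an illustration under stated assumptions, not the paper's monitor: the class name `DriftMonitor`, the sliding-window mean of absolute error, and the `threshold` and `window` values are all hypothetical, and the source does not specify its drift model.

```python
from collections import deque
from typing import Deque


class DriftMonitor:
    """Flags drift when the windowed mean absolute error exceeds a threshold."""

    def __init__(self, threshold: float, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.threshold = threshold                        # free parameter 1
        self.errors: Deque[float] = deque(maxlen=window)  # free parameter 2

    def update(self, expected: float, observed: float) -> bool:
        """Record one observation; return True if drift is currently flagged."""
        self.errors.append(abs(expected - observed))
        return sum(self.errors) / len(self.errors) > self.threshold


monitor = DriftMonitor(threshold=0.5, window=2)
flags = [
    monitor.update(1.0, 1.1),   # small error, no flag
    monitor.update(1.0, 1.4),   # windowed mean still below threshold
    monitor.update(1.0, 2.0),   # windowed mean now above threshold
]
```

Changing either parameter changes when the monitor fires, which is why reported percentages such as unsafe continuation under drift depend on values that would need to be disclosed for the results to be reproducible.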

axioms (1)

domain assumption: A clean separation between agent cognition and external execution oversight can be maintained at runtime without compromising agent performance.
This separation is invoked to justify moving safety logic outside the agent loop.

invented entities (1)

Embodied Capability Modules (ECMs) · no independent evidence
purpose: To provide modular, governable units of capability between the agent and the runtime layer.
ECMs are postulated as part of the formal control boundary without prior independent evidence cited.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: We formalize the control boundary among the embodied agent, Embodied Capability Modules (ECMs), and runtime governance layer... six interacting components: (1) Capability Admission, (2) Policy Guard, (3) Execution Watcher, (4) Recovery and Rollback Manager, (5) Human Override Interface, and (6) Audit and Telemetry Layer.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: Results show 96.2% interception of unauthorized actions, reduction of unsafe continuation from 100% to 22.2% under runtime drift, and 91.4% recovery success... validated through 1000 randomized simulation trials

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

EmbodiedGovBench: A Benchmark for Governance, Recovery, and Upgrade Safety in Embodied Agent Systems
cs.RO · 2026-04 · unverdicted · novelty 6.0

EmbodiedGovBench is a new benchmark framework that measures embodied agent systems on seven governance dimensions including policy adherence, recovery success, and upgrade safety.

Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation
cs.RO · 2026-04 · unverdicted · novelty 5.0

Multi-robot coordination is achieved by federating single-agent robot runtimes at the fleet level instead of fragmenting each robot into multiple internal agents.

Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation
cs.RO · 2026-04 · unverdicted · novelty 5.0

FSAR is a fleet coordination architecture that preserves each robot as a single-agent runtime and achieves multi-robot coordination via capability sharing, delegation, and layered recovery instead of internal agent fr...

ECM Contracts: Contract-Aware, Versioned, and Governable Capability Interfaces for Embodied Agents
cs.SE · 2026-04 · unverdicted · novelty 5.0

ECM Contracts define a six-dimensional contract model for embodied capability modules that enables static checks for safe composition, installation, and versioned upgrades in robotics systems.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · cited by 3 Pith papers · 10 internal anchors

[1]

RT-1: Robotics Transformer for real-world control at scale,

A. Brohan, N. Brown, J. Carbajal, Y . Chebotar, X. Chen, K. Choromanski, T. Ding,et al., “RT-1: Robotics Transformer for real-world control at scale,” inRobotics: Science and Systems (RSS), 2023

work page 2023
[2]

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

A. Brohan, N. Brown, J. Carbajal, Y . Chebotar, J. Dabis, C. Finn, K. Hausman,et al., “RT-2: Vision- language-action models transfer web knowledge to robotic control,”arXiv preprint arXiv:2307.15818, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[3]

PaLM-E: An embodied multimodal language model,

D. Driess, F. Xia, M. S. M. Sajjadi, C. Lynch, A. Chowdhery, B. Ichter, A. Wahid, J. Tompson, Q. Vuong, T. Yu,et al., “PaLM-E: An embodied multimodal language model,” inProc. ICML, 2023

work page 2023
[4]

Toolformer: Language Models Can Teach Themselves to Use Tools

T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, and T. Scialom, “Toolformer: Language models can teach themselves to use tools,”arXiv preprint arXiv:2302.04761, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[5]

ReAct: Synergizing reasoning and acting in language models,

S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y . Cao, “ReAct: Synergizing reasoning and acting in language models,” inProc. Int. Conf. Learning Representations (ICLR), 2023

work page 2023
[6]

ChatGPT for Robotics: Design principles and model abilities,

S. Vemprala, R. Bonatti, A. Bucker, and A. Kapoor, “ChatGPT for Robotics: Design principles and model abilities,”IEEE Access, vol. 12, pp. 36857–36872, 2024

work page 2024
[7]

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

M. Ahn, A. Brohan, N. Brown, Y . Chebotar, O. Cortes, B. David, C. Finn,et al., “Do as I can, not as I say: Grounding language in robotic affordances,”arXiv preprint arXiv:2204.01691, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[8]

Code as policies: Language model programs for embodied control,

J. Liang, W. Huang, F. Xia, P. Xu, K. Hausman, B. Ichter, P. Florence, and A. Zeng, “Code as policies: Language model programs for embodied control,” inProc. IEEE ICRA, 2023

work page 2023
[9]

RoboClaw: Agentic robotic framework for scalable and long-horizon task execution with VLMs,

Y . Yang, Z. Jiang, Y . Zhang, J. Xu, and J. Zhu, “RoboClaw: Agentic robotic framework for scalable and long-horizon task execution with VLMs,”arXiv preprint arXiv:2503.07833, 2025

work page arXiv 2025
[10]

Inner Monologue: Embodied Reasoning through Planning with Language Models

W. Huang, F. Xia, T. Xiao, H. Chan, J. Liang, P. Florence, A. Zeng, J. Tompson, I. Mordatch, Y . Cheb- otar,et al., “Inner monologue: Embodied reasoning through planning with language models,”arXiv preprint arXiv:2207.05608, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[11]

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

W. Huang, C. Wang, R. Zhang, Y . Li, J. Wu, and L. Fei-Fei, “V oxPoser: Composable 3D value maps for robotic manipulation with language models,”arXiv preprint arXiv:2307.05973, 2023. 32

work page internal anchor Pith review Pith/arXiv arXiv 2023
[12]

Embodied foundation models at the edge: A survey of deployment constraints and mitigation strategies,

U. Grover, R. Ranjan, M. Mao, T. T. Dong, S. Praveen, Z. Wu, J. M. Chang, T. Mohsenin, Y . Sheng, A. Polyzou, E. Kanjo, and X. Lin, “Embodied foundation models at the edge: A survey of deployment constraints and mitigation strategies,”arXiv preprint arXiv:2603.16952, 2026

work page arXiv 2026
[13]

AEROS: A Single-Agent Operating Architecture with Embodied Capability Modules

Qin, X., Luan, S., See, J., Yang, C., Li, Z.: AEROS: Agent execution runtime operating system for embodied robots. arXiv preprint arXiv:2604.07039 (2026)

work page internal anchor Pith review Pith/arXiv arXiv 2026
[14]

Harness Engineering,

M. Fowler, “Harness Engineering,” martinfowler.com, 2025. [Online]. Available:https:// martinfowler.com/articles/exploring-gen-ai/harness-engineering.html

work page 2025
[15]

A survey on large language model based autonomous agents,

L. Wang, C. Ma, X. Feng, Z. Zhang, H. Yang, J. Zhang, Z. Chen, J. Tang, X. Chen, Y . Lin, W. X. Zhao, Z. Wei, and J. Wen, “A survey on large language model based autonomous agents,”Frontiers of Com- puter Science, vol. 18, no. 6, 2024

work page 2024
[16]

Chameleon: Plug-and-play compositional reasoning with large lan- guage models

P. Lu, B. Peng, H. Cheng, M. Galley, K.-W. Chang, Y . N. Wu, S.-C. Zhu, and J. Gao, “Chameleon: Plug-and-play compositional reasoning with large language models,”arXiv preprint arXiv:2304.09842, 2023

work page arXiv 2023
[17]

From assistant to double agent: Formalizing and benchmarking attacks on OpenClaw for personalized local AI agent,

Y . Wang, F. Xu, Z. Lin, G. He, Y . Huang, H. Gao, Z. Niu, S. Lian, and Z. Liu, “From assistant to double agent: Formalizing and benchmarking attacks on OpenClaw for personalized local AI agent,” arXiv preprint arXiv:2602.08412, 2026

work page arXiv 2026
[18]

Generative agents: Interactive simulacra of human behavior,

J. S. Park, J. C. O’Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein, “Generative agents: Interactive simulacra of human behavior,” inProc. ACM UIST, 2023

work page 2023
[19]

ProgPrompt: Generating situated robot task plans using large language models,

I. Singh, V . Blukis, A. Mousavian, A. Goyal, D. Xu, J. Tremblay, D. Fox, J. Thomason, and A. Garg, “ProgPrompt: Generating situated robot task plans using large language models,” inProc. IEEE ICRA, 2023

work page 2023
[20]

Voyager: An Open-Ended Embodied Agent with Large Language Models

G. Wang, Y . Xie, Y . Jiang, A. Mandlekar, C. Xiao, Y . Zhu, L. Fan, and A. Anandkumar, “V oyager: An open-ended embodied agent with large language models,”arXiv preprint arXiv:2305.16291, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[21]

Aligning cyber space with physical world: A comprehensive survey on embodied ai.arXiv preprint arXiv:2407.06886, 2024

Y . Liu, Q. Guo, H. Yang, L. Li, H. Wang, J. Feng, G. Yin, and D. Shen, “Aligning cyber space with physical world: A comprehensive survey on embodied AI,”arXiv preprint arXiv:2407.06886, 2024

work page arXiv 2024
[22]

CARLA: An open urban driving simulator,

A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V . Koltun, “CARLA: An open urban driving simulator,” inConf. Robot Learning (CoRL), 2017

work page 2017
[23]

Design and use paradigms for Gazebo, an open-source multi-robot simu- lator,

N. Koenig and A. Howard, “Design and use paradigms for Gazebo, an open-source multi-robot simu- lator,” inProc. IEEE/RSJ IROS, 2004

work page 2004
[24]

EARBench: Towards evaluating physical risk awareness for task planning of foundation model-based embodied AI agents,

Z. Zhu, R. Zhao, Q. Zhao, Z. Wang, and H. Zhao, “EARBench: Towards evaluating physical risk awareness for task planning of foundation model-based embodied AI agents,”arXiv preprint arXiv:2408.04449, 2024. 33

work page arXiv 2024
[25]

SafeAgentBench: A benchmark for safe task planning of embodied LLM agents,

S. Yin, Z. Pang, and others, “SafeAgentBench: A benchmark for safe task planning of embodied LLM agents,”arXiv preprint arXiv:2412.13178, 2024

work page arXiv 2024
[26]

FDIR methods for space robots: A comprehensive survey and prospects,

Q. Li, R. Gross, and S. Yin, “FDIR methods for space robots: A comprehensive survey and prospects,” Acta Astronautica, vol. 209, pp. 243–262, 2023

work page 2023
[27]

Safe reinforcement learning via shielding,

M. Alshiekh, R. Bloem, R. Ehlers, B. Könighofer, S. Niekum, and U. Topcu, “Safe reinforcement learning via shielding,” inProc. AAAI Conf. Artificial Intelligence, 2018

work page 2018
[28]

Shield synthesis for reinforcement learning,

B. Könighofer, F. Lorber, N. Jansen, and R. Bloem, “Shield synthesis for reinforcement learning,” in Proc. Int. Symp. Leveraging Applications of Formal Methods (ISoLA), 2020

work page 2020
[29]

Control barrier functions: Theory and applications,

A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” inEuropean Control Conference (ECC), 2019

work page 2019
[30]

A general safety framework for learning-based control in uncertain robotic systems,

J. F. Fisac, A. K. Akametalu, M. N. Zeilinger, S. Kaynama, J. Gillula, and C. J. Tomlin, “A general safety framework for learning-based control in uncertain robotic systems,”IEEE Transactions on Au- tomatic Control, vol. 64, no. 7, pp. 2737–2752, 2019

work page 2019
[31]

Constrained policy optimization,

J. Achiam, D. Held, A. Tamar, and P. Abbeel, “Constrained policy optimization,” inProc. Int. Conf. Machine Learning (ICML), 2017

work page 2017
[32]

Safe Exploration in Continuous Action Spaces

G. Dalal, K. Dvijotham, M. Vecerik, T. Hester, C. Paduraru, and Y . Tassa, “Safe exploration in contin- uous action spaces,”arXiv preprint arXiv:1801.08757, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[33]

Combining model checking and runtime verification for safe robotics,

A. Desai, T. Dreossi, and S. A. Seshia, “Combining model checking and runtime verification for safe robotics,” inProc. Int. Conf. Runtime Verification (RV), 2017

work page 2017
[34]

A comprehensive survey on safe reinforcement learning,

J. García and F. Fernández, “A comprehensive survey on safe reinforcement learning,”Journal of Ma- chine Learning Research, vol. 16, no. 42, pp. 1437–1480, 2015

work page 2015
[35]

Safe learning in robotics: From learning-based control to safe reinforcement learning,

L. Brunke, M. Greeff, A. W. Hall, Z. Yuan, S. Zhou, J. Panerati, and A. P. Schoellig, “Safe learning in robotics: From learning-based control to safe reinforcement learning,”Annual Review of Control, Robotics, and Autonomous Systems, vol. 5, pp. 411–444, 2022

work page 2022
[36]

A tour of reinforcement learning: The view from continuous control,

B. Recht, “A tour of reinforcement learning: The view from continuous control,”Annual Review of Control, Robotics, and Autonomous Systems, vol. 2, pp. 253–279, 2019

work page 2019
[37]

Using simplicity to control complexity,

L. Sha, “Using simplicity to control complexity,”IEEE Software, vol. 18, no. 4, pp. 20–28, 2001

work page 2001
[38]

Preventing undesirable behavior of intelligent machines,

P. S. Thomas, B. C. da Silva, A. G. Barto, S. Giguere, Y . Brun, and E. Brunskill, “Preventing undesirable behavior of intelligent machines,”Science, vol. 366, no. 6468, pp. 999–1004, 2019

work page 2019
[39]

Robots and robotic devices — Safety requirements for industrial robots,

ISO 10218-1:2011, “Robots and robotic devices — Safety requirements for industrial robots,” Interna- tional Organization for Standardization, 2011. 34

work page 2011
[40]

Robots and robotic devices — Safety requirements for personal care robots,

ISO 13482:2014, “Robots and robotic devices — Safety requirements for personal care robots,” Inter- national Organization for Standardization, 2014

work page 2014
[41]

N. G. Leveson,Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press, 2011

work page 2011
[42]

Autonomous vehicle safety: An interdisciplinary challenge,

P. Koopman and M. Wagner, “Autonomous vehicle safety: An interdisciplinary challenge,”IEEE Intel- ligent Transportation Systems Magazine, vol. 9, no. 1, pp. 90–96, 2017

work page 2017
[43]

Real-time anomaly detection and reactive planning with large language models,

R. Sinha, A. Elhafsi, C. Agia, M. Foutter, E. Schmerling, and M. Pavone, “Real-time anomaly detection and reactive planning with large language models,” inRobotics: Science and Systems (RSS), 2024

work page 2024
[44]

Autort: Embodied foundation models for large scale orchestration of robotic agents.arXiv preprint arXiv:2401.12963, 2024

M. Ahn, D. Dwibedi, C. Finn, M. G. Arenas, K. Gopalakrishnan, K. Hausman, B. Ichter, A. Irpan, N. Joshi, R. Julian,et al., “AutoRT: Embodied foundation models for large scale orchestration of robotic agents,”arXiv preprint arXiv:2401.12963, 2024

work page arXiv 2024
[45]

SafeEmbodAI: A safety framework for mobile robots in embodied AI systems,

W. Zhang, X. Kong, T. Braunl, and J. B. Hong, “SafeEmbodAI: A safety framework for mobile robots in embodied AI systems,”arXiv preprint arXiv:2409.01630, 2024

work page arXiv 2024
[46]

GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

Z. Xiang, Z. Liu, P. Bian, P. Mittal, and W. Xu, “GuardAgent: Safeguard LLM agents by a guard agent via knowledge-enabled reasoning,”arXiv preprint arXiv:2406.09187, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[47]

RoboGuard: Safety guardrails for LLM-enabled robots,

Z. Ravichandran, A. Robey, V . Kumar, G. J. Pappas, and H. Hassani, “RoboGuard: Safety guardrails for LLM-enabled robots,”arXiv preprint arXiv:2503.07885, 2025

work page arXiv 2025
[48] H. Wang, C. M. Poskitt, and J. Sun, “AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,” in Proc. IEEE/ACM Int. Conf. Software Engineering (ICSE), 2026.

[49] T. Rebedea, R. Dinu, M. Sreedhar, C. Parisien, and J. Cohen, “NeMo Guardrails: A toolkit for controllable and safe LLM applications with programmable rails,” in Proc. EMNLP System Demonstrations, 2023.

[50] A. Ravichandran, A. Robey, Z. Sun, and H. Hassani, “Safety guardrails for LLM-enabled robots,” IEEE Robotics and Automation Letters, 2026.

[51] W. Hua, X. Yang, M. Jin, Z. Li, W. Cheng, R. Tang, and Y. Zhang, “TrustAgent: Towards safe and trustworthy LLM-based agents,” in Findings of EMNLP, 2024.

[52] H. Wang, C. M. Poskitt, J. Sun, and J. Wei, “Pro2Guard: Proactive runtime enforcement of LLM agent safety via probabilistic model checking,” arXiv preprint arXiv:2508.00500, 2025.

[53] M. Leucker and C. Schallhart, “A brief account of runtime verification,” J. Logic and Algebraic Programming, vol. 78, no. 5, pp. 293–303, 2009.

[54] C. Chi, S. Feng, Y. Du, Z. Xu, E. Cousineau, B. Burchfiel, and S. Song, “Diffusion policy: Visuomotor policy learning via action diffusion,” in Robotics: Science and Systems (RSS), 2023.

[55] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain randomization for transferring deep neural networks from simulation to the real world,” in Proc. IEEE/RSJ IROS, 2017.

[56] R. Parasuraman, T. B. Sheridan, and C. D. Wickens, “A model for types and levels of human interaction with automation,” IEEE Trans. Systems, Man, and Cybernetics, Part A, vol. 30, no. 3, pp. 286–297, 2000.

[57] M. A. Goodrich and A. C. Schultz, “Human-robot interaction: A survey,” Foundations and Trends in Human-Computer Interaction, vol. 1, no. 3, pp. 203–275, 2007.

[58] D. S. Weld and G. Bansal, “The challenge of crafting intelligible explanations,” Communications of the ACM, vol. 62, no. 7, pp. 70–79, 2019.

[59] European Parliament and Council of the European Union, “Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI Act),” Official Journal of the European Union, L series, 2024.
[60] X. Qin, S. Luan, J. See, C. Yang, and Z. Li, “Learning without losing identity: Capability evolution for embodied agents,” arXiv preprint arXiv:2604.07799, 2026.
