EARBench: Towards evaluating physical risk awareness for task planning of foundation model-based embodied AI agents
5 Pith papers cite this work.
citing papers explorer
- RoboJailBench: Benchmarking Adversarial Attacks and Defenses in Embodied Robotic Agents
  RoboJailBench creates a taxonomy-based benchmark, intent-contrast datasets, and evaluation framework for jailbreak attacks and defenses in embodied robotic AI systems.
- SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models
  SafetyALFRED shows multimodal LLMs recognize kitchen hazards accurately in QA tests but achieve low success rates when required to mitigate those hazards through embodied planning.
- Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control
  LLMs for robotic health attendant control violate safety rules in 54.4% of harmful scenarios on average, with proprietary models at 23.7% median violation versus 72.8% for open-weight models, indicating they are not yet safe for clinical use.
- Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution
  A runtime governance framework for embodied agents intercepts 96.2% of unauthorized actions and achieves 91.4% recovery success in 1000 simulation trials while outperforming baselines.
- Towards provable probabilistic safety for scalable embodied AI systems
  The paper proposes a paradigm of provable probabilistic safety to enable scalable, safe deployment of embodied AI in critical applications.