arXiv preprint arXiv:2506.14697 , year=

· 2025 · arXiv 2506.14697

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

read on arXiv browse 8 citing papers

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents

cs.CY · 2026-04-11 · accept · novelty 8.0

This paper delivers the first systematic taxonomy and cross-benchmark consistency analysis of 40 agent safety benchmarks, finding broad but shallow risk coverage, no ranking concordance across evaluations, and that benchmark choice systematically alters reported safety.

Using large language models for embodied planning introduces systematic safety risks

cs.AI · 2026-04-20 · unverdicted · novelty 7.0

LLM planners for robots often produce dangerous plans even when planning succeeds, with safety awareness staying flat as model scale improves planning ability.

HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

cs.RO · 2026-04-14 · unverdicted · novelty 7.0

HazardArena shows VLA models trained on safe data frequently produce unsafe actions in semantically risky but visually similar settings, and a training-free Safety Option Layer reduces those failures with little performance cost.

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

cs.CR · 2025-10-11 · unverdicted · novelty 7.0

SecureWebArena is a new benchmark suite for holistic security evaluation of LVLM-based web agents using diverse simulated environments, attack taxonomies, and multi-layered failure analysis across reasoning, behavior, and outcomes.

GuardAD: Safeguarding Autonomous Driving MLLMs via Markovian Safety Logic

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

GuardAD reduces accident rates by 32% in autonomous driving MLLMs by using n-th order Markovian logic to infer latent hazards and revise actions.

A Patch-based Cross-view Regularized Framework for Backdoor Defense in Multimodal Large Language Models

cs.CV · 2026-04-06 · unverdicted · novelty 5.0

A patch-augmented cross-view regularization method reduces backdoor attack success rates in multimodal LLMs by enforcing output differences between original and perturbed views while using entropy constraints to preserve benign generation quality.

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

cs.RO · 2026-04-26 · accept · novelty 4.0

A literature survey that unifies fragmented work on attacks, defenses, evaluations, and deployment challenges for Vision-Language-Action models in robotics.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

cs.CR · 2026-03-28

citing papers explorer

Showing 8 of 8 citing papers.

Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents cs.CY · 2026-04-11 · accept · none · ref 30
This paper delivers the first systematic taxonomy and cross-benchmark consistency analysis of 40 agent safety benchmarks, finding broad but shallow risk coverage, no ranking concordance across evaluations, and that benchmark choice systematically alters reported safety.
Using large language models for embodied planning introduces systematic safety risks cs.AI · 2026-04-20 · unverdicted · none · ref 49
LLM planners for robots often produce dangerous plans even when planning succeeds, with safety awareness staying flat as model scale improves planning ability.
HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models cs.RO · 2026-04-14 · unverdicted · none · ref 12
HazardArena shows VLA models trained on safe data frequently produce unsafe actions in semantically risky but visually similar settings, and a training-free Safety Option Layer reduces those failures with little performance cost.
SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents cs.CR · 2025-10-11 · unverdicted · none · ref 28
SecureWebArena is a new benchmark suite for holistic security evaluation of LVLM-based web agents using diverse simulated environments, attack taxonomies, and multi-layered failure analysis across reasoning, behavior, and outcomes.
GuardAD: Safeguarding Autonomous Driving MLLMs via Markovian Safety Logic cs.AI · 2026-05-11 · unverdicted · none · ref 48
GuardAD reduces accident rates by 32% in autonomous driving MLLMs by using n-th order Markovian logic to infer latent hazards and revise actions.
A Patch-based Cross-view Regularized Framework for Backdoor Defense in Multimodal Large Language Models cs.CV · 2026-04-06 · unverdicted · none · ref 38
A patch-augmented cross-view regularization method reduces backdoor attack success rates in multimodal LLMs by enforcing output differences between original and perturbed views while using entropy constraints to preserve benign generation quality.
Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms cs.RO · 2026-04-26 · accept · none · ref 85
A literature survey that unifies fragmented work on attacks, defenses, evaluations, and deployment challenges for Vision-Language-Action models in robotics.
Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses cs.CR · 2026-03-28 · unreviewed · ref 220

arXiv preprint arXiv:2506.14697 , year=

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer