Deciphering the chaos: Enhancing jailbreak attacks via adversarial prompt translation
2 Pith papers cite this work.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models
SABER uses a trained ReAct agent to produce bounded adversarial edits to robot instructions, cutting task success by 20.6% and increasing execution length and violations on the LIBERO benchmark across six VLA models.
- Analyzing Defensive Misdirection Against Model-Guided Automated Attacks on Agentic AI Systems
Detect-and-misdirect defenses bound asymptotic attacker success rates in model-guided jailbreaks on agentic AI, unlike detect-and-block defenses, which permit near-certain success given sufficient queries.