Deciphering the chaos: Enhancing jailbreak attacks via adversarial prompt translation

Qizhang Li, Xiaochen Yang, Wangmeng Zuo, Yiwen Guo · 2024 · arXiv 2410.11317

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models

cs.RO · 2026-03-26 · unverdicted · novelty 6.0

SABER uses a trained ReAct agent to produce bounded adversarial edits to robot instructions, cutting task success by 20.6% and increasing execution length and violations on the LIBERO benchmark across six VLA models.

Analyzing Defensive Misdirection Against Model-Guided Automated Attacks on Agentic AI Systems

cs.CR · 2026-06-18 · unverdicted · novelty 5.0

Detect-and-misdirect defenses bound asymptotic attacker success rates in model-guided jailbreaks on agentic AI, unlike detect-and-block defenses, which permit near-certain success given sufficient queries.

citing papers explorer


SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models
cs.RO · 2026-03-26 · unverdicted · none · ref 19

Analyzing Defensive Misdirection Against Model-Guided Automated Attacks on Agentic AI Systems
cs.CR · 2026-06-18 · unverdicted · none · ref 9
