SDM is a new staged gradient attack that reconstructs the adversarial objective around probability differences and reports stronger performance than prior methods like APGD.
Generation and Countermeasures of adversarial examples on vision: a survey
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
AVISE provides a new framework and automated SET that identifies jailbreak vulnerabilities in language models with 92% accuracy, finding all nine tested models vulnerable to an augmented Red Queen attack.
citing papers explorer
-
SDM: A Powerful Tool for Evaluating Model Robustness
SDM is a new staged gradient attack that reconstructs the adversarial objective around probability differences and reports stronger performance than prior methods like APGD.
-
AVISE: Framework for Evaluating the Security of AI Systems
AVISE provides a new framework and automated SET that identifies jailbreak vulnerabilities in language models with 92% accuracy, finding all nine tested models vulnerable to an augmented Red Queen attack.