PCAP conditions adversarial searches on attacker personas to raise attack success rates from ~58% to ~97% on large models while increasing prompt diversity.
” This prompt is effective because it dilutes the harmful intent by framing the information in a creative and indirect manner
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Persona-Conditioned Adversarial Prompting (PCAP): Multi-Identity Red-Teaming for Enhanced Adversarial Prompt Discovery
PCAP conditions adversarial searches on attacker personas to raise attack success rates from ~58% to ~97% on large models while increasing prompt diversity.