BEAP is a black-box embedding-aware prompting attack using LLM-guided search that raises attack success rate over 60% against unlearned diffusion models while keeping prompts undetectable.
Mitigating inappropriateness in image generation: Can there be value in reflecting the world’s ugli- ness?arXiv preprint arXiv:2305.18398, 2023
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Erased but Exploitable: Black-box Embedding-Aware Prompting Against Unlearned Text-to-Image Diffusion Models
BEAP is a black-box embedding-aware prompting attack using LLM-guided search that raises attack success rate over 60% against unlearned diffusion models while keeping prompts undetectable.