BEAP is a black-box embedding-aware prompting attack using LLM-guided search that raises attack success rate over 60% against unlearned diffusion models while keeping prompts undetectable.
Erasediff: Erasing data influence in diffusion models
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.
TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.
Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.
citing papers explorer
-
Erased but Exploitable: Black-box Embedding-Aware Prompting Against Unlearned Text-to-Image Diffusion Models
BEAP is a black-box embedding-aware prompting attack using LLM-guided search that raises attack success rate over 60% against unlearned diffusion models while keeping prompts undetectable.
-
Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM
Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.
-
Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration
TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.
-
Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models
Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.