Erasing Thousands of Concepts: Towards Scalable and Practical Concept Erasure for Text-to-Image Diffusion Models

Byung Hyun Lee; Hoigi Seo; Jaehyun Cho; Se Young Chun; Sungjin Lim

arXiv: 2604.16481 · v1 · submitted 2026-04-12 · cs.CV · cs.AI


Pith reviewed 2026-05-10 15:35 UTC · model grok-4.3


keywords: concept erasure · text-to-image diffusion · scalable safety · mixture model · optimal transport · mixture of experts · robustness · embedding manipulation


The pith

Text-to-image diffusion models can have thousands of unwanted concepts erased while keeping generation quality and resisting attacks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a framework, Erasing Thousands of Concepts (ETC), that models low-rank concept distributions in text embeddings as a mixture of Student's t-distributions (tMM). It then applies affine optimal transport to move the target concepts away while anchoring the boundaries of the target distributions, so the remaining concepts are preserved without pre-defined anchor concepts. A Mixture-of-Experts module, MoEraser, is trained to perform the removal on the embeddings; it is hardened against module-removal attacks by injecting noise into the text embedding projector and fine-tuning the module to recover from it. If the approach works at the claimed scale, safety teams could clean large public models of many prohibited or copyrighted concepts at once instead of handling them one by one or a few hundred at a time.
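For reference, a mixture of multivariate Student's t-distributions is usually written as below. This is the textbook form, with symbols chosen here for illustration (weights $\pi_k$, locations $\mu_k$, scale matrices $\Sigma_k$, degrees of freedom $\nu_k$, embedding dimension $d$); the paper's own parametrization, including how the low-rank structure is imposed, is not reproduced on this page.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathrm{St}(x \mid \mu_k, \Sigma_k, \nu_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

\mathrm{St}(x \mid \mu, \Sigma, \nu)
  = \frac{\Gamma\!\left(\frac{\nu + d}{2}\right)}
         {\Gamma\!\left(\frac{\nu}{2}\right) (\nu\pi)^{d/2} \, |\Sigma|^{1/2}}
    \left[ 1 + \frac{1}{\nu} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right]^{-\frac{\nu + d}{2}}
```

Small $\nu$ gives heavier tails than a Gaussian, and each component approaches a Gaussian as $\nu \to \infty$.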

Core claim

Low-rank concept distributions in text embeddings are captured by a Student's t-distribution Mixture Model (tMM) that supports pin-point erasure of target concepts through affine optimal transport; non-target concepts are preserved by anchoring the boundaries of the target distributions, without pre-defined anchor concepts. A Mixture-of-Experts module called MoEraser is then trained to remove the target embeddings while preserving the anchor embeddings, with noise injected into the text embedding projector during fine-tuning to confer robustness against white-box attacks such as module removal. Experiments across more than two thousand concepts and multiple diffusion models are reported to show that the combined procedure maintains generation quality.
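The Figure 5 caption describes MoEraser as a MoE with GLU experts. A minimal sketch of what such a module's forward pass could look like is given below, assuming top-1 routing and a residual edit of the embedding. All names (`GLUExpert`, `ToyMoE`), the sizes, and the random weights are hypothetical; the actual MoEraser architecture and training are not specified on this page.

```python
import math
import random

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    top = max(zs)
    es = [math.exp(z - top) for z in zs]
    total = sum(es)
    return [e / total for e in es]

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

class GLUExpert:
    """Gated linear unit expert: y = W_out @ ((W_val @ x) * sigmoid(W_gate @ x))."""
    def __init__(self, rng, dim, hidden):
        self.w_val = rand_matrix(rng, hidden, dim)
        self.w_gate = rand_matrix(rng, hidden, dim)
        self.w_out = rand_matrix(rng, dim, hidden)

    def __call__(self, x):
        val = matvec(self.w_val, x)
        gate = [sigmoid(g) for g in matvec(self.w_gate, x)]
        return matvec(self.w_out, [v * g for v, g in zip(val, gate)])

class ToyMoE:
    """Top-1 routed mixture of GLU experts, applied as a residual edit (assumption)."""
    def __init__(self, dim=4, hidden=8, n_experts=3, seed=0):
        rng = random.Random(seed)
        self.router = rand_matrix(rng, n_experts, dim)
        self.experts = [GLUExpert(rng, dim, hidden) for _ in range(n_experts)]

    def route(self, x):
        probs = softmax(matvec(self.router, x))
        k = max(range(len(probs)), key=probs.__getitem__)
        return k, probs[k]

    def __call__(self, x):
        k, p = self.route(x)
        delta = self.experts[k](x)
        # Residual form: an untrained or zero-output expert leaves x unchanged.
        return [xi + p * di for xi, di in zip(x, delta)]

moe = ToyMoE()
embedding = [0.3, -1.2, 0.8, 0.1]   # stand-in for one text-embedding token
edited = moe(embedding)
```

In a trained module, the residual would be pushed toward zero for embeddings to be preserved and toward the transport displacement for target embeddings; here the weights are random and only the data flow is illustrated.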

What carries the argument

Student's t-distribution Mixture Model for low-rank concept distributions, combined with affine optimal transport for targeted shifts and a noise-hardened Mixture-of-Experts eraser module that selectively removes target embeddings while anchoring the rest.
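To make "affine optimal transport" concrete, here is a minimal sketch for the simplest case: two Gaussians with diagonal covariances, where the optimal map under squared Euclidean cost reduces to a per-coordinate scale and shift. The Gaussian, diagonal setting and the numbers are assumptions for illustration; the paper applies the idea to tMM components, and its exact map is not given on this page.

```python
import math

def affine_ot_diag(mean_src, var_src, mean_dst, var_dst):
    """Return (scale, shift) of the per-coordinate map T(x) = scale * x + shift.

    For Gaussians with diagonal covariances, the optimal transport map is
    T(x) = m_dst + sqrt(var_dst / var_src) * (x - m_src) in each coordinate.
    """
    scale = [math.sqrt(vd / vs) for vs, vd in zip(var_src, var_dst)]
    shift = [md - s * ms for s, ms, md in zip(scale, mean_src, mean_dst)]
    return scale, shift

def apply_map(x, scale, shift):
    return [s * xi + b for s, xi, b in zip(scale, x, shift)]

# Hypothetical 3-d "target" component and the component it is moved onto.
m_tar, v_tar = [1.0, -2.0, 0.5], [4.0, 1.0, 0.25]
m_map, v_map = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]

scale, shift = affine_ot_diag(m_tar, v_tar, m_map, v_map)

# The source mean lands on the destination mean ...
mapped_mean = apply_map(m_tar, scale, shift)
# ... and variances transform as scale**2 * var_src.
mapped_var = [s * s * v for s, v in zip(scale, v_tar)]
```

Because the map is affine, it is fully determined by the component parameters; no per-sample optimization is needed once the mixture has been fitted.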

If this is right

Thousands of concepts can be removed from a single model in one training pass instead of sequential small-scale edits.
The same model continues to produce high-quality images on unrelated prompts after the large-scale edits.
Stripping out the added module does not bring the erased concepts back: the text embedding projector is deliberately corrupted with noise, so without the module to restore the embeddings the model fails to produce high-fidelity images at all.
No separate list of safe anchor concepts is required to protect wanted outputs during the process.
The procedure transfers across different diffusion architectures and across visual domains without per-model redesign.
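The module-removal argument can be illustrated with a toy linear example: corrupt a projector, fit a recovery map, and compare outputs with and without that map. This is only a sketch under strong assumptions (a 2x2 linear projector, a fixed perturbation in place of injected noise, and an exact linear inverse in place of a fine-tuned MoEraser); the names `W_proj` and `W_cor` follow the Figure 4 caption, everything else is hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

# Hypothetical projector and a fixed perturbation standing in for injected noise.
W_proj = [[1.0, 0.2], [0.0, 1.0]]
noise = [[0.5, -0.3], [0.4, 0.6]]
W_cor = [[w + n for w, n in zip(rw, rn)] for rw, rn in zip(W_proj, noise)]

# Toy "recovery module": the linear map R with R @ W_cor == W_proj.
R = matmul(W_proj, inv2(W_cor))

x = [1.0, 2.0]                      # pre-projection feature
clean = matvec(W_proj, x)           # what the original model would produce
corrupted = matvec(W_cor, x)        # module removed: embedding stays corrupted
restored = matvec(R, corrupted)     # module present: original embedding recovered
```

The point of the design is that the shipped weights are `W_cor`, not `W_proj`, so an attacker who deletes the module is left with the corrupted path rather than the original model.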

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar embedding-space modeling could be applied to video or 3D generators if their text conditioning follows comparable low-rank structure.
Layering this erasure step with prompt filters or output classifiers would create defense-in-depth against both accidental and adversarial misuse.
The upper limit on simultaneous erasures may be set by how many distinct t-distribution components the embedding space can support before overlap becomes unavoidable.
Once the mixture parameters are learned, the method might allow selective re-introduction of erased concepts by reversing the transport map without full retraining.

Load-bearing premise

The Student's t-distribution Mixture Model must accurately capture the low-rank structure of concept distributions in the text embeddings so that targets can be moved without distorting the surrounding concepts or the overall image-generation capability.
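Why the choice of a t-distribution matters can be seen in one dimension: a Student's t assigns far more density to points far from the center than a Gaussian with the same location and scale. The sketch below compares the two log-densities; the choice of $\nu = 3$ and the evaluation points are arbitrary illustrations, not values from the paper.

```python
import math

def gauss_logpdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a univariate Student's t with location mu and scale sigma."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))

# Columns: x, Gaussian log-density, Student's t (nu = 3) log-density.
for x in (0.0, 2.0, 6.0):
    print(x, round(gauss_logpdf(x), 3), round(student_t_logpdf(x, nu=3.0), 3))
```

A Gaussian component would treat an embedding six scale units from its center as essentially impossible, which can distort the fitted component; the t component penalizes it much less.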

What would settle it

Run the method on a model, then measure the fraction of prompts that still produce the erased concept and compare FID or CLIP scores on standard image benchmarks before and after; if either the concept reappears at high rates or quality metrics drop substantially, the central claim fails.
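The test described above can be phrased as a small pass/fail check. The sketch below assumes a hypothetical concept detector has already produced one boolean per target prompt and that FID has been computed before and after erasure; the thresholds and the example numbers are placeholders, not values or results from the paper.

```python
def erasure_verdict(detected_after, fid_before, fid_after,
                    max_residual_rate=0.05, max_fid_increase=0.10):
    """Summarize an erasure run.

    detected_after: list of bools, one per target prompt; True means the
        (hypothetical) detector still finds the erased concept.
    fid_before, fid_after: FID on a standard benchmark (lower is better).
    Thresholds are illustrative assumptions.
    """
    residual_rate = sum(detected_after) / len(detected_after)
    fid_increase = (fid_after - fid_before) / fid_before
    passed = residual_rate <= max_residual_rate and fid_increase <= max_fid_increase
    return {"residual_rate": residual_rate,
            "fid_increase": fid_increase,
            "passed": passed}

# Made-up inputs, for illustration only.
ok = erasure_verdict([False] * 98 + [True] * 2, fid_before=16.0, fid_after=16.8)
bad = erasure_verdict([True] * 50 + [False] * 50, fid_before=16.0, fid_after=16.8)
```

The same structure works with CLIP score in place of FID, with the inequality reversed because higher CLIP score is better.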

Figures

Figures reproduced from arXiv: 2604.16481 by Byung Hyun Lee, Hoigi Seo, Jaehyun Cho, Se Young Chun, Sungjin Lim.

**Figure 2.** … the concept embedding distribution empirically exhibits heavy-tailed behavior. Intuitively, since the embeddings are inherently “in-distribution”, genuine out-of-distribution samples that reflect true variability are scarce. The tMM naturally models heavier tails, enabling better modeling of variability within the concept.

**Figure 4.** Qualitative rationale on NIR. We generated images with the prompt “a photo of Morgan Freeman” using the original text-embedding projection W_proj (left) and a corrupted weight W_cor (right) on SDv1.4 and SDv3.5-L. When sufficient noise is injected, the models fail to produce high-fidelity images; without the MoEraser module to restore the generation, the model becomes unusable, enhancing robustness to whi…

**Figure 5.** MoEraser architecture and training. (a) A MoE with GLU experts scales to heterogeneous domain concepts; training maps f_tar to f_map while leaving f_anc unchanged. (b) To make the module non-removable, we inject structured noise into the text embedding projector and fine-tune the safety module to reconstruct the original embedding, improving robustness to white-box attacks such as module removal.

**Figure 6.** Qualitative results on erasing 2,072 concepts from SDv1.4. Among baseline methods (MACE, UCE, CPE, SPEED, and SAFREE), most remove the target concept but often degrade image fidelity, and SAFREE struggles to erase concepts at the large scale. For the preservation of remaining concepts, baseline methods typically alter the original composition or distort remaining concepts. ETC achieves precise removal of t…

**Figure 7.** Qualitative results on erasing 515 concepts from SDv3.5-L. SAFREE reproduces the original image and fails to remove the target concept. SPEED removes the target concept but degrades fidelity, and this degradation also affects the remaining concepts. In contrast, ETC achieves accurate concept erasure while preserving remaining concepts on the SDv3.5-L, demonstrating applicability.

**Figure 8.** Load heatmap of experts. We visualize the frequency ratio of selection of each expert for three domains where each column represents an expert, and each row corresponds to a domain. The relatively uniform load distribution across experts suggests that the router network effectively balances expert utilization.
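The statistic behind a load heatmap like the one in Figure 8 is a per-domain selection ratio for each expert. A minimal sketch follows; the domain names, the number of experts, and the routing decisions are made up for illustration and are not taken from the paper.

```python
from collections import Counter

def load_heatmap(assignments, n_experts):
    """assignments: dict mapping domain name -> list of selected expert indices.

    Returns a dict mapping each domain to a list of selection ratios,
    one per expert (one heatmap row per domain, one column per expert).
    """
    heatmap = {}
    for domain, picks in assignments.items():
        counts = Counter(picks)
        total = len(picks)
        heatmap[domain] = [counts.get(e, 0) / total for e in range(n_experts)]
    return heatmap

# Hypothetical routing decisions for three placeholder domains.
toy = {
    "domain_a": [0, 1, 2, 3, 0, 1, 2, 3],
    "domain_b": [0, 0, 1, 2, 3, 3, 1, 2],
    "domain_c": [3, 2, 1, 0, 0, 1, 2, 3],
}
rows = load_heatmap(toy, n_experts=4)
```

A row close to uniform (here 0.25 per expert) indicates balanced utilization for that domain; a row concentrated on one expert indicates routing collapse.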

Original abstract

Large-scale text-to-image (T2I) diffusion models deliver remarkable visual fidelity but pose safety risks due to their capacity to reproduce undesirable content, such as copyrighted ones. Concept erasure has emerged as a mitigation strategy, yet existing approaches struggle to balance scalability, precision, and robustness, which restricts their applicability to erasing only a few hundred concepts. To address these limitations, we present Erasing Thousands of Concepts (ETC), a scalable framework capable of erasing thousands of concepts while preserving generation quality. Our method first models low-rank concept distributions via a Student's t-distribution Mixture Model (tMM). It enables pin-point erasure of target concepts via affine optimal transport while preserving others by anchoring the boundaries of target concept distributions without pre-defined anchor concepts. We then train a Mixture-of-Experts (MoE)-based module, termed MoEraser, which removes target embeddings while preserving the anchor embeddings. By injecting noise into the text embedding projector and fine-tuning MoEraser for recovery, our framework achieves robustness to white-box attack such as module removal. Extensive experiments on over 2,000 concepts across heterogeneous domains and diffusion models demerate state-of-the-art scalability and precision in large-scale concept erasure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

ETC scales concept erasure to thousands using tMM and MoEraser, but the tMM modeling lacks sufficient validation for the claimed scale.


The main thing to know is that this paper proposes ETC, a framework for erasing thousands of concepts from text-to-image diffusion models. It models concept distributions with a Student's t-mixture model, uses affine optimal transport for targeted erasure without pre-defined anchors, and trains a MoE module called MoEraser with noise injection for robustness.

The paper does well in extending the scale of concept erasure from the typical hundreds to over 2,000 concepts, tested across different domains and models. The anchor-free approach via optimal transport and the robustness mechanism through noise injection and fine-tuning are practical additions that address limitations in earlier work. The overall architecture shows a clear attempt to balance erasure precision with generation quality.

The soft spots are in the foundational modeling and the evidence. The assumption that the t-distribution mixture accurately captures the low-rank structure of text embeddings for so many heterogeneous concepts is not backed by any reported fits, ablations on the distribution family, or analysis of overlaps. This matches the stress-test concern; without that evidence, the subsequent steps may not achieve the claimed precision and preservation. The experiments are described as extensive and state-of-the-art, but the lack of specific quantitative results, error bars, or detailed ablations in the provided information makes it hard to confirm that the claims hold up. If the full paper has strong data here, that would change things, but as it stands the central argument rests on unverified assumptions.

This paper would interest researchers in AI safety and generative model editing who need methods that work at larger scales. A reader focused on practical interventions might find the MoEraser design worth exploring. I would recommend it for peer review, since it tackles an important scalability issue with a novel combination of techniques, even though it will probably require more rigorous validation to be fully convincing.

Referee Report

3 major / 2 minor

Summary. The paper introduces Erasing Thousands of Concepts (ETC), a framework for scalable concept erasure in text-to-image diffusion models. It first fits low-rank concept distributions in CLIP text embeddings using a Student's t-distribution Mixture Model (tMM), then applies affine optimal transport to erase target concepts while anchoring distribution boundaries to preserve non-targets without requiring pre-defined anchors. A Mixture-of-Experts module (MoEraser) is trained to remove target embeddings while retaining anchors, with noise injection into the text embedding projector during fine-tuning to confer robustness against white-box attacks such as module removal. The authors claim state-of-the-art scalability and precision based on experiments involving over 2,000 concepts across heterogeneous domains and multiple diffusion models.

Significance. If the central claims hold, the work would be significant for enabling practical, large-scale safety interventions in deployed T2I systems by addressing the scalability bottleneck of prior erasure methods (limited to hundreds of concepts). The tMM-plus-affine-transport construction for anchor-free boundary preservation and the MoE-based recovery with attack robustness represent technically interesting modeling choices that could generalize beyond the reported setting. The scale of the claimed evaluation (2000+ concepts) would also provide a useful benchmark for the community if accompanied by reproducible metrics.

major comments (3)

Abstract and §3 (tMM modeling): The central claim that the Student's t-distribution Mixture Model accurately captures low-rank structure in text embeddings to enable precise anchoring and erasure at 2000+ scale lacks any supporting quantitative evidence such as per-component likelihoods, Kolmogorov-Smirnov statistics, ablation on distribution family (t vs. Gaussian), or embedding dimensionality analysis. Without these, it is impossible to verify that the subsequent affine optimal transport step actually achieves selective removal while preserving anchors.

Abstract and §4 (MoEraser and experiments): The abstract asserts SOTA scalability, precision, and robustness on >2000 concepts yet supplies no numerical results, error bars, ablation tables, or attack success rates. This renders the claims of preserved generation quality and white-box robustness unverifiable and load-bearing for the paper's contribution.

§3.2 (affine optimal transport): The assertion that affine optimal transport can erase targets while anchoring boundaries without pre-defined anchors is presented as a key innovation, but no derivation, closed-form solution, or proof of boundary preservation is referenced; if the transport map is learned rather than parameter-free, the 'anchor-free' claim requires explicit justification against baselines that do use anchors.
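One of the checks requested above, a Kolmogorov-Smirnov statistic, is simple enough to sketch. The version below is a one-sample KS statistic for 1-D data against any reference CDF, demonstrated with a standard normal reference from the Python standard library. Applying it to embeddings would require choosing 1-D projections and supplying the CDF of the fitted component (a Student's t in the paper's case); both are left as assumptions here, and the sample data are synthetic.

```python
from statistics import NormalDist

def ks_statistic(samples, cdf):
    """One-sample Kolmogorov-Smirnov statistic D_n = sup_x |F_n(x) - F(x)|."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

std_normal = NormalDist(0.0, 1.0)
n = 20

# Synthetic sample that matches the reference as closely as n points can.
well_fit = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
# Cubing stretches the tails and pinches the centre, so the fit gets worse.
distorted = [x ** 3 for x in well_fit]

d_good = ks_statistic(well_fit, std_normal.cdf)
d_bad = ks_statistic(distorted, std_normal.cdf)
```

Reporting this statistic per component, for both a Gaussian and a t fit, would give the distribution-family ablation a quantitative basis.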

minor comments (2)

Abstract: 'demerate' is a typographical error and should be 'demonstrate'.

§4 (notation): The distinction between 'target embeddings' and 'anchor embeddings' in the MoEraser description is introduced without a formal definition or diagram; a small schematic would improve clarity.

Simulated Authors' Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive and detailed feedback. We have addressed each major comment below and will incorporate revisions to strengthen the manuscript's clarity, evidence, and rigor.


Referee: Abstract and §3 (tMM modeling): The central claim that the Student's t-distribution Mixture Model accurately captures low-rank structure in text embeddings to enable precise anchoring and erasure at 2000+ scale lacks any supporting quantitative evidence such as per-component likelihoods, Kolmogorov-Smirnov statistics, ablation on distribution family (t vs. Gaussian), or embedding dimensionality analysis. Without these, it is impossible to verify that the subsequent affine optimal transport step actually achieves selective removal while preserving anchors.

Authors: We agree that direct quantitative validation of the tMM would improve verifiability. In the revised manuscript we will add per-component log-likelihoods for the fitted mixtures, Kolmogorov-Smirnov goodness-of-fit statistics on the CLIP embeddings, an explicit ablation replacing the t-distribution with a Gaussian mixture model (reporting effects on erasure precision and anchor preservation), and an analysis of effective embedding dimensionality and rank. These additions will directly support the modeling choice before the affine transport step.

revision: yes

Referee: Abstract and §4 (MoEraser and experiments): The abstract asserts SOTA scalability, precision, and robustness on >2000 concepts yet supplies no numerical results, error bars, ablation tables, or attack success rates. This renders the claims of preserved generation quality and white-box robustness unverifiable and load-bearing for the paper's contribution.

Authors: We acknowledge that the abstract and experimental reporting should be more explicit. We will revise the abstract to state the key quantitative outcomes (erasure success rates, FID scores for generation quality, and white-box attack success rates) and expand §4 with complete tables that include error bars (standard deviation across runs), full ablation studies on MoEraser components, and statistical comparisons. The existing experiments already cover >2000 concepts; the revision will make all supporting numbers and variability measures prominent and reproducible.

revision: yes

Referee: §3.2 (affine optimal transport): The assertion that affine optimal transport can erase targets while anchoring boundaries without pre-defined anchors is presented as a key innovation, but no derivation, closed-form solution, or proof of boundary preservation is referenced; if the transport map is learned rather than parameter-free, the 'anchor-free' claim requires explicit justification against baselines that do use anchors.

Authors: We thank the referee for this observation. The transport map is obtained in closed form from the parameters of the fitted tMM; we will add a self-contained derivation in the appendix that shows how the affine map is computed to shift only the target component while fixing the boundary points defined by the mixture. We will also include a direct comparison against anchor-based baselines to justify the anchor-free formulation. A fully general proof of boundary preservation under arbitrary distribution shifts lies beyond the scope of the current work.

revision: partial
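The derivation promised in this response is not reproduced on this page. For orientation only, the well-known closed form for the optimal transport map between two Gaussians $\mathcal{N}(m_1, \Sigma_1)$ and $\mathcal{N}(m_2, \Sigma_2)$ under squared Euclidean cost is affine, as written below. Whether the authors' map for tMM components takes exactly this form, and how the boundary anchoring enters, is an assumption that the page does not settle; the symbols are chosen here for illustration.

```latex
T(x) = m_2 + A\,(x - m_1),
\qquad
A = \Sigma_1^{-1/2} \left( \Sigma_1^{1/2}\, \Sigma_2\, \Sigma_1^{1/2} \right)^{1/2} \Sigma_1^{-1/2},
\qquad
A\, \Sigma_1 A^{\top} = \Sigma_2 .
```

The last identity is the check that the map sends the source covariance to the target covariance; $A$ is symmetric positive definite whenever $\Sigma_1$ and $\Sigma_2$ are.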

standing simulated objections (not resolved)

A complete, general proof of boundary preservation for the affine optimal transport under all possible distribution shifts.

Circularity Check

0 steps flagged

No circularity: ETC constructs new modeling, transport, and training steps from data without self-referential reduction.

full rationale

The derivation begins with fitting a tMM to low-rank text embeddings (a data-driven modeling step), applies affine optimal transport to define erasure targets while anchoring distribution boundaries (a geometric operation on the fitted model), and trains MoEraser via noise injection and recovery fine-tuning (an optimization procedure). None of these steps reduce by definition to their inputs or to a fitted parameter renamed as a prediction; the framework adds independent components rather than deriving results tautologically. No load-bearing self-citations or uniqueness theorems from prior author work appear in the abstract or description. The chain is self-contained as a constructive pipeline.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The approach rests on domain assumptions about concept distributions in embedding space and introduces new modeling entities without independent evidence or prior validation provided in the abstract.

axioms (2)

domain assumption: Low-rank concept distributions in text embeddings of diffusion models can be accurately modeled by a Student's t-distribution Mixture Model. Invoked as the first step for pinpoint erasure via optimal transport.

ad hoc to paper: Affine optimal transport can erase target concepts while anchoring boundaries to preserve non-target concepts without predefined anchors. Central mechanism claimed to enable scalability and precision.

invented entities (2)

MoEraser (no independent evidence). Purpose: Mixture-of-Experts module that removes target embeddings while preserving anchor embeddings. New component trained to perform selective erasure with robustness via noise injection.

tMM (no independent evidence). Purpose: Student's t-distribution Mixture Model for modeling low-rank concept distributions. New modeling choice to enable the erasure technique.


Reference graph

Works this paper leans on

73 extracted references · 73 canonical work pages · 4 internal anchors

[1]

Persistent anti-muslim bias in large language models.AAAI/ACM on AI, Ethics, and Society, 2021

Abubakar Abid, Maheen Farooqi, and James Zou. Persistent anti-muslim bias in large language models.AAAI/ACM on AI, Ethics, and Society, 2021. 2

work page 2021
[2]

Erasing more than intended? how concept erasure degrades the generation of non-target concepts.Proceedings of the IEEE/CVF International Conference on Computer Vision,

Ibtihel Amara, Ahmed Imtiaz Humayun, Ivana Kajic, Zarana Parekh, Natalie Harris, Sarah Young, Chirag Nagpal, Na- joung Kim, Junfeng He, Cristina Nader Vasconcelos, et al. Erasing more than intended? how concept erasure degrades the generation of non-target concepts.Proceedings of the IEEE/CVF International Conference on Computer Vision,

work page
[3]

Model-based classification via mixtures of mul- tivariate t-distributions.Computational Statistics & Data Analysis, 2011

Jeffrey L Andrews, Paul D McNicholas, and Sanjeena Subedi. Model-based classification via mixtures of mul- tivariate t-distributions.Computational Statistics & Data Analysis, 2011. 1, 2

work page 2011
[4]

Multimodal word distributions.ACL, 2017

Ben Athiwaratkun and Andrew Wilson. Multimodal word distributions.ACL, 2017. 2

work page 2017
[5]

Probabilistic fasttext for multi-sense word embed- dings.ACL, 2018

Ben Athiwaratkun, Andrew Wilson, and Animashree Anand- kumar. Probabilistic fasttext for multi-sense word embed- dings.ACL, 2018. 2

work page 2018
[6]

Qwen2.5-VL Technical Report

Shuai Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Sibo Song, Kai Dang, Peng Wang, Shijie Wang, Jun Tang, Humen Zhong, Yuanzhi Zhu, Mingkun Yang, Zhao- hai Li, Jianqiang Wan, Pengfei Wang, Wei Ding, Zheren Fu, Yiheng Xu, Jiabo Ye, Xi Zhang, Tianbao Xie, Zesen Cheng, Hang Zhang, Zhibo Yang, Haiyang Xu, and Jun- yang Lin. Qwen2.5-vl technical repor...

work page internal anchor Pith review Pith/arXiv arXiv 2025
[7]

Nudenet: Neural nets for nudity de- tection and censoring.https://nudenet.notai

Praneeth Bedapudi. Nudenet: Neural nets for nudity de- tection and censoring.https://nudenet.notai. tech/, 2022. 2

work page 2022
[8]

Large image datasets: A pyrrhic win for computer vision?WACV, 2021

Abeba Birhane and Vinay Uday Prabhu. Large image datasets: A pyrrhic win for computer vision?WACV, 2021. 2

work page 2021
[9]

Erasing undesir- able concepts in diffusion models with adversarial preserva- tion.NeurIPS, 2024

Anh Bui, Long Vuong, Khanh Doan, Trung Le, Paul Mon- tague, Tamas Abraham, and Dinh Phung. Erasing undesir- able concepts in diffusion models with adversarial preserva- tion.NeurIPS, 2024. 1

work page 2024
[10]

Fantastic targets for concept erasure in diffusion models and where to find them.ICLR, 2025

Anh Tuan Bui, Thuy-Trang Vu, Long Tung Vuong, Trung Le, Paul Montague, Tamas Abraham, Junae Kim, and Dinh Phung. Fantastic targets for concept erasure in diffusion models and where to find them.ICLR, 2025. 3, 12

work page 2025
[11]

Pixart-σ: Weak-to-strong training of dif- fusion transformer for 4k text-to-image generation.ECCV,

Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, and Zhenguo Li. Pixart-σ: Weak-to-strong training of dif- fusion transformer for 4k text-to-image generation.ECCV,

work page
[12]

Word2vec.Natural Language Engi- neering, 2017

Kenneth Ward Church. Word2vec.Natural Language Engi- neering, 2017. 15

work page 2017
[13]

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blis- tein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities.arXiv preprint arXiv:2507.06261, 2025. 1, 2

work page internal anchor Pith review Pith/arXiv arXiv 2025
[14]

Language modeling with gated convolutional net- works.ICML, 2017

Yann N Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional net- works.ICML, 2017. 4

work page 2017
[15]

Measuring and mitigating unintended bias in text classification.AAAI/ACM on AI, Ethics, and Society,

Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. Measuring and mitigating unintended bias in text classification.AAAI/ACM on AI, Ethics, and Society,

work page
[16]

Scaling recti- fied flow transformers for high-resolution image synthesis

Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling recti- fied flow transformers for high-resolution image synthesis. ICML, 2024. 2, 5

work page 2024
[17]

Salun: Empowering machine un- learning via gradient-based weight saliency in both image classification and generation.ICLR, 2024

Chongyu Fan, Jiancheng Liu, Yihua Zhang, Dennis Wei, Eric Wong, and Sijia Liu. Salun: Empowering machine un- learning via gradient-based weight saliency in both image classification and generation.ICLR, 2024. 1

work page 2024
[18]

Erasing concepts from diffusion models.ICCV, 2023

Rohit Gandikota, Joanna Materzynska, Jaden Fiotto- Kaufman, and David Bau. Erasing concepts from diffusion models.ICCV, 2023. 1, 2, 5, 6, 19

work page 2023
[19]

Unified concept editing in dif- fusion models.WACV, 2024

Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzy´nska, and David Bau. Unified concept editing in dif- fusion models.WACV, 2024. 1, 2, 5, 6, 15, 16, 19

work page 2024
[20]

Reliable and efficient concept erasure of text- to-image diffusion models.ECCV, 2024

Chao Gong, Kai Chen, Zhipeng Wei, Jingjing Chen, and Yu- Gang Jiang. Reliable and efficient concept erasure of text- to-image diffusion models.ECCV, 2024. 3, 19

work page 2024
[21]

Giphy celebrity detector.https://github

Nick Hasty, Ihor Kroosh, Dmitry V oitekh, and Dmytro Ko- rduban. Giphy celebrity detector.https://github. com/Giphy/celeb-detection-oss, 2024. 6, 15, 16

work page 2024
[22]

Selective amnesia: A continual learning approach to forgetting in deep generative models

Alvin Heng and Harold Soh. Selective amnesia: A continual learning approach to forgetting in deep generative models. NeurIPS, 2023. 1, 2

work page 2023
[23]

Clipscore: A reference-free evaluation met- ric for image captioning.EMNLP, 2021

Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi. Clipscore: A reference-free evaluation met- ric for image captioning.EMNLP, 2021. 5, 6, 16

work page 2021
[24]

Gans trained by a two time-scale update rule converge to a local nash equilib- rium.NeurIPS, 2017

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilib- rium.NeurIPS, 2017. 6, 16

work page 2017
[25]

Lora: Low-rank adaptation of large language models.ICLR, 2021

Edward J Hu, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. Lora: Low-rank adaptation of large language models.ICLR, 2021. 2

work page 2021
[26]

Token merging for training- free semantic binding in text-to-image synthesis.NeurIPS,

Taihang Hu, Linxuan Li, Joost van de Weijer, Hongcheng Gao, Fahad Shahbaz Khan, Jian Yang, Ming-Ming Cheng, Kai Wang, and Yaxing Wang. Token merging for training- free semantic binding in text-to-image synthesis.NeurIPS,

work page
[27]

Receler: Reli- able concept erasing of text-to-image diffusion models via lightweight erasers.ECCV, 2023

Chi-Pin Huang, Kai-Po Chang, Chung-Ting Tsai, Yung- Hsuan Lai, and Yu-Chiang Frank Wang. Receler: Reli- able concept erasing of text-to-image diffusion models via lightweight erasers.ECCV, 2023. 1, 4, 5

work page 2023
[28]

Robo-writers: the rise and risks of language-generating ai.Nature, 2021

Matthew Hutson. Robo-writers: the rise and risks of language-generating ai.Nature, 2021. 2

work page 2021
[29]

Mixtral of Experts

Albert Q Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Deven- dra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, et al. Mixtral of experts.arXiv preprint arXiv:2401.04088, 2024. 1, 4

work page internal anchor Pith review Pith/arXiv arXiv 2024
[30]

Overcoming catastrophic forgetting in neu- ral networks.Proceedings of the national academy of sci- ences, 2017

James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska- Barwinska, et al. Overcoming catastrophic forgetting in neu- ral networks.Proceedings of the national academy of sci- ences, 2017. 2

work page 2017
[31]

Ablating con- cepts in text-to-image diffusion models.ICCV, 2023

Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, and Jun-Yan Zhu. Ablating con- cepts in text-to-image diffusion models.ICCV, 2023. 1, 2

work page 2023
[32]

Nsfw detection machine learning model

Gant Laborde. Nsfw detection machine learning model. https : / / github . com / GantMan / nsfw _ model,

work page
[33]

Flux.1-dev.https : / / huggingface

Black Forest Labs. Flux.1-dev.https : / / huggingface . co / black - forest - labs / FLUX.1-dev, 2024. 1, 2

work page 2024
[34]

Online continual learning on hierarchical label expansion.ICCV, 2023

Byung Hyun Lee, Okchul Jung, Jonghyun Choi, and Se Young Chun. Online continual learning on hierarchical label expansion.ICCV, 2023. 2

work page 2023
[35]

Dou- bly perturbed task free continual learning.AAAI, 2024

Byung Hyun Lee, Min-hwan Oh, and Se Young Chun. Dou- bly perturbed task free continual learning.AAAI, 2024. 2

work page 2024
[36]

Lo- calized concept erasure for text-to-image diffusion models using training-free gated low-rank adaptation.CVPR, 2025

Byung Hyun Lee, Sungjin Lim, and Se Young Chun. Lo- calized concept erasure for text-to-image diffusion models using training-free gated low-rank adaptation.CVPR, 2025. 1, 2, 3, 4, 5

work page 2025
[37]

Concept pinpoint eraser for text- to-image diffusion models via residual attention gate.ICLR,

Byung Hyun Lee, Sungjin Lim, Seunggyu Lee, Dong Un Kang, and Se Young Chun. Concept pinpoint eraser for text- to-image diffusion models via residual attention gate.ICLR,

work page
[38]

1, 2, 3, 4, 5, 6, 15, 16, 19

work page
[39]

Gshard: Scaling giant models with conditional computation and automatic sharding.ICLR,

Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. Gshard: Scaling giant models with conditional computation and automatic sharding.ICLR,

work page
[40]

Speed: Scalable, precise, and efficient concept erasure for diffusion models.arXiv preprint arXiv:2503.07392,

Ouxiang Li, Yuan Wang, Xinting Hu, Houcheng Jiang, Tao Liang, Yanbin Hao, Guojun Ma, and Fuli Feng. Speed: Scal- able, precise, and efficient concept erasure for diffusion mod- els.arXiv preprint arXiv:2503.07392, 2025. 2, 4, 5, 6, 17

work page arXiv 2025
[41]

Safetydpo: Scalable safety alignment for text-to-image generation,

Runtao Liu, Chen I Chieh, Jindong Gu, Jipeng Zhang, Renjie Pi, Qifeng Chen, Philip Torr, Ashkan Khakzar, and Fabio Pizzati. Safetydpo: Scalable safety alignment for text-to- image generation.arXiv preprint arXiv:2412.10493, 2024. 1

[42] Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, and Adams Wai-Kin Kong. Mace: Mass concept erasure in diffusion models. CVPR, 2024. 1, 2, 3, 4, 5, 6, 15, 16, 19

[43] Mengyao Lyu, Yuhong Yang, Haiwen Hong, Hui Chen, Xuan Jin, Yuan He, Hui Xue, Jungong Han, and Guiguang Ding. One-dimensional adapter to rule them all: Concepts, diffusion models and erasing applications. CVPR, 2024. 1, 2

[44] Saemi Moon, Minjong Lee, Sangdon Park, and Dongwoo Kim. Holistic unlearning benchmark: A multi-faceted evaluation for text-to-image diffusion model unlearning. ICCV,

[45] Alexander Quinn Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob Mcgrew, Ilya Sutskever, and Mark Chen. Glide: Towards photorealistic image generation and editing with text-guided diffusion models. ICML, 2022. 1

[46] Ryan O’connor. Stable diffusion 1 vs 2 - what you need to know. https://www.assemblyai.com/blog/stable-diffusion-1-vs-2-what-you-need-to-know/, 2022. 2

[47] OpenAI. Dall-e 2 preview - risks and limitations, 2022. 1, 2

[48] Hadas Orgad, Bahjat Kawar, and Yonatan Belinkov. Editing implicit assumptions in text-to-image diffusion models. ICCV, 2023. 2

[49] William Peebles and Saining Xie. Scalable diffusion models with transformers. ICCV, 2023. 14

[50] David Peel and Geoffrey J McLachlan. Robust mixture modelling using the t distribution. Statistics and computing,

[51] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. ICML, 2021. 16

[52] Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125,

[53] Javier Rando, Daniel Paleka, David Lindner, Lennart Heim, and Florian Tramer. Red-teaming the stable diffusion safety filter. NeurIPS ML Safety Workshop, 2022. 1, 2

[54] Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby. Scaling vision with sparse mixture of experts. NeurIPS, 2021. 1, 4

[55] David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, and Gregory Wayne. Experience replay for continual learning. NeurIPS, 2019. 2

[56] Robin Rombach. Stable diffusion 2.0 release. https://stability.ai/news/stable-diffusion-v2-release, 2022. 2, 19

[57] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. CVPR, 2022. 1, 2, 5

[58] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. Stable diffusion v1 model card. https://huggingface.co/CompVis/stable-diffusion-v1-4, 2022. 2, 6, 8, 12, 13, 14, 19

[59] Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, et al. Photorealistic text-to-image diffusion models with deep language understanding. NeurIPS, 2022. 1

[60] Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models. CVPR, 2023. 1, 2, 6

[61] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. Laion-5b: An open large-scale dataset for training next generation image-text models. NeurIPS, 2022. 2

[62] Hoigi Seo, Junseo Bang, Haechang Lee, Joohoon Lee, Byung Hyun Lee, and Se Young Chun. Geometrical properties of text token embeddings for strong semantic binding in text-to-image generation, 2025. 13

[63] Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. ICLR, 2017. 1

[64] Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Diffusion art or digital forgery? investigating data replication in diffusion models. CVPR,

[65] Koushik Srivatsan, Fahad Shamshad, Muzammal Naseer, Vishal M Patel, and Karthik Nandakumar. Stereo: A two-stage framework for adversarially robust concept erasing from text-to-image diffusion models. CVPR, 2025. 2, 3

[66] JD Sutherland, Michael Arbel, and Arthur Gretton. Demystifying mmd gans. ICLR, 2018. 6, 16

[67] Yu-Lin Tsai, Chia-yi Hsu, Chulin Xie, Chih-hsun Lin, Jia You Chen, Bo Li, Pin-Yu Chen, Chia-Mu Yu, and Chun-ying Huang. Ring-a-bell! how reliable are concept removal methods for diffusion models? ICLR, 2024. 6, 18, 19

[68] Luke Vilnis and Andrew McCallum. Word representations via gaussian embedding. ICLR, 2015. 2

[69] Jaehong Yoon, Shoubin Yu, Vaidehi Patil, Huaxiu Yao, and Mohit Bansal. Safree: Training-free and adaptive guard for safe text-to-image and video generation. ICLR, 2025. 2, 5, 6, 16

[70] Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, and Humphrey Shi. Forget-me-not: Learning to forget in text-to-image diffusion models. arXiv preprint arXiv:2303.17591,

[71] Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR, 2018. 5

[72] Yimeng Zhang, Jinghan Jia, Xin Chen, Aochuan Chen, Yihua Zhang, Jiancheng Liu, Ke Ding, and Sijia Liu. To generate or not? safety-driven unlearned diffusion models are still easy to generate unsafe images... for now. arXiv preprint arXiv:2310.11868, 2023. 18, 19

[73] Yimeng Zhang, Xin Chen, Jinghan Jia, Yihua Zhang, Chongyu Fan, Jiancheng Liu, Mingyi Hong, Ke Ding, and Sijia Liu. Defensive unlearning with adversarial training for robust concept erasure in diffusion models. NeurIPS, 2024. 1, 6

Erasing Thousands of Concepts: Towards Scalable and Practical Concept Erasure for Text-to-Image Diffusion Model Supplementary Ma...

