pith. sign in

arxiv: 2605.30825 · v1 · pith:ASLANSVTnew · submitted 2026-05-29 · 💻 cs.LG · cs.AI· math.OC· stat.ML

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Pith reviewed 2026-06-28 23:37 UTC · model grok-4.3

classification 💻 cs.LG cs.AImath.OCstat.ML
keywords unlearningdiffusion modelsKL divergenceconstrained optimizationstrong dualityconcept unlearningdata unlearninglikelihood constraints
0
0 comments X

The pith

A KL-constrained optimization framework for diffusion model unlearning admits strong duality and explicit optimal targets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formulates unlearning in diffusion models as three constrained optimization problems that minimize deviation from a pretrained model while enforcing separation from unwanted data or concepts via reverse KL, forward KL, and likelihood constraints. The first two generalize prior work on concept and data removal; the likelihood version supplies a new natural formulation. Despite nonconvex constraints, strong duality is shown to hold, which yields explicit characterizations of the optimal unlearning solutions and permits primal-dual algorithms. Experiments indicate that the KL-based versions produce better retention-unlearning tradeoffs than weight-based baselines, while the likelihood version matches removal strength yet preserves retained concepts more effectively.

Core claim

Unlearning is cast as minimizing deviation from the pretrained diffusion model subject to explicit separation constraints from the unlearning distributions. Three concrete problems are defined using reverse KL divergence, forward KL divergence, and likelihood constraints. Strong duality holds for all three despite nonconvexity, so the optimal solutions can be characterized explicitly as unlearning targets and primal-dual algorithms can be derived for each case.

What carries the argument

Three constrained optimization problems that minimize deviation from the pretrained model subject to KL or likelihood separation constraints from the unlearning distributions.

If this is right

  • Optimal unlearning targets can be written in closed form from the dual solutions.
  • Primal-dual algorithms become available for each of the three formulations.
  • KL-constrained versions achieve superior retention-unlearning tradeoffs relative to weight-based baselines.
  • The likelihood-constrained version matches removal effectiveness while preserving retained concepts better than baselines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same duality technique could be tested on other score-based or flow-based generative models to check whether explicit targets remain available.
  • When exact samples from the unlearning distribution are unavailable, one could substitute empirical approximations and measure the resulting degradation in the achieved tradeoff.
  • The explicit targets supply a benchmark that future heuristic unlearning methods can be compared against directly.

Load-bearing premise

The unlearning distributions are known exactly and can be sampled from during optimization.

What would settle it

A concrete instance in which the duality gap remains strictly positive for any of the three nonconvex problems, or in which the derived primal-dual algorithms fail to reach the claimed retention-unlearning tradeoffs.

Figures

Figures reproduced from arXiv: 2605.30825 by Alejandro Ribeiro, Dongsheng Ding, Shervin Khalafi.

Figure 1
Figure 1. Figure 1: Heat maps of a three-Gaussian mixture before/after unlearning one mode (Right up). From left to right: pretrained model, reverse KL-constrained unlearning, forward KL-constrained unlearning, and likelihood-constrained unlearning. to unlearning at the data level. Finally, we formulate a distribution optimization problem (LU) that constrains the expected likelihood of the concepts to be unlearned to remain b… view at source ↗
Figure 2
Figure 2. Figure 2 [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Retention–unlearning tradeoff for forward KL￾constrained unlearning and unconstrained baseline. Our con￾strained model deviates less from the pretrained model at the same level of unlearning (max SSCD). ditional DDPM trained on the CelebA-HQ dataset as our pretrained model (Ho et al., 2020), and randomly select three images from the dataset to unlearn. As shown in [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Performance of likelihood-constrained unlearning on a three-Gaussian mixture. From left to right: pretrained model, reverse￾KL–constrained unlearning, likelihood-constrained unlearning, and the KL divergence to the retained model versus the likelihood of the unlearning mode [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Performance of likelihood-constrained unlearning on a text-to-image model. From left to right: images generated by likelihood￾constrained unlearning and baseline for unlearning the concept of ‘horse’ (Left), for retaining the concept of ‘cowboy’ (Middle left), the retention KID (Middle right) and the KL divergence (Right) to the retained model versus the likelihood of unlearning concept. 4.3. Reverse KL-Co… view at source ↗
Figure 6
Figure 6. Figure 6: Retention–unlearning tradeoff for reverse KL￾constrained unlearning and unconstrained baseline. The KL diver￾gence to the pretrained model versus the KL constraint violation (Left) and the unlearning concept CLIP score (Right). characterize optimal unlearning targets and develop pri￾mal–dual algorithms to approximate them. Empirically, we demonstrate our constrained unlearning approach on image generation … view at source ↗
read the original abstract

Unlearning in diffusion models aims to remove undesirable data or concepts while preserving the utility of pretrained models -- two fundamentally conflicting objectives. We propose a principled constrained optimization framework that formulates unlearning as minimizing the deviation from a pretrained model, subject to explicit separation constraints from the unlearning distributions. Specifically, we formulate three constrained optimization problems based on reverse and forward KL divergences, and likelihood constraints. The first two generalize existing approaches for concept and data unlearning, while the third offers a novel and natural formulation for unlearning. Despite the nonconvexity of the KL constraints, we establish strong duality for all three problems, enabling us to explicitly characterize their optimal solutions as unlearning targets and develop primal-dual algorithms for each formulation. Experimental results demonstrate that our KL-constrained approach achieves superior retention-unlearning tradeoffs compared to weight-based baselines for concept and data unlearning, and that our likelihood-based approach matches unlearning effectiveness while better preserving retained concepts compared to baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes a constrained optimization framework for unlearning in diffusion models, formulating three problems (reverse-KL, forward-KL, and likelihood) that minimize deviation from a pretrained model subject to separation constraints from unlearning distributions. It asserts that strong duality holds for these nonconvex programs despite the nonconvex KL constraints, enabling explicit characterization of optimal unlearning targets and primal-dual algorithms; experiments are claimed to show superior retention-unlearning tradeoffs versus weight-based baselines.

Significance. If the strong-duality claim is rigorously established with verifiable constraint qualifications and the empirical results are reproducible with error bars, the framework would unify existing unlearning methods under a principled optimization lens and supply explicit targets plus algorithms that could improve practical tradeoffs in diffusion-model safety.

major comments (1)
  1. [Abstract] Abstract: the central claim that 'strong duality' holds for the three nonconvex KL-constrained programs (enabling closed-form optimal targets and primal-dual algorithms) is load-bearing, yet the manuscript supplies no derivation, no statement of the constraint qualification invoked (Slater, MFCQ, or LICQ), and no check that it is satisfied on the manifold of diffusion score functions; without this the duality gap may be positive and the claimed solutions cease to be optimal.
minor comments (1)
  1. The experimental section reports no error bars, no description of how nonconvexity was handled during optimization, and no details on sampling from the unlearning distributions; these omissions weaken the empirical tradeoff claims but are not load-bearing for the duality result.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the detailed review and for highlighting the need for a rigorous treatment of the strong duality result. We agree that this is a central claim and that the manuscript as submitted does not supply a complete derivation or constraint-qualification verification. We address this point below and will revise accordingly.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that 'strong duality' holds for the three nonconvex KL-constrained programs (enabling closed-form optimal targets and primal-dual algorithms) is load-bearing, yet the manuscript supplies no derivation, no statement of the constraint qualification invoked (Slater, MFCQ, or LICQ), and no check that it is satisfied on the manifold of diffusion score functions; without this the duality gap may be positive and the claimed solutions cease to be optimal.

    Authors: We acknowledge that the submitted manuscript states the strong-duality result without providing the full derivation, the invoked constraint qualification, or an explicit verification on the score-function manifold. In the revision we will add a dedicated appendix that (i) states the precise constraint qualification (Slater’s condition, which is satisfied under the boundedness assumptions we already place on the score functions), (ii) supplies the complete duality proof for each of the three programs, and (iii) verifies that the qualification holds for the diffusion-model parameterizations used in the experiments. These additions will confirm that the duality gap is zero and that the derived unlearning targets are optimal. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation applies standard duality to external distributions

full rationale

The paper formulates three constrained optimization problems using external unlearning distributions as inputs and claims to establish strong duality despite nonconvexity, yielding explicit optimal targets and primal-dual algorithms. No equations reduce the claimed optima to quantities fitted from the same data, no self-citation chain bears the load for the duality result, and no ansatz or renaming is smuggled in. The framework remains self-contained against external benchmarks with the unlearning distributions treated as given.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

Framework rests on standard constrained optimization duality plus the modeling choice that unlearning targets can be expressed as explicit distributional constraints; no new physical entities or ad-hoc constants are introduced.

free parameters (1)
  • Lagrange multipliers (dual variables)
    Introduced to enforce the separation constraints in the primal-dual algorithms; their values are determined during optimization rather than preset.
axioms (2)
  • domain assumption Strong duality holds for the three nonconvex constrained problems
    Invoked in the abstract to characterize optimal unlearning targets despite nonconvex KL constraints.
  • domain assumption Unlearning distributions are known and samplable
    Required to define the explicit separation constraints in all three formulations.

pith-pipeline@v0.9.1-grok · 5707 in / 1412 out tokens · 18949 ms · 2026-06-28T23:37:57.050834+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

29 extracted references · 19 canonical work pages · 9 internal anchors

  1. [1]

    Data unlearning in diffusion models.arXiv preprint arXiv:2503.01034,

    Alberti, S., Hasanaliyev, K., Shah, M., and Ermon, S. Data unlearning in diffusion models.arXiv preprint arXiv:2503.01034,

  2. [2]

    URL https:// arxiv.org/abs/1801.01401. Boyd, S. and Vandenberghe, L.Convex optimization. Cam- bridge university press,

  3. [3]

    Score forgetting distilla- tion: A swift, data-free method for machine unlearning in diffusion models.arXiv preprint arXiv:2409.11219,

    Chen, T., Zhang, S., and Zhou, M. Score forgetting distilla- tion: A swift, data-free method for machine unlearning in diffusion models.arXiv preprint arXiv:2409.11219,

  4. [4]

    K., Seamann, A., Cui, J., Khare, S., and Fioretto, F

    Christopher, J. K., Seamann, A., Cui, J., Khare, S., and Fioretto, F. Constrained diffusion for protein de- sign with hard structural constraints.arXiv preprint arXiv:2510.14989,

  5. [5]

    General Data Protection Regulation (GDPR): Regulation (EU) 2016/679

    European Union. General Data Protection Regulation (GDPR): Regulation (EU) 2016/679. https:// gdpr-info.eu,

  6. [6]

    SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

    Fan, C., Liu, J., Zhang, Y ., Wong, E., Wei, D., and Liu, S. Salun: Empowering machine unlearning via gradient- based weight saliency in both image classification and generation.arXiv preprint arXiv:2310.12508, 2023a. Fan, Y ., Watkins, O., Du, Y ., Liu, H., Ryu, M., Boutilier, C., Abbeel, P., Ghavamzadeh, M., Lee, K., and Lee, K. Dpok: Reinforcement lear...

  7. [7]

    URL https:// arxiv.org/abs/1512.03385. Heng, A. and Soh, H. Selective amnesia: A continual learning approach to forgetting in deep generative models. 10 Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints Advances in Neural Information Processing Systems, 36: 17170–17194,

  8. [8]

    CLIPScore: A Reference-free Evaluation Metric for Image Captioning

    URL https://arxiv.org/ abs/2104.08718. Ho, J., Jain, A., and Abbeel, P. Denoising diffusion prob- abilistic models,

  9. [9]

    URL https://arxiv.org/ abs/2006.11239. Hu, E. J., Shen, Y ., Wallis, P., Allen-Zhu, Z., Li, Y ., Wang, S., Wang, L., and Chen, W. Lora: Low-rank adaptation of large language models,

  10. [10]

    LoRA: Low-Rank Adaptation of Large Language Models

    URL https://arxiv. org/abs/2106.09685. Khalafi, S., Ding, D., and Ribeiro, A. Constrained diffusion models via dual training.Advances in Neural Information Processing Systems, 37:26543–26576,

  11. [11]

    org/abs/2302.03792

    URL https://arxiv. org/abs/2302.03792. Koyejo, O. and Ghosh, J. A representation approach for rel- ative entropy minimization with expectation constraints. InICML WDDL workshop,

  12. [12]

    The Principles of Diffusion Models

    Lai, C.-H., Song, Y ., Kim, D., Mitsufuji, Y ., and Ermon, S. The principles of diffusion models.arXiv preprint arXiv:2510.21890,

  13. [13]

    Towards resilient safety-driven unlearning for diffusion models against downstream fine-tuning.arXiv preprint arXiv:2507.16302,

    Li, B., Gu, R., Wang, J., Qi, L., Li, Y ., Wang, R., Qin, Z., and Zhang, T. Towards resilient safety-driven unlearning for diffusion models against downstream fine-tuning.arXiv preprint arXiv:2507.16302,

  14. [14]

    K., Koenig, S., and Fioretto, F

    Liang, J., Christopher, J. K., Koenig, S., and Fioretto, F. Multi-agent path finding in continuous spaces with projected diffusion models.arXiv preprint arXiv:2412.17993,

  15. [15]

    K., Koenig, S., and Fioretto, F

    Liang, J., Christopher, J. K., Koenig, S., and Fioretto, F. Si- multaneous multi-robot motion planning with projected diffusion models.arXiv preprint arXiv:2502.03607,

  16. [16]

    Projected Coupled Diffusion for Test-Time Constrained Joint Generation

    Luan, H., Goh, Y . X., Ng, S.-K., and Ling, C. K. Pro- jected coupled diffusion for test-time constrained joint generation.arXiv preprint arXiv:2508.10531,

  17. [17]

    URL https://arxiv.org/abs/2208. 11970. Narasimhan, S. S., Agarwal, S., Rout, L., Shakkottai, S., and Chinchali, S. P. Constrained posterior sampling: Time series generation with hard constraints.arXiv preprint arXiv:2410.12652,

  18. [18]

    and Park, M

    Park, J. and Park, M. Data unlearning beyond uniform for- getting via diffusion time and frequency selection.arXiv preprint arXiv:2510.17917,

  19. [19]

    High-Resolution Image Synthesis with Latent Diffusion Models

    URL https://arxiv.org/ abs/2112.10752. Schuhmann, C., Beaumont, R., Vencu, R., Gordon, C., Wightman, R., Cherti, M., Coombes, T., Katta, A., Mullis, C., Wortsman, M., et al. Laion-5b: An open large-scale dataset for training next generation image-text models. Advances in neural information processing systems, 35: 25278–25294,

  20. [20]

    K., Oneto, L., Anguita, D., and Fioretto, F

    Zampini, S., Christopher, J. K., Oneto, L., Anguita, D., and Fioretto, F. Training-free constrained generation with sta- ble diffusion models.arXiv preprint arXiv:2502.05625,

  21. [21]

    Related Work Unlearning in diffusion models.Our constrained unlearning framework is related to prior work on unlearning in diffusion models

    12 Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints A. Related Work Unlearning in diffusion models.Our constrained unlearning framework is related to prior work on unlearning in diffusion models. In this setting, given a pretrained diffusion model, unlearning aims to preserve the ability to generate diverse...

  22. [22]

    Our constrained unlearning approach falls into the second category

    and (ii) soft constraints imposed in expectation (Khalafi et al., 2024; Chamon et al., 2024; Khalafi et al., 2025). Our constrained unlearning approach falls into the second category. In contrast to prior work (Khalafi et al., 2024; 2025), our constrained unlearning problem is nonconvex even in the distribution space, which constitutes the main challenge ...

  23. [23]

    Thus, they share the optimal solution p⋆ rev and the optimal value P ⋆ rev

    (12) We note that the key difference between Problem (RU) and Problem (12) is the explicit constraint Eµ [p(x) ] = 1 . Thus, they share the optimal solution p⋆ rev and the optimal value P ⋆ rev. We define the Lagrangian for Problem (12) as bLrev(p, λ, ρ) :=L rev(p, λ) +ρ(E µ [p(x) ]−1) . The associated dual function is bDrev(λ, ρ) = minimizep∈ P bLrev(p, ...

  24. [24]

    Second, we translate the strong duality for Problem (16) to the original problem (FU)

    First, we employ Lyapunov’s convexity theorem to show that the epigraph for Problem (16) is non-empty and convex, implying that (16) is strongly dual. Second, we translate the strong duality for Problem (16) to the original problem (FU). B.4. Proof of Corollary 2 Proof.For anyλ≥0, we rearrange the LagrangianL fw(p, λ)as follows, Lfw(p, λ) =D KL(q∥p) + mX ...

  25. [25]

    Second, we translate the strong duality for Problem (22) to the original problem (21)

    First, we employ Lyapunov’s convexity theorem to show that the epigraph for Problem (22) is non-empty and convex, implying that (22) is strongly dual. Second, we translate the strong duality for Problem (22) to the original problem (21). To prove strong duality for Problem (11), we introduce some notation as follows. Let ¯p⋆ 0:T,revl be a solution to Prob...

  26. [26]

    22 Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints D

    we know: ∇logp(x t) = √¯αtx0 −x t 1−¯αt (27) whereα t represents the noise schedule, allowing us to compute the integral. 22 Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints D. Algorithms Here we detail the primal-dual algorithms discussed in Section 3 of the main paper. D.1. Reverse KL-Constrained Unlearni...

  27. [27]

    added to the cross-attention layers in the diffusion U-Net following the esd-x approach from (Gandikota et al., 2023). For computing KID to the retain distribution, we construct a retain set by sampling images from the pretrained model and keeping those whose likelihood/CLIP score for the unlearn concept is lower than a certain threshold i.e., samples in ...

  28. [28]

    Forward KL-Constrained Unlearning For forward KL constrained unlearning we unlearn specific samples from a model pretrained on the CelebA-HQ dataset

    E.2. Forward KL-Constrained Unlearning For forward KL constrained unlearning we unlearn specific samples from a model pretrained on the CelebA-HQ dataset. Our implementation is based on modifying that of (Alberti et al., 2025), namely, SISS without importance sampling, to incorporate dual updates. For the unconstrained baseline, we use the same fixed mult...

  29. [29]

    Reverse KL-Constrained Unlearning Dual-Only Algorithm.For the reverse KL experiments we modify the algorithm slightly

    Category Hyperparameter Value General Dataset CelebA-HQ (256×256) Pretrained modelgoogle/ddpm-celebahq-256 Diffusion time stepsT1000 Training Epochs200 Train batch size2 Gradient accumulation steps16 Effective batch size32 Learning rate (ηp)5×10 −6 Dual learning rate (ηd)5×10 −2 Optimizer Adam Constraints Constraint type Forward KL divergence Threshold va...