On the edge of memorization in diffusion models

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

fields

cs.LG 4

years

2026 4

verdicts

UNVERDICTED 4

polarities

unclear 1

representative citing papers

Grokking of Diffusion Models: Case Study on Modular Addition

cs.LG · 2026-04-20 · unverdicted · novelty 7.0

Diffusion models show grokking on modular addition by composing periodic operand representations in simple data regimes or by separating arithmetic computation from visual denoising across timesteps in varied regimes.

Diffusion Processes on Implicit Manifolds

cs.LG · 2026-04-08 · unverdicted · novelty 7.0 · 2 refs

Defines diffusion processes on implicit data manifolds via proximity-graph approximations to the infinitesimal generator and carré-du-champ operator, proves convergence in law to the continuous manifold process, and provides an Euler-Maruyama integrator validated on synthetic and MNIST manifolds.

citing papers explorer

Showing 4 of 4 citing papers.

  • Grokking of Diffusion Models: Case Study on Modular Addition cs.LG · 2026-04-20 · unverdicted · none · ref 2

    Diffusion models show grokking on modular addition by composing periodic operand representations in simple data regimes or by separating arithmetic computation from visual denoising across timesteps in varied regimes.

  • Diffusion Processes on Implicit Manifolds cs.LG · 2026-04-08 · unverdicted · none · ref 15 · 2 links

    Defines diffusion processes on implicit data manifolds via proximity-graph approximations to the infinitesimal generator and carré-du-champ operator, proves convergence in law to the continuous manifold process, and provides an Euler-Maruyama integrator validated on synthetic and MNIST manifolds.

  • Diffusion Models Memorize in Training -- and Generalize in Inference cs.LG · 2026-03-12 · unverdicted · none · ref 8

    Diffusion models overfit denoising loss at intermediate noise but generalize in inference as model error smooths the flow field and sampling paths avoid memorized noisy training data.

  • A dynamical systems view of training generative models and the memorization phenomenon cs.LG · 2026-05-19 · unverdicted · none · ref 20

    A dynamical systems analysis of constant-step SGD explains memorization in generative models by combining two-time-scale dynamics with a collapse model.