Higher-variance classes are learned first in diffusion models; strong class imbalance reverses the order and imposes distinct delayed learning times on minority classes.
arXiv preprint arXiv:2410.08727 , year=
5 Pith papers cite this work. Polarity classification is still indexing.
years
2026 5verdicts
UNVERDICTED 5representative citing papers
Uniform-based discrete diffusion models behave as associative memories that retrieve unseen data, with a dataset-size-driven memorization-to-generalization transition detectable via conditional entropy of token predictions.
Discrete diffusion models on Ising-like data exhibit analytically predictable speciation and collapse transitions in backward dynamics via high-temperature expansion and Random Energy Model condensation, with scaling matching continuous cases when noise varies with time.
Diffusion models overfit denoising loss at intermediate noise but generalize in inference as model error smooths the flow field and sampling paths avoid memorized noisy training data.
Reusing source latent spaces in diffusion models under distribution shift produces target score error set by principal-angle misalignment and diffusion-time-amplified ambient noise.
citing papers explorer
-
The Interplay of Data Structure and Imbalance in the Learning Dynamics of Diffusion Models
Higher-variance classes are learned first in diffusion models; strong class imbalance reverses the order and imposes distinct delayed learning times on minority classes.
-
Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data
Uniform-based discrete diffusion models behave as associative memories that retrieve unseen data, with a dataset-size-driven memorization-to-generalization transition detectable via conditional entropy of token predictions.
-
Dynamical Regimes of Discrete Diffusion Models
Discrete diffusion models on Ising-like data exhibit analytically predictable speciation and collapse transitions in backward dynamics via high-temperature expansion and Random Energy Model condensation, with scaling matching continuous cases when noise varies with time.
-
Diffusion Models Memorize in Training -- and Generalize in Inference
Diffusion models overfit denoising loss at intermediate noise but generalize in inference as model error smooths the flow field and sampling paths avoid memorized noisy training data.
-
On the Limits of Latent Reuse in Diffusion Models
Reusing source latent spaces in diffusion models under distribution shift produces target score error set by principal-angle misalignment and diffusion-time-amplified ambient noise.