Higher-variance classes are learned first in diffusion models; strong class imbalance reverses the order and imposes distinct delayed learning times on minority classes.
arXiv preprint arXiv:2410.08727
5 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (5)
Verdicts: UNVERDICTED (5)
Representative citing papers
Uniform-based discrete diffusion models behave as associative memories that retrieve unseen data, with a dataset-size-driven memorization-to-generalization transition detectable via conditional entropy of token predictions.
Discrete diffusion models on Ising-like data exhibit analytically predictable speciation and collapse transitions in backward dynamics via high-temperature expansion and Random Energy Model condensation, with scaling matching continuous cases when noise varies with time.
Diffusion models overfit the denoising loss at intermediate noise levels but still generalize at inference, as model error smooths the flow field and sampling paths avoid memorized noisy training data.
Reusing source latent spaces in diffusion models under distribution shift produces target score error set by principal-angle misalignment and diffusion-time-amplified ambient noise.
Citing papers explorer
- Dynamical Regimes of Discrete Diffusion Models
Discrete diffusion models on Ising-like data exhibit analytically predictable speciation and collapse transitions in backward dynamics via high-temperature expansion and Random Energy Model condensation, with scaling matching continuous cases when noise varies with time.