Higher-variance classes are learned first in diffusion models; strong class imbalance reverses the order and imposes distinct delayed learning times on minority classes.
arXiv preprint arXiv:2505.16959 (2025)
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
method 1polarities
use method 1representative citing papers
Linear generative models memorize at small data loads but converge continuously once samples scale linearly with dimension; this convergence is insensitive to sharp recovery of principal latent factors.
diffGHOST is a conditional diffusion model that segments learned latent space to identify and mitigate memorization of critical trajectory samples, aiming to deliver privacy guarantees alongside data utility.
Diffusion models require new generalization frameworks because memorization and novel generation are incompatible, so research should focus on what models learn before memorization begins.
citing papers explorer
-
The Interplay of Data Structure and Imbalance in the Learning Dynamics of Diffusion Models
Higher-variance classes are learned first in diffusion models; strong class imbalance reverses the order and imposes distinct delayed learning times on minority classes.
-
Memorisation, convergence and generalisation in generative models
Linear generative models memorize at small data loads but converge continuously once samples scale linearly with dimension; this convergence is insensitive to sharp recovery of principal latent factors.
-
diffGHOST: Diffusion based Generative Hedged Oblivious Synthetic Trajectories
diffGHOST is a conditional diffusion model that segments learned latent space to identify and mitigate memorization of critical trajectory samples, aiming to deliver privacy guarantees alongside data utility.
-
Understanding diffusion models requires rethinking (again) generalization
Diffusion models require new generalization frameworks because memorization and novel generation are incompatible, so research should focus on what models learn before memorization begins.