Learning Mixtures of Gaussians Using the DDPM Objective

Adam Klivans; Kulin Shah; Sitan Chen

arxiv: 2307.01178 · v1 · pith:KN4DGT4Vnew · submitted 2023-07-03 · 💻 cs.DS · cs.LG · stat.ML

classification 💻 cs.DS · cs.LG · stat.ML

keywords descent · distribution · gaussians · gradient · mixtures · centers · ddpm · diffusion

Recent works have shown that diffusion models can learn essentially any distribution provided one can perform score estimation. Yet it remains poorly understood under what settings score estimation is possible, let alone when practical gradient-based algorithms for this task can provably succeed. In this work, we give the first provably efficient results along these lines for one of the most fundamental distribution families, Gaussian mixture models. We prove that gradient descent on the denoising diffusion probabilistic model (DDPM) objective can efficiently recover the ground truth parameters of the mixture model in the following two settings: 1) We show gradient descent with random initialization learns mixtures of two spherical Gaussians in $d$ dimensions with $1/\text{poly}(d)$-separated centers. 2) We show gradient descent with a warm start learns mixtures of $K$ spherical Gaussians with $\Omega(\sqrt{\log(\min(K,d))})$-separated centers. A key ingredient in our proofs is a new connection between score-based methods and two other approaches to distribution learning, the EM algorithm and spectral methods.
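
As a minimal sketch of the first setting, the NumPy example below runs full-batch gradient descent on an empirical DDPM (denoising score matching) objective for the symmetric mixture $\frac{1}{2}N(\mu, I) + \frac{1}{2}N(-\mu, I)$. Several choices are assumptions made for illustration and are not stated in the abstract: the symmetric two-component form, the Ornstein-Uhlenbeck forward process $x_t = e^{-t}x_0 + \sqrt{1 - e^{-2t}}\,z$, a single fixed noise level `t`, the score parametrization $\tanh(\theta_t^\top x)\,\theta_t - x$ with $\theta_t = e^{-t}\theta$ (the functional form the true score takes for this mixture), and the hypothetical names `ddpm_loss_and_grad` and `run_demo`. The separation, step size, and sample count are picked for a quick demo and do not reflect the $1/\text{poly}(d)$ regime.

```python
import numpy as np


def ddpm_loss_and_grad(theta, x_t, z, a, sigma):
    """Empirical DDPM loss and its gradient with respect to theta.

    Assumed model: s_theta(x) = tanh(u.x) u - x, with u = a * theta.
    Loss: mean over samples of || s_theta(x_t) + z / sigma ||^2.
    """
    u = a * theta
    th = np.tanh(x_t @ u)                    # shape (n,)
    score = th[:, None] * u - x_t            # shape (n, d)
    resid = score + z / sigma                # residual against the noise target
    loss = np.mean(np.sum(resid ** 2, axis=1))

    # Chain rule: d s / d u = tanh(u.x) I + (1 - tanh^2(u.x)) u x^T,
    # so (d s / d u)^T r = tanh(u.x) r + (1 - tanh^2(u.x)) (u.r) x.
    ur = resid @ u                           # shape (n,)
    per_sample = th[:, None] * resid + ((1.0 - th ** 2) * ur)[:, None] * x_t
    grad_u = 2.0 * per_sample.mean(axis=0)
    return loss, a * grad_u                  # d u / d theta = a * I


def run_demo(seed=0, d=5, n=20000, t=0.5, lr=0.05, steps=2000):
    """Gradient descent from a random start; returns (theta_hat, mu)."""
    rng = np.random.default_rng(seed)

    # Ground-truth center (illustrative value, not from the paper).
    mu = np.zeros(d)
    mu[0] = 2.0

    # Samples from 0.5 N(mu, I) + 0.5 N(-mu, I).
    signs = rng.choice([-1.0, 1.0], size=n)
    x0 = signs[:, None] * mu + rng.standard_normal((n, d))

    # Forward (noising) process at a single fixed time t.
    a = np.exp(-t)
    sigma = np.sqrt(1.0 - np.exp(-2.0 * t))
    z = rng.standard_normal((n, d))
    x_t = a * x0 + sigma * z

    # Random unit-norm initialization.
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)

    for _ in range(steps):
        _, grad = ddpm_loss_and_grad(theta, x_t, z, a, sigma)
        theta -= lr * grad
    return theta, mu


if __name__ == "__main__":
    theta_hat, mu = run_demo()
    err = min(np.linalg.norm(theta_hat - mu), np.linalg.norm(theta_hat + mu))
    print("recovered theta:", np.round(theta_hat, 3))
    print("error up to sign:", err)
```

Because the mixture is symmetric, the center is identifiable only up to sign, so the recovered vector is compared against both $\mu$ and $-\mu$.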

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A theory of learning data statistics in diffusion models, from easy to hard
stat.ML 2026-03 unverdicted novelty 6.0

Diffusion models exhibit a distributional simplicity bias, learning pairwise input statistics at linear sample complexity while fourth-order cumulants require cubic complexity unless sharing correlated latent structure.

Statistical Properties of Training & Generalization
stat.ML 2026-06 unverdicted novelty 2.0

Neural scaling laws in deep learning interact with physics constraints and inductive biases beyond classical statistics.

Statistical Properties of Training & Generalization
stat.ML 2026-06 unverdicted novelty 1.0

Review of neural scaling laws and their relation to constraints and inductive biases when applying machine learning to physics problems.