Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine
Pith reviewed 2026-05-21 07:32 UTC · model grok-4.3
The pith
Diffusion models learn manifold-supported data via score-driven collapse and refinement, making sample complexity depend on intrinsic dimension.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The geometry of the score function itself produces a collapse-and-refine mechanism: at small noise scales its diverging singularity drives rapid dimensional collapse of the induced denoising map onto the data manifold projection, while at moderate noise scales training refines the intrinsic density on the learned manifold. This principle is realized as Score-induced Latent Diffusion (SiLD), a two-stage framework in which both manifold learning and density estimation emerge from one denoising score matching objective, and it is proved that the sample complexity depends on the intrinsic dimension rather than the ambient dimension.
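The collapse described above can be read off from Tweedie's formula, a standard identity for Gaussian noising that is not specific to this paper. The sketch below assumes additive noise $x_\sigma = x_0 + \sigma\varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$ and a smooth compact manifold $M$ whose nearest-point projection $\pi$ is well defined near $M$. The symbols $D_\sigma$, $p_\sigma$, and $\pi$ are notation chosen here for illustration, and the small-noise approximation is a heuristic rather than the paper's precise statement.

```latex
% Denoising map induced by the score (Tweedie's formula)
D_\sigma(x) \;=\; \mathbb{E}[x_0 \mid x_\sigma = x]
           \;=\; x + \sigma^2 \,\nabla_x \log p_\sigma(x)

% Heuristic small-noise behaviour for x near, but off, the manifold M:
% the score points back to M and its norm grows like dist(x, M) / sigma^2
\nabla_x \log p_\sigma(x) \;\approx\; -\,\frac{x - \pi(x)}{\sigma^2}
\quad\Longrightarrow\quad
D_\sigma(x) \;\to\; \pi(x) \quad \text{as } \sigma \to 0
```

Read this way, the diverging normal component of the score is what pulls the denoising map onto the manifold projection, while the tangential part of the score carries the information about the intrinsic density.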
What carries the argument
The collapse-and-refine mechanism driven by the diverging singularity of the score function at small noise scales, which forces dimensional collapse of the denoising map onto the manifold projection.
If this is right
- Sample complexity for learning the score scales with the intrinsic dimension of the data manifold instead of ambient dimension.
- Manifold learning and density estimation both arise from a single denoising score matching objective without heuristic KL regularization.
- SiLD matches or exceeds generation quality of VAE-based latent diffusion models while improving reconstruction accuracy.
- The mechanism is validated on Stacked MNIST, CelebA variants, and molecular generation tasks.
Where Pith is reading between the lines
- If the singularity in the score is absent or removed, dimensional collapse may fail and the method could lose its efficiency advantage in high ambient dimensions.
- The same collapse-and-refine logic may extend to explain diffusion model performance on other structured domains such as graphs or time series.
- Conditional versions of SiLD could inherit the intrinsic-dimension scaling for tasks like class-conditional or text-to-image generation.
- Direct measurement of effective dimension of the denoising map across noise scales on synthetic manifolds would provide an immediate test of the predicted collapse.
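The last point above can be tried on a toy problem. The following is a minimal sketch, not the paper's experiment: it assumes data spread uniformly on a unit circle embedded in a 50-dimensional space, and it uses the exact posterior-mean denoiser of that point cloud instead of a trained network. It relies on the standard identity that the Jacobian of the posterior mean equals the posterior covariance divided by $\sigma^2$. The function names, the query point, and the eigenvalue threshold of 0.05 are all hypothetical choices made for illustration.

```python
import numpy as np

def circle_data(n_points=2000, ambient_dim=50):
    """Points on a unit circle (intrinsic dimension 1) embedded in R^ambient_dim."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    data = np.zeros((n_points, ambient_dim))
    data[:, 0] = np.cos(theta)
    data[:, 1] = np.sin(theta)
    return data

def posterior_mean_and_jacobian(x, data, sigma):
    """Exact denoiser E[x0 | x] for the empirical data distribution, and its Jacobian.

    The Jacobian of the posterior mean is Cov[x0 | x] / sigma^2.
    """
    sq_dist = np.sum((data - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    weights = np.exp(logits - logits.max())  # stabilised softmax weights
    weights /= weights.sum()
    mean = weights @ data
    centered = data - mean
    cov = (centered * weights[:, None]).T @ centered
    return mean, cov / sigma ** 2

def effective_dim(jacobian, threshold=0.05):
    """Count Jacobian eigenvalues above an (arbitrary) threshold."""
    eigvals = np.linalg.eigvalsh(jacobian)
    return int(np.sum(eigvals > threshold))

data = circle_data()
x = np.zeros(data.shape[1])
x[0], x[2] = 1.1, 0.1  # query point slightly off the circle

for sigma in (1.0, 0.05):
    mean, jac = posterior_mean_and_jacobian(x, data, sigma)
    print(f"sigma={sigma}: effective dim={effective_dim(jac)}, "
          f"denoised[:3]={np.round(mean[:3], 3)}")
```

In this toy setting the data already lie in a two-dimensional plane, so the count can only drop from 2 to 1; it illustrates collapse onto the circle but says nothing about how a learned network behaves in the remaining ambient directions.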
Load-bearing premise
The data distribution is supported on a low-dimensional manifold and the score function exhibits a diverging singularity at small noise scales that induces dimensional collapse of the denoising map onto the manifold projection.
What would settle it
An experiment showing that the effective dimension of the learned denoising map remains close to ambient dimension at small noise scales, or that empirical sample complexity scales with ambient rather than intrinsic dimension on controlled low-intrinsic-dimensional data.
Original abstract
Diffusion models generate high-dimensional data with remarkable quality, yet how their training efficiently learns the score function, bypassing the curse of dimensionality when data is supported on low-dimensional manifolds, remains theoretically unexplained. We identify a collapse-and-refine mechanism driven by the geometry of the score function itself: at small noise scales, the diverging singularity of the score drives a rapid dimensional collapse of the induced denoising map onto the data manifold projection; at moderate noise scales, training refines the intrinsic density on the learned manifold. We instantiate this principle as Score-induced Latent Diffusion (SiLD), a two-stage framework in which both manifold learning and density estimation emerge from a single denoising score matching objective, replacing the heuristic KL regularization of VAE-based latent diffusion models. We prove that the resulting sample complexity depends on the intrinsic dimension rather than the ambient dimension. Experiments on Stacked MNIST, CelebA variants, and molecular generation benchmarks show that SiLD matches or outperforms VAE-based LDMs in generation quality and consistently improves reconstruction, validating our theoretical predictions.
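For readers unfamiliar with the objective the abstract refers to, here is a minimal sketch of denoising score matching at a single noise level, in the standard form where the regression target is the score of the Gaussian perturbation kernel. It is not the SiLD training code. The toy data (a standard Gaussian in three dimensions), the helper names, and the choice $\sigma = 0.5$ are assumptions made only so that the true score is known in closed form.

```python
import numpy as np

def dsm_loss(score_fn, x0, sigma, rng):
    """Monte Carlo estimate of the denoising score matching loss at noise level sigma."""
    eps = rng.standard_normal(x0.shape)
    x_noisy = x0 + sigma * eps
    target = -eps / sigma  # score of the Gaussian perturbation kernel at x_noisy
    residual = score_fn(x_noisy, sigma) - target
    return float(np.mean(np.sum(residual ** 2, axis=1)))

def true_score(x, sigma):
    """Score of N(0, (1 + sigma^2) I), the noised marginal of the toy data."""
    return -x / (1.0 + sigma ** 2)

def zero_score(x, sigma):
    """A deliberately wrong baseline that predicts zero everywhere."""
    return np.zeros_like(x)

# Toy data assumed for illustration: standard Gaussian samples in R^3.
x0 = np.random.default_rng(0).standard_normal((20000, 3))
sigma = 0.5

# Reuse the same noise seed so both models are scored on identical perturbations.
loss_true = dsm_loss(true_score, x0, sigma, np.random.default_rng(1))
loss_zero = dsm_loss(zero_score, x0, sigma, np.random.default_rng(1))
print(f"true score: {loss_true:.3f}, zero score: {loss_zero:.3f}")
```

The loss of the true score is not zero, because the target is a noisy version of the score, but it is the minimum over all score functions, which is why minimizing this loss recovers the score.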
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript identifies a collapse-and-refine mechanism in diffusion models under the manifold hypothesis: at small noise scales the diverging singularity of the score function drives dimensional collapse of the denoising map onto the manifold projection, while at moderate scales training refines the intrinsic density. This principle is instantiated as Score-induced Latent Diffusion (SiLD), a two-stage framework derived from a single denoising score matching objective that replaces heuristic KL regularization in VAE-based latent diffusion models. The authors prove that the resulting sample complexity depends on the intrinsic dimension rather than the ambient dimension, and report experiments on Stacked MNIST, CelebA variants, and molecular generation benchmarks showing that SiLD matches or outperforms VAE-based LDMs in generation quality while improving reconstruction.
Significance. If the central proof holds, the work supplies a geometric explanation for why diffusion models evade the curse of dimensionality on manifold-supported data and grounds the manifold hypothesis directly in score-function geometry. The SiLD construction is notable for deriving both manifold learning and density estimation from one objective without additional regularization terms or free parameters. The experimental results are presented as direct validation of the theoretical predictions, and the parameter-free character of the sample-complexity claim is a clear strength.
major comments (2)
- [Proof of sample-complexity result] Proof of the sample-complexity claim (the section deriving the end-to-end bound from the collapse-and-refine mechanism): the argument that the singularity-induced collapse propagates through score estimation to eliminate all ambient-dimension D dependence must be made explicit. Standard denoising-score-matching analyses bound empirical risk minimization over function classes whose covering numbers or Lipschitz constants scale with D; the manuscript needs to show, via the relevant generalization or optimization bound, that no residual D factor survives once the collapse onto the manifold projection is accounted for.
- [SiLD framework description] Definition of the SiLD framework and its relation to the single denoising objective: it is stated that both stages emerge from one score-matching loss, yet the separation into collapse (small-noise) and refine (moderate-noise) phases appears to rely on a noise schedule whose precise form could re-introduce D-dependent estimation rates if the function class remains defined in ambient space. The manuscript should clarify whether the schedule or the function-class restriction is chosen in a way that preserves the claimed D-independence.
minor comments (2)
- [Notation] Notation for the intrinsic dimension d versus ambient dimension D should be introduced once at the beginning and used consistently; occasional switches between capital and lower-case D in the theoretical sections reduce readability.
- [Experiments] The experimental section reports generation quality metrics but does not include an ablation that isolates the contribution of the collapse stage versus the refine stage; adding such a controlled comparison would strengthen the link between theory and experiments.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. The comments highlight important points for clarifying the propagation of the collapse mechanism in the sample-complexity proof and the precise role of the noise schedule in preserving dimension independence. We address each major comment below and have revised the manuscript to strengthen the exposition without altering the core claims or results.
Point-by-point responses
-
Referee: [Proof of sample-complexity result] Proof of the sample-complexity claim (the section deriving the end-to-end bound from the collapse-and-refine mechanism): the argument that the singularity-induced collapse propagates through score estimation to eliminate all ambient-dimension D dependence must be made explicit. Standard denoising-score-matching analyses bound empirical risk minimization over function classes whose covering numbers or Lipschitz constants scale with D; the manuscript needs to show, via the relevant generalization or optimization bound, that no residual D factor survives once the collapse onto the manifold projection is accounted for.
Authors: We agree that the propagation step merits a more explicit treatment. In the original manuscript, Lemma 3 establishes that the score singularity at small noise scales forces the denoising map to collapse onto the manifold projection, after which the effective function class for score estimation is supported only in a tubular neighborhood of the manifold. The end-to-end bound in Theorem 1 then invokes covering-number arguments on this restricted class, whose metric entropy scales with the intrinsic dimension d. To address the referee's concern directly, we have inserted a new paragraph immediately following the statement of Lemma 3 that explicitly traces how the collapse eliminates residual D factors in both the optimization error (via restricted Lipschitz constants) and the generalization error (via covering numbers of the projected function class). The revised proof now cites the relevant generalization bound from the score-matching literature and shows that the ambient dimension D appears only in transient terms that vanish once collapse occurs.
revision: yes
-
Referee: [SiLD framework description] Definition of the SiLD framework and its relation to the single denoising objective: it is stated that both stages emerge from one score-matching loss, yet the separation into collapse (small-noise) and refine (moderate-noise) phases appears to rely on a noise schedule whose precise form could re-introduce D-dependent estimation rates if the function class remains defined in ambient space. The manuscript should clarify whether the schedule or the function-class restriction is chosen in a way that preserves the claimed D-independence.
Authors: The SiLD construction is obtained by partitioning the single denoising score-matching objective across noise scales without introducing extra regularization or parameters. The noise schedule is selected so that the small-noise regime triggers the geometric collapse proven in Lemma 3, after which the subsequent moderate-noise regime operates on the already-collapsed manifold. Because the training dynamics themselves enforce the restriction to the manifold (rather than an a-priori ambient function class), the covering numbers and Lipschitz constants in the generalization analysis remain governed by d. We have added a clarifying remark in Section 3.1 that explicitly states this point and cross-references the proof in Section 4 to confirm that no D-dependent rates are re-introduced by the schedule.
revision: yes
Circularity Check
No circularity: proof derives sample complexity from geometric collapse without reducing to inputs by construction
full rationale
The paper presents a theoretical derivation of sample complexity depending on intrinsic dimension via the collapse-and-refine mechanism, where score singularity at small noise induces dimensional collapse of the denoising map onto the manifold projection, followed by refinement at moderate scales. The abstract and description frame this as emerging from a single denoising score matching objective instantiated as SiLD, with the proof claimed to follow from first-principles geometric analysis rather than any fitted parameter, self-citation chain, or definitional equivalence. No load-bearing step reduces the claimed result to a tautology or renamed input; the central claim retains independent mathematical content from the manifold hypothesis and score geometry. This qualifies as a self-contained theoretical contribution with no detected circularity.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Data distribution is supported on a low-dimensional manifold
- domain assumption: Score function exhibits diverging singularity at small noise scales
invented entities (1)
- Score-induced Latent Diffusion (SiLD): no independent evidence
A Limitations
Our work is deliberately scoped to characterize the training dynamics of score matchin...
[66]
(Kernel regularity.) The kernel $K$ belongs to $C^r(M \times M)$, hence to $H^{r-k/2}(M \times M)$ by Sobolev embedding.
-
[67]
(Eigenvalue decay.) The eigenvalues $\{\lambda_j\}_{j \ge 1}$ of the induced integral operator $T_K : L^2(M) \to L^2(M)$ satisfy $\lambda_j \le C_{M,r}\, \|K\|_{C^r}\, j^{-r/k}$ (38), where $C_{M,r}$ depends only on $(M, g)$ and $r$, not on the ambient dimension $d$. The polynomial decay rate $j^{-r/k}$ depends on the intrinsic dimension $k$ rather than $d$, which is the key to controlling the ambient dependence in Stage 2. Proof. B...
-
[68]
Table 4 reports results across two model sizes and two training budgets. SiLD consistently outperforms LDM-CNN on reconstruction MSE across all settings, with the gap present at the smaller network (0.00440 vs. 0.00503) and persisting at the larger network (0.00345 vs. 0.00396). At 10× training, both methods converge to near-identical reconstruction MSE (...