pith. machine review for the scientific record.

arxiv: 2605.07020 · v1 · submitted 2026-05-07 · 💻 cs.LG · cs.AI

Recognition: no theorem link

FlashMol: High-Quality Molecule Generation in as Few as Four Steps

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 00:57 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords molecule generation · diffusion models · model distillation · 3D molecular conformations · drug discovery · few-step generation · Jensen-Shannon divergence

The pith

FlashMol generates high-quality 3D molecular conformations in only four diffusion steps by distilling a 1000-step teacher model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that classical diffusion models for 3D molecules, which normally need hundreds of steps, can be accelerated to just four steps while keeping or improving quality. It adapts distribution matching distillation as the training objective, respaces the generation timesteps to give the generator a stronger starting point, and adds Jensen-Shannon regularization to keep diversity from collapsing. Experiments on the QM9 and GEOM-DRUG datasets confirm that the four-step model matches or beats the original 1000-step teacher while delivering up to 250× faster sampling. This matters for computational drug discovery, where slow generation has made large-scale virtual screening impractical.
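To make the step count concrete, here is a minimal sketch of what few-step sampling with a distilled generator typically looks like. The paper's actual sampler is not reproduced on this page; the generator G, the noise-schedule arrays, and the short timestep list below are placeholders, and the re-noising pattern follows the generic multi-step distillation recipe rather than FlashMol's released code.

```python
import torch

@torch.no_grad()
def few_step_sample(G, timesteps, alphas, sigmas, shape):
    """Hypothetical few-step sampler: G maps (x_t, t) to a clean sample x0;
    between steps, x0 is re-noised to the next timestep. `timesteps` is a
    short descending list (e.g. 4 respaced levels); `alphas`/`sigmas` index
    an assumed noise schedule -- none of these are taken from the paper."""
    x = torch.randn(shape)                      # start from pure noise
    for i, t in enumerate(timesteps):
        x0 = G(x, t)                            # one network call per step
        if i + 1 < len(timesteps):
            t_next = timesteps[i + 1]
            x = alphas[t_next] * x0 + sigmas[t_next] * torch.randn_like(x0)
        else:
            x = x0                              # last step: return the clean sample
    return x
```

Four generator calls against roughly 1000 denoiser calls for the teacher is where the up-to-250× figure comes from.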

Core claim

FlashMol produces chemically valid 3D molecular conformations in as few as four steps. It adapts distribution matching distillation to minimize the reverse KL divergence in the molecular domain, respaces the generation timesteps for better initialization, and regularizes the objective with a Jensen-Shannon divergence term to balance mode-seeking and mean-seeking behavior. On QM9 and GEOM-DRUG, the resulting model matches or surpasses the 1000-step GeoLDM teacher while achieving up to 250× faster sampling.

What carries the argument

Distribution matching distillation adapted with timestep respacing and Jensen-Shannon regularization, which distills a slow diffusion teacher into a fast generator while preserving stability and diversity of 3D molecular conformations.
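For readers unfamiliar with distribution matching distillation, the sketch below shows the standard DMD-style generator update from the image-generation literature (Yin et al. [47, 48]), which the paper adapts. The score networks, noise schedule, and weighting are placeholders, not FlashMol's published implementation.

```python
import torch
import torch.nn.functional as F

def dmd_generator_loss(G, teacher_score, fake_score, z, t, alpha_t, sigma_t):
    """Sketch of a DMD-style generator loss. `teacher_score` is the frozen
    real-score network, `fake_score` the online model tracking the
    generator's own distribution; all callables here are placeholders."""
    x = G(z, t)                                        # generator output, keeps grad
    x_t = alpha_t * x + sigma_t * torch.randn_like(x)  # diffuse to noise level t
    with torch.no_grad():
        s_real = teacher_score(x_t, t)
        s_fake = fake_score(x_t, t)
        grad = s_fake - s_real                         # approx. gradient of reverse KL
    # MSE surrogate whose gradient w.r.t. x equals `grad`:
    return 0.5 * F.mse_loss(x, (x - grad).detach(), reduction="sum")
```

Because the reverse KL gradient only asks the generator to sit inside high-density regions of the teacher, it is mode-seeking, which is the failure mode the paper's JS term is meant to counter.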

If this is right

  • Large-scale in silico screening for drug discovery becomes feasible because generation time drops by up to 250 times.
  • The distilled four-step model matches or exceeds the 1000-step teacher on standard quality metrics for 3D conformations.
  • Timestep respacing supplies a stronger initialization that makes the local minimization of distribution matching distillation effective.
  • Jensen-Shannon regularization counters the mode-seeking tendency of reverse KL and restores sample diversity.
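The last two bullets lean on a textbook property of these divergences (Lin [22]); for reference, a short math sketch, not a derivation from the paper:

```latex
% Reverse KL (used by DMD) is mode-seeking: q may drop modes of p at little cost.
% Forward KL is mean-seeking: q must place mass wherever p does.
% Jensen-Shannon mixes both behaviors through the midpoint distribution m:
\[
\mathrm{JS}(p \,\|\, q) = \tfrac{1}{2}\,\mathrm{KL}(p \,\|\, m)
                        + \tfrac{1}{2}\,\mathrm{KL}(q \,\|\, m),
\qquad m = \tfrac{1}{2}(p + q).
\]
```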

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same distillation recipe could be tested on diffusion models for protein structure or crystal generation to check whether four-step sampling generalizes.
  • If the regularization term proves robust across domains, similar few-step techniques might shorten inference in image or point-cloud diffusion models.
  • Running the model on larger, more diverse molecular libraries would test whether the speed-quality trade-off holds outside the QM9 and GEOM-DRUG regimes.

Load-bearing premise

Adapting distribution matching distillation with timestep respacing and Jensen-Shannon regularization will preserve sample stability and diversity when applied to 3D molecular conformations.

What would settle it

If the four-step model produces molecules with substantially lower validity rates or higher average strain energies than the 1000-step teacher on the QM9 test set, the claim of maintained quality collapses.
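This falsification test is directly implementable. Below is a minimal RDKit sketch; note that the paper's own metrics may differ (3D molecule papers often use bond-distance-based atom stability rather than SMILES validity), so treat the metric choices here as assumptions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

def validity_rate(smiles_list):
    """Fraction of generated molecules RDKit can parse and sanitize."""
    valid = sum(Chem.MolFromSmiles(s) is not None for s in smiles_list)
    return valid / max(len(smiles_list), 1)

def mmff_strain_energy(mol_with_conformer):
    """MMFF94 single-point energy of an embedded 3D conformer (kcal/mol).
    Returns None when MMFF parameters are unavailable for the molecule."""
    mol = Chem.AddHs(mol_with_conformer, addCoords=True)
    props = AllChem.MMFFGetMoleculeProperties(mol)
    if props is None:
        return None
    ff = AllChem.MMFFGetMoleculeForceField(mol, props)
    return ff.CalcEnergy()
```

Comparing these statistics for the four-step model against the 1000-step teacher on the QM9 test split would settle the quality claim either way.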

Figures

Figures reproduced from arXiv: 2605.07020 by Cai Zhou, Muhan Zhang, Shaoheng Yan, Xinyuan Wei, Zian Li.

Figure 1. (no caption recovered from the source page)

Figure 2. Training framework for FlashMol. In each iteration, the few-step generator G produces a batch of sampled molecules. Then the µfake model, which approximates the fake score s_fake, and the discriminator D are first updated for 5 steps using the diffusion loss and the GAN loss respectively. After that, µfake and the discriminator D's output are used to compute the DMD and Jensen-Shannon divergence gradients. Lastly, the gr…

Figure 4. Distribution matching distillation training dynamics under different sampling noise schedules.

Figure 5. Noise schedules for different values of the exponent ρ in the respaced timesteps in Equation (7).
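Equation (7) itself is not reproduced on this page, but Figure 5's ρ-parameterized schedule reads like the standard respacing of Karras et al. [16]; under that assumption, a sketch:

```python
import numpy as np

def respaced_timesteps(n_steps=4, t_min=0.002, t_max=80.0, rho=7.0):
    """Karras-style respacing (assumed form of Equation (7)): interpolate
    uniformly in t^(1/rho) space, then map back. Larger rho concentrates
    steps near t_min; rho = 1 recovers uniform spacing. The endpoint
    values here are illustrative defaults, not the paper's."""
    i = np.arange(n_steps)
    inv = t_max ** (1 / rho) + i / (n_steps - 1) * (
        t_min ** (1 / rho) - t_max ** (1 / rho))
    return inv ** rho  # descending from t_max to t_min
```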
Original abstract

Generating chemically valid 3D molecular conformations is critical for computational drug discovery. Classical diffusion-based models like GeoLDM perform well but require hundreds of steps, making large-scale in silico screening impractical. Recent efforts on few-step molecular generation have accelerated this process to 12-50 steps, but they often largely sacrifice sample stability. In this work, we present FlashMol, an ultra-fast molecule generative model producing high-quality molecular conformations in as few as 4 steps. To achieve this, we adapt distribution matching distillation (DMD) - a reverse KL-divergence minimization objective - to the molecular domain for effective distillation. Considering the local minimization behavior of DMD, we respace the molecule generation timesteps, providing the generator with much better initialization and enables effective distillation. Additionally, to mitigate the mode-seeking behavior of DMD and improve diversity, we further regularize it with a Jensen-Shannon divergence term, which incorporates the mean-seeking behavior of the forward KL divergence. Extensive experiments on QM9 and GEOM-DRUG datasets demonstrate that FlashMol matches and even surpasses the original 1000-step teacher, achieving up to 250$\times$ acceleration in sampling speed while maintaining high molecular quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents FlashMol, an adaptation of distribution matching distillation (DMD) for 3D molecular conformation generation. By combining DMD (reverse KL minimization) with timestep respacing for better initialization and a Jensen-Shannon divergence term to mitigate mode-seeking collapse, the method claims to produce high-quality samples in as few as 4 steps. Experiments on QM9 and GEOM-DRUG report that FlashMol matches or exceeds the 1000-step GeoLDM teacher on stability, validity, and diversity metrics while delivering up to 250× sampling acceleration.

Significance. If the results prove robust, the work would be significant for computational drug discovery by removing the computational barrier of hundreds of diffusion steps in large-scale in silico screening. The targeted use of respacing and JS regularization to stabilize few-step distillation on constrained 3D molecular manifolds addresses a practical bottleneck in the field.

major comments (2)
  1. [Experiments] Experiments section: the central claim that the 4-step model matches or surpasses the 1000-step teacher rests on the joint effect of DMD, timestep respacing, and the JS regularization term, yet no component-wise ablations are provided (e.g., performance with DMD+respacing alone or with altered JS coefficient). Because the weighted objective is domain-specific for bond-length/angle constraints and conformer energies, the reported metrics could depend on hyperparameter choices tuned to the test sets rather than emerging from the method itself.
  2. [Method] Method section: the adaptation of DMD to molecular data, including the precise loss formulation after timestep respacing and the weighting of the JS term, is described at a high level. Without the explicit equations or pseudocode for the combined objective and the respacing schedule, it is difficult to verify that the 4-step results are stable and do not rely on post-hoc adjustments that affect the performance claims.
minor comments (2)
  1. [Abstract] Abstract: the claim of 'up to 250× acceleration' should specify the exact teacher sampling steps, hardware, and batch settings used for the timing comparison.
  2. Ensure all reported metrics (stability, validity, diversity) include explicit definitions or citations to the standard molecular-generation literature (e.g., how validity is assessed for 3D conformations).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback on our manuscript. The comments highlight important aspects of experimental validation and methodological clarity that we will address in the revision to strengthen the presentation of FlashMol. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Experiments] Experiments section: the central claim that the 4-step model matches or surpasses the 1000-step teacher rests on the joint effect of DMD, timestep respacing, and the JS regularization term, yet no component-wise ablations are provided (e.g., performance with DMD+respacing alone or with altered JS coefficient). Because the weighted objective is domain-specific for bond-length/angle constraints and conformer energies, the reported metrics could depend on hyperparameter choices tuned to the test sets rather than emerging from the method itself.

    Authors: We agree that component-wise ablations would better isolate the contributions of each element and address potential concerns about hyperparameter sensitivity. The manuscript focuses on the combined objective because individual components (DMD alone or respacing without JS) do not achieve the target 4-step performance on their own, as motivated by the mode-seeking behavior of reverse KL and the need for better initialization on the molecular manifold. However, to strengthen the claims, we will add ablation tables in the revised Experiments section showing results for DMD+respacing (without JS), JS with different coefficients, and variations in the weighting for bond/angle constraints. Hyperparameters were tuned on a validation split separate from the test sets used for final reporting, following standard practice; we will explicitly state this and include sensitivity analysis to confirm robustness. revision: yes

  2. Referee: [Method] Method section: the adaptation of DMD to molecular data, including the precise loss formulation after timestep respacing and the weighting of the JS term, is described at a high level. Without the explicit equations or pseudocode for the combined objective and the respacing schedule, it is difficult to verify that the 4-step results are stable and do not rely on post-hoc adjustments that affect the performance claims.

    Authors: We acknowledge that the Method section presents the adaptations at a conceptual level to maintain readability, but we agree that explicit formulations are necessary for full reproducibility and verification. In the revised manuscript, we will expand the Method section to include the precise combined loss equation (reverse KL from DMD plus weighted JS term), the mathematical definition of the respaced timestep schedule (including how it provides improved initialization for the generator), and pseudocode for the distillation training procedure. This will clarify that the 4-step results arise directly from the described objective without undisclosed post-hoc tuning. revision: yes
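To make concrete what the promised pseudocode might cover, here is a hedged, self-contained toy of the alternating update described in the Figure 2 caption: a generator step interleaved with 5 inner updates of the fake-score model and the discriminator. The networks, losses, and hyperparameters below are stand-ins (tiny MLPs instead of equivariant molecule networks; the DMD term skips the diffusion-to-level-t step shown earlier), not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim, batch, lambda_js = 8, 16, 0.1
mlp = lambda out: nn.Sequential(nn.Linear(dim, 64), nn.SiLU(), nn.Linear(64, out))
generator, mu_fake, teacher, disc = mlp(dim), mlp(dim), mlp(dim), mlp(1)
opt_gen = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_aux = torch.optim.Adam(
    list(mu_fake.parameters()) + list(disc.parameters()), lr=1e-4)

for step in range(100):
    z = torch.randn(batch, dim)
    x_real = torch.randn(batch, dim)            # stand-in for data molecules

    # (1) Five inner updates of the fake-score model and the discriminator.
    with torch.no_grad():
        x_gen = generator(z)
    for _ in range(5):
        noisy = x_gen + 0.1 * torch.randn_like(x_gen)
        loss_fake = F.mse_loss(mu_fake(noisy), x_gen)        # toy diffusion loss
        logits = torch.cat([disc(x_real), disc(x_gen)])
        labels = torch.cat([torch.ones(batch, 1), torch.zeros(batch, 1)])
        loss_disc = F.binary_cross_entropy_with_logits(logits, labels)
        opt_aux.zero_grad(); (loss_fake + loss_disc).backward(); opt_aux.step()

    # (2) Generator update: DMD score-difference gradient plus a
    #     discriminator-based (JS-flavored, as in GANs [7]) regularizer.
    x_gen = generator(z)
    with torch.no_grad():
        grad = mu_fake(x_gen) - teacher(x_gen)               # s_fake - s_real
    loss_dmd = 0.5 * F.mse_loss(x_gen, (x_gen - grad).detach(), reduction="sum")
    loss_js = F.binary_cross_entropy_with_logits(
        disc(x_gen), torch.ones(batch, 1))                   # non-saturating term
    opt_gen.zero_grad(); (loss_dmd + lambda_js * loss_js).backward(); opt_gen.step()
```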

Circularity Check

0 steps flagged

No circularity: empirical adaptation validated against external teacher

Full rationale

The paper adapts DMD (reverse KL minimization), timestep respacing, and a JS regularization term to distill a 1000-step GeoLDM teacher into a 4-step generator for 3D molecular conformations. All central claims are supported by direct empirical comparisons on QM9 and GEOM-DRUG using standard metrics (stability, validity, diversity) against the independent teacher model. No equation, objective, or performance result is shown to reduce by construction to fitted parameters, self-citations, or renamed inputs; the method description and results remain externally falsifiable and do not rely on internal self-reference for their validity.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that DMD's local minimization can be effectively countered by respacing and JS regularization in the molecular domain, plus standard ML training assumptions; no new physical entities or axioms are introduced.

free parameters (1)
  • number of sampling steps
    Chosen as 4 to achieve ultra-fast generation; value is a design choice rather than fitted to data.
axioms (1)
  • domain assumption: the DMD objective can be adapted to 3D molecular conformations without loss of chemical validity
    Invoked when stating that the reverse KL minimization transfers effectively to the molecular domain.

pith-pipeline@v0.9.0 · 5520 in / 1263 out tokens · 32798 ms · 2026-05-11T00:57:39.584173+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

52 extracted references · 24 canonical work pages · 7 internal anchors

  1. [1] Simon Axelrod and Rafael Gómez-Bombarelli. GEOM: energy-annotated molecular conformations for property prediction and molecular generation. Scientific Data, 2022.
  2. [2] Lichen Bai, Zikai Zhou, Shitong Shao, Wenliang Zhong, Shuo Yang, Shuo Chen, Bojun Chen, and Zeke Xie. Optimizing few-step generation with adaptive matching distillation. arXiv preprint arXiv:2602.07345, 2026.
  3. [3] Nicholas M. Boffi, Michael S. Albergo, and Eric Vanden-Eijnden. How to build a consistency model: Learning flow maps via self-distillation. arXiv preprint arXiv:2505.18825, 2025.
  4. [4] Michal Brylinski and Grover Waldrop. Computational redesign of bacterial biotin carboxylase inhibitors using structure-based virtual screening of combinatorial libraries. Molecules, 2014.
  5. [5] Ian Dunn and David R. Koes. FlowMol3: flow matching for 3D de novo small-molecule generation. Digital Discovery, 2026.
  6. [6] Zhengyang Geng, Mingyang Deng, Xingjian Bai, Jeremy Z. Kolter, and Kaiming He. Mean flows for one-step generative modeling. arXiv preprint arXiv:2505.13447, 2025.
  7. [7] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in Neural Information Processing Systems, 2014.
  8. [8] Rebecca L. Greenaway and Kim E. Jelfs. Integrating computational and experimental workflows for accelerated organic materials discovery. Advanced Materials, 2021.
  9. [9] Majdi Hassan, Nikhil Shenoy, Jungyoon Lee, Hannes Stark, Stephan Thaler, and Dominique Beaini. Equivariant flow matching for molecular conformer generation. ICML 2024 Workshop, 2024.
  10. [10] Haokai Hong, Wanyu Lin, and Kay Chen Tan. Accelerating 3D molecule generation via jointly geometric optimal transport. arXiv preprint arXiv:2405.15252, 2024.
  11. [11] Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, and Max Welling. Equivariant diffusion for molecule generation in 3D. Proceedings of the 39th International Conference on Machine Learning, 2022.
  12. [12] Xun Huang, Zhengqi Li, Guande He, Mingyuan Zhou, and Eli Shechtman. Self Forcing: Bridging the train-test gap in autoregressive video diffusion. arXiv preprint arXiv:2506.08009, 2025.
  13. [13] Ross Irwin, Alessandro Tibo, Jon Paul Janet, and Simon Olsson. SemlaFlow: efficient 3D molecular generation with latent attention and equivariant flow matching. arXiv preprint arXiv:2406.07266, 2024.
  14. [14] Yunhui Jang, Dongwoo Kim, and Sungsoo Ahn. Hierarchical graph generation with K2-trees. ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling, 2023.
  15. [15] Dengyang Jiang, Dongyang Liu, Zanyi Wang, Qilong Wu, Liuzhuozheng Li, Hengzhuang Li, Xin Jin, David Liu, Changsheng Lu, Zhen Li, Bo Zhang, Mengmeng Wang, Steven Hoi, Peng Gao, and Harry Yang. Distribution matching distillation meets reinforcement learning. arXiv preprint arXiv:2511.13649, 2025.
  16. [16] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. Advances in Neural Information Processing Systems, 2022.
  17. [17] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  18. [18] Xiangzhe Kong, Wenbing Huang, Zhixing Tan, and Yang Liu. Molecule generation by principal subgraph mining and assembling. Advances in Neural Information Processing Systems, 2022.
  19. [19] Romain Lacombe and Neal Vaidya. Accelerating the generation of molecular conformations with progressive distillation of equivariant latent diffusion models. arXiv preprint arXiv:2404.13491, 2024.
  20. [20] Zian Li, Cai Zhou, Xiyuan Wang, Xingang Peng, and Muhan Zhang. Geometric representation condition improves equivariant molecule generation. arXiv preprint arXiv:2410.03655, 2024.
  21. [21] Haitao Lin, Peiyan Hu, Minsi Ren, Zhifeng Gao, Zhi-Ming Ma, Guolin Ke, Tailin Wu, and Stan Z. Li. On the design of one-step diffusion via shortcutting flow paths. arXiv preprint arXiv:2512.11831, 2025.
  22. [22] Jianhua Lin. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 2002.
  23. [23] Cheng Lu and Yang Song. Simplifying, stabilizing and scaling continuous-time consistency models. arXiv preprint arXiv:2410.11081, 2024.
  24. [24] Lars Mescheder, Andreas Geiger, and Sebastian Nowozin. Which training methods for GANs do actually converge? Proceedings of the 35th International Conference on Machine Learning, 2018.
  25. [25] Yuyan Ni, Shikun Feng, Haohan Chi, Bowen Zheng, Huan-ang Gao, Wei-Ying Ma, Zhi-Ming Ma, and Yanyan Lan. Straight-line diffusion model for efficient 3D molecular generation. arXiv preprint arXiv:2503.02918, 2025.
  26. [26] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. NeurIPS 2017 Workshop on Autodiff, 2017.
  27. [27] Yiming Qin, Manuel Madeira, Dorina Thanou, and Pascal Frossard. DeFoG: Discrete flow matching for graph generation. arXiv preprint arXiv:2410.04263, 2024.
  28. [28] Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, and Anatole von Lilienfeld. Quantum chemistry structures and properties of 134 kilo molecules. Scientific Data, 2014.
  29. [29] Tim Salimans and Jonathan Ho. Progressive distillation for fast sampling of diffusion models. arXiv preprint arXiv:2202.00512, 2022.
  30. [30] Víctor Garcia Satorras, Emiel Hoogeboom, and Max Welling. E(n) equivariant graph neural networks. International Conference on Machine Learning, pages 9323–9332. PMLR, 2021.
  31. [31] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502, 2020.
  32. [32] Yang Song and Prafulla Dhariwal. Improved techniques for training consistency models. arXiv preprint arXiv:2310.14189, 2023.
  33. [33] Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. Proceedings of the 40th International Conference on Machine Learning, 2023.
  34. [34] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020.
  35. [35] Yuxuan Song, Jingjing Gong, Yanru Qu, Hao Zhou, Mingyue Zheng, Jingjing Liu, and Wei-Ying Ma. Unified generative modeling of 3D molecules with Bayesian flow networks. The Twelfth International Conference on Learning Representations, 2024.
  36. [36] Yuxuan Song, Jingjing Gong, Minkai Xu, Ziyao Cao, Yanyan Lan, Stefano Ermon, Hao Zhou, and Wei-Ying Ma. Equivariant flow matching with hybrid probability transport for 3D molecule generation. Advances in Neural Information Processing Systems, 2023.
  37. [37] Shangyuan Tong, Nanye Ma, Saining Xie, and Tommi Jaakkola. Flow map distillation without data. arXiv preprint arXiv:2511.19428, 2025.
  38. [38] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 2017.
  39. [39] Clement Vignac, Igor Krawczuk, Antoine Siraudin, Bohan Wang, Volkan Cevher, and Pascal Frossard. DiGress: Discrete denoising diffusion for graph generation. arXiv preprint arXiv:2209.14734, 2022.
  40. [40] Chenyu Wang, Cai Zhou, Sharut Gupta, Zongyu Lin, Stefanie Jegelka, Stephen Bates, and Tommi Jaakkola. Learning diffusion models with flexible representation guidance. arXiv preprint arXiv:2507.08980, 2025.
  41. [41] Wendy A. Warr, Marc C. Nicklaus, Christos A. Nicolaou, and Matthias Rarey. Exploration of ultralarge compound collections for drug discovery. Journal of Chemical Information and Modeling, 2022.
  42. [42] Lemeng Wu, Chengyue Gong, Xingchao Liu, Mao Ye, and Qiang Liu. Diffusion-based molecule generation with informative prior bridges. Advances in Neural Information Processing Systems, 2022.
  43. [43] Minkai Xu, Alexander Powers, Ron Dror, Stefano Ermon, and Jure Leskovec. Geometric latent diffusion models for 3D molecule generation. Proceedings of the 40th International Conference on Machine Learning, 2023.
  44. [44] Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, and Jian Tang. GeoDiff: A geometric diffusion model for molecular conformation generation. arXiv preprint arXiv:2203.02923, 2022.
  45. [45] Yilun Xu, Weili Nie, and Arash Vahdat. One-step diffusion models with f-divergence distribution matching. arXiv preprint arXiv:2502.15681, 2025.
  46. [46] Zehra Yildirim, Kyle Swanson, Xuekun Wu, James Zou, and Joseph Wu. Next-gen therapeutics: pioneering drug discovery with iPSCs, genomics, AI, and clinical trials in a dish. Annual Review of Pharmacology and Toxicology, 2025.
  47. [47] Tianwei Yin, Michaël Gharbi, Taesung Park, Richard Zhang, Eli Shechtman, Frédo Durand, and William T. Freeman. Improved distribution matching distillation for fast image synthesis. Advances in Neural Information Processing Systems, 2024.
  48. [48] Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman, and Taesung Park. One-step diffusion with distribution matching distillation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.
  49. [49] Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Frédo Durand, Eli Shechtman, and Xun Huang. From slow bidirectional to fast autoregressive video diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025.
  50. [50] Zhilong Zhang, Yuxuan Song, Yichun Wang, Jingjing Gong, Hanlin Wu, Dongzhan Zhou, Hao Zhou, and Wei-Ying Ma. Accelerating 3D molecule generative models with trajectory diagnosis. The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
  51. [51] Cai Zhou, Xiyuan Wang, and Muhan Zhang. Unifying generation and prediction on graphs with latent graph diffusion. Advances in Neural Information Processing Systems, 2024.
  52. [52] Linqi Zhou, Mathias Parger, Ayaan Haque, and Jiaming Song. Terminal velocity matching. arXiv preprint arXiv:2511.19797, 2025.