
arxiv: 2203.02923 · v1 · pith:2AFRKZ3Pnew · submitted 2022-03-06 · 💻 cs.LG · q-bio.QM

GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation

keywords: geodiff, markov, molecular, conformations, diffusion, distribution, process, approaches

Predicting molecular conformations from molecular graphs is a fundamental problem in cheminformatics and drug discovery. Recently, significant progress has been achieved with machine learning approaches, especially with deep generative models. Inspired by the diffusion process in classical non-equilibrium thermodynamics where heated particles will diffuse from original states to a noise distribution, in this paper, we propose a novel generative model named GeoDiff for molecular conformation prediction. GeoDiff treats each atom as a particle and learns to directly reverse the diffusion process (i.e., transforming from a noise distribution to stable conformations) as a Markov chain. Modeling such a generation process is however very challenging as the likelihood of conformations should be roto-translational invariant. We theoretically show that Markov chains evolving with equivariant Markov kernels can induce an invariant distribution by design, and further propose building blocks for the Markov kernels to preserve the desirable equivariance property. The whole framework can be efficiently trained in an end-to-end fashion by optimizing a weighted variational lower bound to the (conditional) likelihood. Experiments on multiple benchmarks show that GeoDiff is superior or comparable to existing state-of-the-art approaches, especially on large molecules.
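
The equivariance idea in the abstract can be illustrated numerically. The sketch below is not GeoDiff's network or training code; it is a minimal toy, assuming NumPy, with hypothetical names (`center`, `forward_diffuse`, `toy_equivariant_mean`, `random_rotation`) chosen only for this example. Removing the centre of mass to handle translations is an assumption of the sketch, not something stated in the abstract. It shows a kernel mean whose update weights depend only on interatomic distances, so rotating and translating the input rotates and translates the output in the same way.

```python
import numpy as np

def center(x):
    """Subtract the centre of mass (assumption of this sketch, used to factor out translations)."""
    return x - x.mean(axis=0, keepdims=True)

def forward_diffuse(x0, alpha_bar, rng):
    """One-shot forward noising: blend centred coordinates with centred Gaussian noise."""
    noise = center(rng.standard_normal(x0.shape))
    return np.sqrt(alpha_bar) * center(x0) + np.sqrt(1.0 - alpha_bar) * noise

def toy_equivariant_mean(x, step=0.1):
    """Toy kernel mean (hypothetical, not the paper's model).

    Each atom moves along pairwise difference vectors x_i - x_j, weighted by a
    function of the interatomic distance only, so the weights are invariant and
    the update itself is roto-translation equivariant.
    """
    diff = x[:, None, :] - x[None, :, :]       # (n, n, 3) difference vectors
    dist = np.linalg.norm(diff, axis=-1)       # (n, n) invariant distances
    weights = np.exp(-dist)                    # depends on distances only
    np.fill_diagonal(weights, 0.0)             # no self-interaction
    return x + step * (weights[..., None] * diff).sum(axis=1)

def random_rotation(rng):
    """Draw a proper rotation (det = +1) from a QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))                # fix column signs
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]                     # flip one axis to get det = +1
    return q

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))                # 5 "atoms" in 3D
R = random_rotation(rng)
t = rng.standard_normal(3)

# Equivariance check: f(x R^T + t) == f(x) R^T + t
lhs = toy_equivariant_mean(x @ R.T + t)
rhs = toy_equivariant_mean(x) @ R.T + t
print(np.allclose(lhs, rhs))
```

If the kernel mean is equivariant in this sense and the starting noise distribution is itself invariant, then the distribution obtained after chaining such steps does not depend on how the molecule is oriented in space, which is the property the abstract says the authors establish for their Markov kernels.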



Forward citations

Cited by 21 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Generative Modeling with Flux Matching

    cs.LG 2026-05 unverdicted novelty 8.0

    Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices be...

  2. World Models as Group Actions

    cs.CV 2026-05 unverdicted novelty 7.0

    Formalizes video world models as group actions on states and uses latent regularization with synthesized supervision to enforce consistency, introducing GAC and GAR metrics that improve structural correctness in SOTA models.

  3. Training-Free Generative Sampling via Moment-Matched Score Smoothing

    stat.ML 2026-05 unverdicted novelty 7.0

    MM-SOLD is a training-free particle sampler whose large-particle limit converges to a moment-matched Gibbs distribution obtained by exponentially tilting a score-smoothed target.

  4. From Holo Pockets to Electron Density: GPT-style Drug Design with Density

    cs.AI 2026-05 unverdicted novelty 7.0

    EDMolGPT generates drug-like molecules from low-resolution electron density point clouds of holo binding pockets and shows effectiveness across 101 biological targets.

  5. PerFlow: Physics-Embedded Rectified Flow for Efficient Reconstruction and Uncertainty Quantification of Spatiotemporal Dynamics

    cs.LG 2026-05 unverdicted novelty 7.0

    PerFlow embeds physics constraints into rectified flow sampling through guidance-free conditioning and constraint-preserving projections, achieving efficient sparse reconstruction and uncertainty quantification for sp...

  6. h-MINT: Modeling Pocket-Ligand Binding with Hierarchical Molecular Interaction Network

    cs.LG 2026-04 unverdicted novelty 7.0

    h-MINT improves ligand-protein binding affinity prediction by 2-4% and virtual screening metrics by 1-3% via overlapping fragment tokenization and hierarchical modeling.

  7. How Creative Are Large Language Models in Generating Molecules?

    cs.CL 2026-04 unverdicted novelty 7.0

    Large language models exhibit distinct creative patterns in molecule generation, including higher constraint satisfaction when more constraints are added, and this is the first work to reframe molecule generation abil...

  8. Time-Aware Diffusion based on Preference Disentanglement for Generative Recommendation

    cs.IR 2026-06 unverdicted novelty 6.0

    TDPM is a diffusion-based generative recommender that disentangles user preferences into period and point components to enable time-aware diffusion on semantic indices, reporting up to 29% gains on HR@20 and NDCG@20 o...

  9. Latent Diffusion Pretraining for Crystal Property Prediction

    cs.LG 2026-05 unverdicted novelty 6.0

    CrysLDNet combines VAE and latent diffusion pretraining on unlabeled crystals to improve graph encoder performance on property prediction by about 4-5% on JARVIS and MP datasets.

  10. DiffATS: Diffusion in Aligned Tensor Space

    cs.LG 2026-05 unverdicted novelty 6.0

    DiffATS trains diffusion models directly on aligned Tucker tensor primitives that are proven to be homeomorphisms, delivering efficient unconditional and conditional generation across images, videos, and PDE data with...

  11. From Holo Pockets to Electron Density: GPT-style Drug Design with Density

    cs.AI 2026-05 unverdicted novelty 6.0

    EDMolGPT generates molecules from low-resolution electron density for de novo structure-based drug design, claiming better performance than pocket-based methods on 101 targets.

  12. Toward Better Geometric Representations for Molecule Generative Models

    cs.LG 2026-05 unverdicted novelty 6.0

    LENSEs improves representation-conditioned molecule generation by jointly training a multi-level representation head, perceptual loss, and REPA alignment on pretrained encoders, yielding 97.28% validity and 98.51% sta...

  13. FlashMol: High-Quality Molecule Generation in as Few as Four Steps

    cs.LG 2026-05 unverdicted novelty 6.0

    FlashMol produces chemically valid 3D molecules in 4 steps via distribution matching distillation with respaced timesteps and Jensen-Shannon regularization, matching or exceeding 1000-step teacher performance on QM9 a...

  14. SymDrift: One-Shot Generative Modeling under Symmetries

    cs.LG 2026-05 unverdicted novelty 6.0

    SymDrift makes drifting models produce symmetry-invariant samples in one step via symmetrized coordinate drifts or G-invariant embeddings, outperforming prior one-shot baselines on molecular benchmarks and cutting com...

  15. Interests Burn-down Diffusion Process for Personalized Collaborative Filtering

    cs.IR 2026-05 unverdicted novelty 6.0

    A new interests burn-down diffusion process models decaying user interests for personalized collaborative filtering and outperforms prior generative methods in the StageCF implementation.

  16. PerFlow: Physics-Embedded Rectified Flow for Efficient Reconstruction and Uncertainty Quantification of Spatiotemporal Dynamics

    cs.LG 2026-05 unverdicted novelty 6.0

    PerFlow decouples observation conditioning from physics enforcement in rectified flows using constraint-preserving projections and invariance guarantees for fast, physics-consistent reconstruction of spatiotemporal dynamics.

  17. LEGO-MOF: Equivariant Latent Manipulation for Editable, Generative, and Optimizable MOF Design

    cs.LG 2026-04 unverdicted novelty 6.0

    LEGO-MOF maps MOF linkers to an equivariant latent space for continuous editing and uses test-time optimization to achieve a 147.5% average boost in pure CO2 uptake while preserving structural validity.

  18. MolDA: Molecular Understanding and Generation via Large Language Diffusion Model

    cs.AI 2026-04 unverdicted novelty 6.0

    MolDA is a multimodal molecular model that uses a discrete large language diffusion backbone plus a hybrid graph encoder to achieve better global coherence and validity than autoregressive approaches.

  19. Energy-Guided Generative Modeling for Low-Energy Molecular Structure Discovery

    cs.LG 2025-12 unverdicted novelty 6.0

    EnFlow integrates flow-based conformer generation with energy landscape modeling to enable joint ensemble generation and ground-state identification using only 1-2 ODE steps.

  20. DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

    cs.LG 2022-11 conditional novelty 6.0

    DPM-Solver++ enables high-quality guided sampling of diffusion models in 15-20 steps via data-prediction ODE solving and multistep stabilization.

  21. On the Limits of Latent Reuse in Diffusion Models

    stat.ML 2026-05 unverdicted novelty 5.0

    Reusing source latent spaces in diffusion models under distribution shift produces target score error set by principal-angle misalignment and diffusion-time-amplified ambient noise.