arxiv: 2209.14734 · v4 · pith:4Y26HST4new · submitted 2022-09-29 · 💻 cs.LG

DiGress: Discrete Denoising diffusion for graph generation

classification 💻 cs.LG
keywords diffusion · model · digress · discrete · edge · graph · graphs · node

This work introduces DiGress, a discrete denoising diffusion model for generating graphs with categorical node and edge attributes. Our model utilizes a discrete diffusion process that progressively edits graphs with noise, through the process of adding or removing edges and changing the categories. A graph transformer network is trained to revert this process, simplifying the problem of distribution learning over graphs into a sequence of node and edge classification tasks. We further improve sample quality by introducing a Markovian noise model that preserves the marginal distribution of node and edge types during diffusion, and by incorporating auxiliary graph-theoretic features. A procedure for conditioning the generation on graph-level features is also proposed. DiGress achieves state-of-the-art performance on molecular and non-molecular datasets, with up to 3x validity improvement on a planar graph dataset. It is also the first model to scale to the large GuacaMol dataset containing 1.3M drug-like molecules without the use of molecule-specific representations.
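The marginal-preserving noise model mentioned in the abstract can be illustrated with a small sketch. It assumes a transition matrix of the form Q = alpha * I + (1 - alpha) * 1 m^T, where m is the vector of category marginals; this form is one common way to keep marginals fixed under noising, and the abstract itself does not spell it out. The function names, the example marginals, and the value of alpha are all hypothetical.

```python
import random

def marginal_transition(marginals, alpha):
    """Row-stochastic matrix Q = alpha * I + (1 - alpha) * 1 m^T (assumed form)."""
    k = len(marginals)
    return [
        [alpha * (i == j) + (1.0 - alpha) * marginals[j] for j in range(k)]
        for i in range(k)
    ]

def apply_noise(categories, q, rng):
    """Resample each category from the row of Q indexed by its current value."""
    k = len(q)
    return [rng.choices(range(k), weights=q[c])[0] for c in categories]

def push_forward(dist, q):
    """Distribution over categories after one noising step: dist @ Q."""
    k = len(q)
    return [sum(dist[i] * q[i][j] for i in range(k)) for j in range(k)]

# Hypothetical marginals over edge types: no edge, single bond, double bond.
m = [0.8, 0.15, 0.05]
Q = marginal_transition(m, alpha=0.9)

# The marginals are stationary under Q, so noising leaves them unchanged.
after = push_forward(m, Q)

# One noising step applied to a toy list of edge categories.
rng = random.Random(0)
edges = [0, 0, 1, 0, 2, 0, 1, 0]
noisy = apply_noise(edges, Q, rng)
```

With this choice, each entry either keeps its category with probability alpha or is redrawn from the marginals, so a sparse graph tends to stay sparse as noise accumulates instead of drifting toward a uniform distribution over edge types.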

This paper has not been read by Pith yet.

Forward citations

Cited by 23 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Set Diffusion: Interpolating Token Orderings Between Autoregression and Diffusion for Fast and Flexible Decoding

    cs.LG 2026-07 unverdicted novelty 7.0

    Set diffusion factorizes likelihood over arbitrary token sets and uses a set-causal diffusion architecture to support KV caching and any-order decoding, yielding improved speed-quality tradeoffs versus prior diffusion LMs.

  2. Interpretable Meta-Learning for Multi-Objective Chemical Search

    cs.CE 2026-06 unverdicted novelty 7.0

    Linear meta-learning surrogates trained across chemical objectives and auxiliary properties adapt rapidly to new multi-objective molecular searches and outperform baselines by 78% in Pareto performance on spin-crossov...

  3. Variational Learning for Insertion-based Generation

    cs.LG 2026-06 unverdicted novelty 7.0

    Introduces the Insertion Process model for variable-length non-monotonic sequence generation via a bijective permutation mapping and permutation-based variational inference.

  4. Adaptive Order Policies for Masked Diffusion

    cs.LG 2026-05 unverdicted novelty 7.0

    A policy network learns to choose unmasking order in masked diffusion by reweighting the loss, outperforming random and heuristic baselines on ordering-sensitive tasks.

  5. DGLD: Domain-Gated Latent Diffusion for the Discovery of Novel Energetic Materials

    physics.chem-ph 2026-05 unverdicted novelty 7.0

    DGLD applies domain-gated latent diffusion with label-quality gating and multi-task guidance to discover 12 novel energetic material leads validated by DFT, outperforming SMILES-LSTM, SELFIES-GA, and REINVENT baseline...

  6. One Pass for All: A Discrete Diffusion Model for Knowledge Graph Triple Set Prediction

    cs.AI 2026-04 unverdicted novelty 7.0

    DiffTSP applies discrete diffusion to knowledge graph triple set prediction, recovering all missing triples simultaneously via edge-masking noise reversal and a structure-aware transformer, achieving SOTA on three datasets.

  7. SynHAT: A Two-stage Coarse-to-Fine Diffusion Framework for Synthesizing Human Activity Traces

    cs.AI 2026-04 unverdicted novelty 7.0

    SynHAT uses a novel two-stage spatio-temporal diffusion framework with Latent Spatio-Temporal U-Net to synthesize realistic human activity traces, outperforming baselines by 52% on spatial and 33% on temporal metrics ...

  8. LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling

    cs.CL 2026-04 unverdicted novelty 7.0

    LangFlow is the first continuous diffusion language model to rival discrete diffusion on perplexity and generative perplexity while exceeding autoregressive baselines on several zero-shot tasks.

  9. GraphWeave: Interpretable and Robust Graph Generation via Random Walk Trajectories

    cs.LG 2025-09 unverdicted novelty 7.0

    GraphWeave learns graph family patterns via random walk trajectories and reconstructs new graphs through joint optimization, outperforming diffusion baselines on benchmarks for structures like communities and degree d...

  10. Accelerating Discrete Diffusion Models with Parallel-In-Time Sampling

    cs.LG 2026-07 unverdicted novelty 6.0

    A parallel-in-time τ-leaping sampler for absorbing discrete diffusion models is introduced, with an exponential-factorial convergence proof and empirical speedups of 7-9× on synthetic tasks and 1.45-1.86× on image/tex...

  11. Estimation--Prediction Tradeoff in Causal Probabilistic Temporal Graphs

    cs.LG 2026-06 unverdicted novelty 6.0

    Characterizes an estimation-prediction tradeoff in binary logistic models for causal probabilistic temporal graphs and proposes a framework to jointly evaluate temporal link prediction with causal parameter recovery v...

  12. Modular Diffusion Models for Structured Visual Recognition

    cs.CV 2026-06 unverdicted novelty 6.0

    Modular Diffusion Models decompose diffusion into task-specific modules to model distributions over structured visual outputs for detection, segmentation, and scene graph generation.

  13. Streamlining Analysis and Design of Two-Dimensional Electronic Spectroscopy using Machine Learning

    physics.chem-ph 2026-06 unverdicted novelty 6.0

    A Gaussian mixture model is used to learn spectral densities from 2DES experiments, enabling extraction of vibronic couplings, spectral extrapolation, and optimized experiment selection across simulated and experiment...

  14. Generative Molecular Morphing for Flexible-Size Design via Unbalanced Optimal Transport

    cs.LG 2026-06 unverdicted novelty 6.0

    Morph is a flexible-size 3D molecular generative model using unbalanced optimal transport on geometric graphs that matches fixed-size SOTA performance while enabling out-of-distribution generation.

  15. FlashMol: High-Quality Molecule Generation in as Few as Four Steps

    cs.LG 2026-05 unverdicted novelty 6.0

    FlashMol produces chemically valid 3D molecules in 4 steps via distribution matching distillation with respaced timesteps and Jensen-Shannon regularization, matching or exceeding 1000-step teacher performance on QM9 a...

  16. GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model

    cs.AI 2026-05 unverdicted novelty 6.0

    GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.

  17. Interpolating Discrete Diffusion Models with Controllable Resampling

    cs.LG 2026-04 unverdicted novelty 6.0

    IDDM interpolates diffusion transitions with a resampling mechanism to lessen dependence on intermediate latents and improve sample quality over masked and uniform discrete diffusion models.

  18. Discrete Bayesian Sample Inference for Graph Generation

    cs.LG 2025-11 unverdicted novelty 6.0

    GraphBSI uses Bayesian Sample Inference as noise-controlled SDEs to generate discrete graphs in one shot, achieving state-of-the-art results on molecular benchmarks Moses and GuacaMol.

  19. Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space

    stat.ML 2025-10 unverdicted novelty 6.0

    Proposes Latent Interacting Particle Systems with an efficient parameterization of twist potentials to enable approximate posterior inference for coupled continuous-time hidden Markov models via twisted sequential Mon...

  20. SpectraLLM: Uncovering the Ability of LLMs for Molecular Structure Elucidation from Multi-Spectral Data

    q-bio.QM 2025-08 unverdicted novelty 6.0

    SpectraLLM is an LLM fine-tuned to predict small-molecule structures from single or multiple spectra, reporting state-of-the-art results on four public benchmarks with gains from multi-modal input.

  21. Graph Defense Diffusion Model

    cs.LG 2025-01 unverdicted novelty 6.0

    GDDM is a diffusion-based purification method with a graph structure refiner and node feature regularizer that defends against multiple adversarial attack types on graphs.

  22. xAI-Drop: Don't Use What You Cannot Explain

    cs.LG 2024-07 unverdicted novelty 5.0

    xAI-Drop introduces an explainability-based topological dropping regularizer for GNNs that outperforms state-of-the-art dropping methods in accuracy and explanation quality on real-world datasets.

  23. Built Environment Reasoning from Remote Sensing Imagery Using Large Vision--Language Models

    cs.CL 2026-05 unverdicted novelty 3.0

    Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.