DiGress: Discrete Denoising diffusion for graph generation
This work introduces DiGress, a discrete denoising diffusion model for generating graphs with categorical node and edge attributes. Our model utilizes a discrete diffusion process that progressively edits graphs with noise, through the process of adding or removing edges and changing the categories. A graph transformer network is trained to revert this process, simplifying the problem of distribution learning over graphs into a sequence of node and edge classification tasks. We further improve sample quality by introducing a Markovian noise model that preserves the marginal distribution of node and edge types during diffusion, and by incorporating auxiliary graph-theoretic features. A procedure for conditioning the generation on graph-level features is also proposed. DiGress achieves state-of-the-art performance on molecular and non-molecular datasets, with up to 3x validity improvement on a planar graph dataset. It is also the first model to scale to the large GuacaMol dataset containing 1.3M drug-like molecules without the use of molecule-specific representations.
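The marginal-preserving noise model mentioned in the abstract can be illustrated with a small sketch. The abstract only states the property (node and edge type marginals are preserved during diffusion); the specific transition matrix used here, Q = alpha * I + (1 - alpha) * 1 m^T, is one simple matrix with that property and is an assumption of this sketch, as are the function names, the value of alpha, and the example edge-type marginals. It is not taken from the DiGress codebase.

```python
import random

def marginal_transition(marginals, alpha):
    """Row-stochastic matrix Q = alpha * I + (1 - alpha) * 1 m^T (assumed form)."""
    k = len(marginals)
    return [[alpha * (1.0 if i == j else 0.0) + (1.0 - alpha) * marginals[j]
             for j in range(k)] for i in range(k)]

def apply_noise(labels, q, rng):
    """Resample each categorical label independently from its row q[label]."""
    k = len(q)
    return [rng.choices(range(k), weights=q[c])[0] for c in labels]

def push_forward(dist, q):
    """Category distribution after one noising step (row vector times Q)."""
    k = len(q)
    return [sum(dist[i] * q[i][j] for i in range(k)) for j in range(k)]

# Hypothetical edge-type marginals: no bond, single, double, triple.
edge_marginals = [0.90, 0.07, 0.02, 0.01]
Q = marginal_transition(edge_marginals, alpha=0.8)

# Noise a toy list of edge labels; most stay unchanged when alpha is large.
rng = random.Random(0)
noisy = apply_noise([0, 1, 0, 2, 0, 0], Q, rng)

# The marginal distribution is a fixed point of the noising step.
after = push_forward(edge_marginals, Q)
```

Because the marginals sum to one, multiplying them by Q returns the same vector, so repeated noising drifts each label toward the dataset's own type frequencies rather than toward a uniform distribution. In a sparse graph this means the noised graphs stay sparse, which is the intuition behind preserving marginals.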
Forward citations
Cited by 23 Pith papers
-
Set Diffusion: Interpolating Token Orderings Between Autoregression and Diffusion for Fast and Flexible Decoding
Set diffusion factorizes likelihood over arbitrary token sets and uses a set-causal diffusion architecture to support KV caching and any-order decoding, yielding improved speed-quality tradeoffs versus prior diffusion LMs.
-
Interpretable Meta-Learning for Multi-Objective Chemical Search
Linear meta-learning surrogates trained across chemical objectives and auxiliary properties adapt rapidly to new multi-objective molecular searches and outperform baselines by 78% in Pareto performance on spin-crossov...
-
Variational Learning for Insertion-based Generation
Introduces the Insertion Process model for variable-length non-monotonic sequence generation via a bijective permutation mapping and permutation-based variational inference.
-
Adaptive Order Policies for Masked Diffusion
A policy network learns to choose unmasking order in masked diffusion by reweighting the loss, outperforming random and heuristic baselines on ordering-sensitive tasks.
-
DGLD: Domain-Gated Latent Diffusion for the Discovery of Novel Energetic Materials
DGLD applies domain-gated latent diffusion with label-quality gating and multi-task guidance to discover 12 novel energetic material leads validated by DFT, outperforming SMILES-LSTM, SELFIES-GA, and REINVENT baseline...
-
One Pass for All: A Discrete Diffusion Model for Knowledge Graph Triple Set Prediction
DiffTSP applies discrete diffusion to knowledge graph triple set prediction, recovering all missing triples simultaneously via edge-masking noise reversal and a structure-aware transformer, achieving SOTA on three datasets.
-
SynHAT: A Two-stage Coarse-to-Fine Diffusion Framework for Synthesizing Human Activity Traces
SynHAT uses a novel two-stage spatio-temporal diffusion framework with Latent Spatio-Temporal U-Net to synthesize realistic human activity traces, outperforming baselines by 52% on spatial and 33% on temporal metrics ...
-
LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling
LangFlow is the first continuous diffusion language model to rival discrete diffusion on perplexity and generative perplexity while exceeding autoregressive baselines on several zero-shot tasks.
-
GraphWeave: Interpretable and Robust Graph Generation via Random Walk Trajectories
GraphWeave learns graph family patterns via random walk trajectories and reconstructs new graphs through joint optimization, outperforming diffusion baselines on benchmarks for structures like communities and degree d...
-
Accelerating Discrete Diffusion Models with Parallel-In-Time Sampling
A parallel-in-time τ-leaping sampler for absorbing discrete diffusion models is introduced, with an exponential-factorial convergence proof and empirical speedups of 7-9× on synthetic tasks and 1.45-1.86× on image/tex...
-
Estimation–Prediction Tradeoff in Causal Probabilistic Temporal Graphs
Characterizes an estimation-prediction tradeoff in binary logistic models for causal probabilistic temporal graphs and proposes a framework to jointly evaluate temporal link prediction with causal parameter recovery v...
-
Modular Diffusion Models for Structured Visual Recognition
Modular Diffusion Models decompose diffusion into task-specific modules to model distributions over structured visual outputs for detection, segmentation, and scene graph generation.
-
Streamlining Analysis and Design of Two-Dimensional Electronic Spectroscopy using Machine Learning
A Gaussian mixture model is used to learn spectral densities from 2DES experiments, enabling extraction of vibronic couplings, spectral extrapolation, and optimized experiment selection across simulated and experiment...
-
Generative Molecular Morphing for Flexible-Size Design via Unbalanced Optimal Transport
Morph is a flexible-size 3D molecular generative model using unbalanced optimal transport on geometric graphs that matches fixed-size SOTA performance while enabling out-of-distribution generation.
-
FlashMol: High-Quality Molecule Generation in as Few as Four Steps
FlashMol produces chemically valid 3D molecules in 4 steps via distribution matching distillation with respaced timesteps and Jensen-Shannon regularization, matching or exceeding 1000-step teacher performance on QM9 a...
-
GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model
GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
-
Interpolating Discrete Diffusion Models with Controllable Resampling
IDDM interpolates diffusion transitions with a resampling mechanism to lessen dependence on intermediate latents and improve sample quality over masked and uniform discrete diffusion models.
-
Discrete Bayesian Sample Inference for Graph Generation
GraphBSI uses Bayesian Sample Inference as noise-controlled SDEs to generate discrete graphs in one shot, achieving state-of-the-art results on molecular benchmarks Moses and GuacaMol.
-
Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space
Proposes Latent Interacting Particle Systems with an efficient parameterization of twist potentials to enable approximate posterior inference for coupled continuous-time hidden Markov models via twisted sequential Mon...
-
SpectraLLM: Uncovering the Ability of LLMs for Molecular Structure Elucidation from Multi-Spectral Data
SpectraLLM is an LLM fine-tuned to predict small-molecule structures from single or multiple spectra, reporting state-of-the-art results on four public benchmarks with gains from multi-modal input.
-
Graph Defense Diffusion Model
GDDM is a diffusion-based purification method with a graph structure refiner and node feature regularizer that defends against multiple adversarial attack types on graphs.
-
xAI-Drop: Don't Use What You Cannot Explain
xAI-Drop introduces an explainability-based topological dropping regularizer for GNNs that outperforms state-of-the-art dropping methods in accuracy and explanation quality on real-world datasets.
-
Built Environment Reasoning from Remote Sensing Imagery Using Large Vision–Language Models
Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.