hub Canonical reference

Diffusion language models are versatile protein learners

2024 · arXiv 2402.18567

Canonical reference. Every classified citation from Pith papers (5 of the 11 citing papers) cites this work as background.

11 Pith papers citing it

Background 100% of classified citations


citation-role summary

background 5

citation-polarity summary

background 5

representative citing papers

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion

q-bio.QM · 2026-05-05 · unverdicted · novelty 8.0

A-CODE presents a fully atomic one-stage multimodal diffusion model for protein co-design that claims superior unconditional generation performance over prior one- and two-stage models plus a tenfold success-rate gain on hard binder-design tasks.

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings

q-bio.QM · 2026-04-09 · unverdicted · novelty 7.0

Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.

Yeti: A compact protein structure tokenizer for reconstruction and multi-modal generation

q-bio.BM · 2026-05-11 · unverdicted · novelty 6.0

Yeti is a compact tokenizer for protein structures that delivers strong codebook use, token diversity, and reconstruction while enabling from-scratch multimodal generation of plausible sequences and structures with 10x fewer parameters than ESM3.

Primal-Dual Guided Decoding for Constrained Discrete Diffusion

cs.AI · 2026-05-10 · unverdicted · novelty 6.0

Primal-dual guided decoding casts constrained discrete diffusion as a KL-regularized optimization solved online with adaptive Lagrangian multipliers to satisfy constraints while staying close to the unconstrained model distribution.

Coupling Models for One-Step Discrete Generation

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Coupling Models enable single-step discrete sequence generation via learned couplings to Gaussian latents and outperform prior one-step baselines on text perplexity, biological FBD, and image FID metrics.

MP2D: Constrained Monte Carlo Tree-Guided Diffusion for Multi-Objective Protein Sequence Design

q-bio.BM · 2026-05-07 · unverdicted · novelty 6.0

MP2D is a framework that guides discrete diffusion denoising with constrained MCTS and Pareto rewards to optimize protein sequences for four to five simultaneous objectives, outperforming baselines on antimicrobial peptide and binder design tasks.

MIMIC: A Generative Multimodal Foundation Model for Biomolecules

cs.AI · 2026-04-27 · unverdicted · novelty 6.0

MIMIC is a split-track encoder-decoder foundation model that unifies sequence reconstruction, prediction, and constrained design across nucleic acids, proteins, and regulatory context using partially observed multimodal inputs.

A Unification of Discrete, Gaussian, and Simplicial Diffusion

cs.LG · 2025-12-17 · unverdicted · novelty 6.0

Discrete, Gaussian, and simplicial diffusion models for sequences are unified as parameterizations of the Wright-Fisher population genetics model, allowing multi-domain training and stable simplicial diffusion.

Co-Generative De Novo Functional Protein Design

q-bio.QM · 2026-05-01 · unverdicted · novelty 5.0

CodeFP jointly generates protein sequences and structures using functional local structures and auxiliary supervision, yielding 6.1% better functional consistency and 3.2% better foldability than prior baselines.

Towards A Generative Protein Evolution Machine with DPLM-Evo

cs.LG · 2026-04-30 · 2 refs

citing papers explorer

Showing 11 of 11 citing papers.

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion q-bio.QM · 2026-05-05 · unverdicted · none · ref 32
Large Language Diffusion Models cs.CL · 2025-02-14 · unverdicted · none · ref 77
Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings q-bio.QM · 2026-04-09 · unverdicted · none · ref 17
Yeti: A compact protein structure tokenizer for reconstruction and multi-modal generation q-bio.BM · 2026-05-11 · unverdicted · none · ref 33
Primal-Dual Guided Decoding for Constrained Discrete Diffusion cs.AI · 2026-05-10 · unverdicted · none · ref 54
Coupling Models for One-Step Discrete Generation cs.LG · 2026-05-08 · unverdicted · none · ref 65
MP2D: Constrained Monte Carlo Tree-Guided Diffusion for Multi-Objective Protein Sequence Design q-bio.BM · 2026-05-07 · unverdicted · none · ref 37
MIMIC: A Generative Multimodal Foundation Model for Biomolecules cs.AI · 2026-04-27 · unverdicted · none · ref 113
A Unification of Discrete, Gaussian, and Simplicial Diffusion cs.LG · 2025-12-17 · unverdicted · none · ref 51
Co-Generative De Novo Functional Protein Design q-bio.QM · 2026-05-01 · unverdicted · none · ref 10
Towards A Generative Protein Evolution Machine with DPLM-Evo cs.LG · 2026-04-30 · unreviewed · ref 54 · 2 links
