Ingraham, Max Baranov, Zak Costello, Karl W
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers:
- Are we really tilting? The mechanics of reward guidance in flow and diffusion models
  Finite-particle approximation of the Doob h-function causes reward hacking via two failure modes in reward-guided diffusion; a damping schedule corrects within-mode bias in Gaussian settings.
- Demystifying Multimodal Biomolecular Co-design With Intrinsic Geodesic Coupling
  GeoCoupling optimizes temporal couplings between modalities in biomolecular generative models and outperforms synchronous baselines on drug design and protein design tasks.
- AMix-2: Establishing Protein as a Native Modality in Large Language Models
  AMix-2 unifies protein sequences and text in one LLM via shared tokens and block-wise diffusion modeling, introduces the ProteinArena benchmark, and reports competitive performance against task-specific protein models and frontier LLMs.