StitchVM stitches clean-image reward models with diffusion backbones to enable efficient value estimation for noisy latents, speeding up diffusion alignment methods like DPS by 3.2x and halving memory.
Deep unsupervised learning using nonequilibrium thermodynamics
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
Flow matching critics outperform monolithic ones in RL by 2x performance and 5x sample efficiency via test-time error recovery through integration and multi-point velocity supervision that preserves feature plasticity.
Directly predicting clean data with large-patch pixel Transformers enables strong generative performance in diffusion models where noise prediction fails at high dimensions.
citing papers explorer
-
Stitched Value Model for Diffusion Alignment
StitchVM stitches clean-image reward models with diffusion backbones to enable efficient value estimation for noisy latents, speeding up diffusion alignment methods like DPS by 3.2x and halving memory.
-
What Does Flow Matching Bring To TD Learning?
Flow matching critics outperform monolithic ones in RL by 2x performance and 5x sample efficiency via test-time error recovery through integration and multi-point velocity supervision that preserves feature plasticity.
-
Back to Basics: Let Denoising Generative Models Denoise
Directly predicting clean data with large-patch pixel Transformers enables strong generative performance in diffusion models where noise prediction fails at high dimensions.