EnergyFlow shows that denoising score matching on diffusion policies recovers the gradient of the expert's soft Q-function under maximum-entropy optimality, enabling non-adversarial reward extraction and improved policy generalization.
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 2
verdicts
UNVERDICTED: 2
representative citing papers
Longitudinal surveys show AI coding assistants reduce time on code writing but increase supervisory verification tasks, with stable productivity perceptions yet rising reports of worsened developer experience.
citing papers explorer
- Recovering Hidden Reward in Diffusion-Based Policies
EnergyFlow shows that denoising score matching on diffusion policies recovers the gradient of the expert's soft Q-function under maximum-entropy optimality, enabling non-adversarial reward extraction and improved policy generalization.
- The Impact of AI Coding Assistants on Software Engineering: A Longitudinal Study
Longitudinal surveys show AI coding assistants reduce time on code writing but increase supervisory verification tasks, with stable productivity perceptions yet rising reports of worsened developer experience.