How to train your energy-based models

Yang Song, Diederik P Kingma · 2021 · arXiv 2101.03288

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 1 · unclear 1

representative citing papers

Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Energy-navigated trajectory shaping during training produces 8-step discrete flow matching students that achieve 32% lower perplexity than 1024-step teachers on 170M language models with unchanged inference cost.

Energy-based Tissue Manifolds for Longitudinal Multiparametric MRI Analysis

cs.CV · 2026-04-08 · unverdicted · novelty 7.0 · 2 refs

Patient-specific energy manifolds from baseline mpMRI scans act as fixed geometric references to monitor longitudinal evolution of voxel distributions in sequence space for neuro-oncology proof-of-concept cases.

Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy

cs.LG · 2026-03-06 · unverdicted · novelty 7.0

Langevin sampling on the modern Hopfield energy produces training-free stochastic attention that transitions from exact retrieval to generation as temperature rises, with an entropy inflection condition marking the shift.

Contrastive Residual Energy Test-time Adaptation

cs.LG · 2025-05-26 · unverdicted · novelty 7.0

CreTTA reformulates test-time adaptation of marginal distributions as residual energy learning, producing a contrastive objective that cancels the partition function and uses relative energy differences for adaptive gradient reweighting to avoid overfitting.

Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond

stat.ML · 2025-02-07 · unverdicted · novelty 7.0

Derives an Õ(d β² A² / ε⁴) oracle complexity for AIS to estimate the normalizing constant Z to relative error ε, and introduces a reverse diffusion sampler for geometric paths with large action.

Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

cs.LG · 2023-03-15 · unverdicted · novelty 7.0

Stochastic interpolants unify flow-based and diffusion-based generative models by bridging target densities exactly via latent-variable processes whose drifts minimize quadratic objectives.

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

cs.CV · 2021-08-02 · conditional · novelty 7.0

SDEdit performs guided image synthesis and editing by adding noise to inputs and refining them via denoising with a diffusion model's SDE prior, outperforming GAN methods in human studies without task-specific training.

Decentralized Diffusion Policy Learning for Enhanced Exploration in Cooperative Multi-agent Reinforcement Learning

cs.MA · 2026-05-08 · unverdicted · novelty 6.0

Decentralized diffusion policies trained with importance sampling score matching enhance exploration and performance in cooperative MARL over Gaussian policy baselines.

Energy Generative Modeling: A Lyapunov-based Energy Matching Perspective

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Training and sampling in static scalar energy generative models are two instances of the same Lyapunov-driven density transport dynamics on Wasserstein space, differing only by initial condition, which yields a finite stopping criterion for Langevin sampling and additive composition rules that keep

From Action Labels to Sets: Rethinking Action Supervision for Imitation Learning from Corrective Feedback

cs.RO · 2025-02-11 · unverdicted · novelty 6.0

CLIC uses set-valued action targets from interactive human corrections instead of pointwise labels to train more robust imitation learning policies.

Discovering interpretable low-dimensional dynamics using maximum entropy

q-bio.QM · 2026-05-16 · unverdicted · novelty 5.0

Edwin integrates dynamic maximum entropy dimensionality reduction with symbolic regression to recover physically interpretable low-dimensional dynamics from high-dimensional observations that generalize to unseen conditions.

The Score-Difference Flow for Implicit Generative Modeling

cs.LG · 2023-04-25 · unverdicted · novelty 5.0

Score-difference flow reduces KL divergence between distributions and is formally equivalent to denoising diffusion models and a hidden subproblem in optimal GAN training under stated conditions.

Generative AI Meets 6G and Beyond: Diffusion Models for Semantic Communications

eess.SP · 2025-11-11 · unverdicted · novelty 3.0

The tutorial synthesizes diffusion model techniques for generative semantic communications to achieve high compression while preserving meaning in wireless transmission.

Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction

cs.LG · 2025-12-17

citing papers explorer

Showing 14 of 14 citing papers.

Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation
cs.LG · 2026-05-08 · unverdicted · none · ref 75
Energy-based Tissue Manifolds for Longitudinal Multiparametric MRI Analysis
cs.CV · 2026-04-08 · unverdicted · none · ref 7 · 2 links
Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy
cs.LG · 2026-03-06 · unverdicted · none · ref 10
Contrastive Residual Energy Test-time Adaptation
cs.LG · 2025-05-26 · unverdicted · none · ref 8
Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond
stat.ML · 2025-02-07 · unverdicted · none · ref 91
Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
cs.LG · 2023-03-15 · unverdicted · none · ref 11
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
cs.CV · 2021-08-02 · conditional · none · ref 12
Decentralized Diffusion Policy Learning for Enhanced Exploration in Cooperative Multi-agent Reinforcement Learning
cs.MA · 2026-05-08 · unverdicted · none · ref 9
Energy Generative Modeling: A Lyapunov-based Energy Matching Perspective
cs.LG · 2026-05-07 · unverdicted · none · ref 31
From Action Labels to Sets: Rethinking Action Supervision for Imitation Learning from Corrective Feedback
cs.RO · 2025-02-11 · unverdicted · none · ref 17
Discovering interpretable low-dimensional dynamics using maximum entropy
q-bio.QM · 2026-05-16 · unverdicted · none · ref 79
The Score-Difference Flow for Implicit Generative Modeling
cs.LG · 2023-04-25 · unverdicted · none · ref 12
Generative AI Meets 6G and Beyond: Diffusion Models for Semantic Communications
eess.SP · 2025-11-11 · unverdicted · none · ref 32
Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction
cs.LG · 2025-12-17 · unreviewed · ref 39
