Deep Directed Generative Models with Energy-Based Probability Estimation

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Training energy-based probabilistic models is confronted with apparently intractable sums, whose Monte Carlo estimation requires sampling from the estimated probability distribution in the inner loop of training. This can be approximately achieved by Markov chain Monte Carlo methods, but may still face a formidable obstacle that is the difficulty of mixing between modes with sharp concentrations of probability. Whereas an MCMC process is usually derived from a given energy function based on mathematical considerations and requires an arbitrarily long time to obtain good and varied samples, we propose to train a deep directed generative model (not a Markov chain) so that its sampling distribution approximately matches the energy function that is being trained. Inspired by generative adversarial networks, the proposed framework involves training of two models that represent dual views of the estimated probability distribution: the energy function (mapping an input configuration to a scalar energy value) and the generator (mapping a noise vector to a generated configuration), both represented by deep neural networks.
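The two-model setup described in the abstract can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration, not the paper's implementation: the energy is a one-parameter quadratic, E(x) = 0.5 * (x - theta)^2, the generator is a one-parameter shift, G(z) = phi + z, and the function name `train_toy_ebm` is hypothetical. The energy parameter follows the usual Monte Carlo gradient of the negative log-likelihood (data term minus model term), with generator samples standing in for MCMC samples. The generator is moved toward low-energy regions. Because the noise scale is fixed, the generator's entropy is constant, so no entropy term appears in this toy version.

```python
import random


def train_toy_ebm(data_mean=3.0, steps=2000, batch=64, lr=0.05, seed=0):
    """Toy 1-D sketch: energy E(x) = 0.5*(x - theta)^2, generator G(z) = phi + z.

    Both parameterisations are illustrative assumptions, not the paper's networks.
    """
    rng = random.Random(seed)
    theta = 0.0  # energy-function parameter
    phi = 0.0    # generator parameter

    for _ in range(steps):
        # Positive phase: samples from the (synthetic) data distribution.
        data = [rng.gauss(data_mean, 1.0) for _ in range(batch)]
        # Negative phase: samples from the generator replace MCMC samples.
        fake = [phi + rng.gauss(0.0, 1.0) for _ in range(batch)]

        # dE/dtheta = theta - x; NLL gradient = data average - model average.
        pos = sum(theta - x for x in data) / batch
        neg = sum(theta - x for x in fake) / batch
        theta -= lr * (pos - neg)

        # Generator step: descend the mean energy of its own samples.
        # dE(G(z))/dphi = G(z) - theta.
        grad_phi = sum(x - theta for x in fake) / batch
        phi -= lr * grad_phi

    return theta, phi


if __name__ == "__main__":
    theta, phi = train_toy_ebm()
    print(f"theta={theta:.2f}, phi={phi:.2f}")
```

In this toy setting both parameters should drift toward the data mean, since the energy is lowered where data lies and raised where the generator currently samples, while the generator chases the low-energy region.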

fields

cs.LG 2

years

2025 1 · 2023 1

verdicts

UNVERDICTED 2

representative citing papers

Contrastive Residual Energy Test-time Adaptation

cs.LG · 2025-05-26 · unverdicted · novelty 7.0

CreTTA reformulates test-time adaptation of marginal distributions as residual energy learning, producing a contrastive objective that cancels the partition function and uses relative energy differences for adaptive gradient reweighting to avoid overfitting.

citing papers explorer

Showing 2 of 2 citing papers.

  • Contrastive Residual Energy Test-time Adaptation · cs.LG · 2025-05-26 · unverdicted · none · ref 5 · internal anchor

    CreTTA reformulates test-time adaptation of marginal distributions as residual energy learning, producing a contrastive objective that cancels the partition function and uses relative energy differences for adaptive gradient reweighting to avoid overfitting.

  • Explaining the effects of non-convergent sampling in the training of Energy-Based Models · cs.LG · 2023-01-23 · unverdicted · none · ref 23 · internal anchor

    EBMs trained with non-persistent short runs reproduce empirical data statistics via a precise dynamical process, not the equilibrium measure.