STREAM decouples text and music conditioning in a diffusion transformer via AdaLN for structure and BEAM for beats, plus new Motorica++ dataset and editability metrics, claiming SOTA music alignment with preserved semantics.
hub
How to train your energy-based models
17 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 2representative citing papers
Energy-navigated trajectory shaping during training produces 8-step discrete flow matching students that achieve 32% lower perplexity than 1024-step teachers on 170M language models with unchanged inference cost.
Patient-specific energy manifolds from baseline mpMRI scans act as fixed geometric references to monitor longitudinal evolution of voxel distributions in sequence space for neuro-oncology proof-of-concept cases.
Langevin sampling on the modern Hopfield energy produces training-free stochastic attention that transitions from exact retrieval to generation as temperature rises, with an entropy inflection condition marking the shift.
CreTTA reformulates test-time adaptation of marginal distributions as residual energy learning, producing a contrastive objective that cancels the partition function and uses relative energy differences for adaptive gradient reweighting to avoid overfitting.
Derives Õ(d β² A² / ε⁴) oracle complexity for AIS estimating normalizing constant Z to relative error ε and introduces reverse diffusion sampler for geometric paths with large action.
Stochastic interpolants unify flow-based and diffusion-based generative models by bridging target densities exactly via latent-variable processes whose drifts minimize quadratic objectives.
SDEdit performs guided image synthesis and editing by adding noise to inputs and refining them via denoising with a diffusion model's SDE prior, outperforming GAN methods in human studies without task-specific training.
The generalization advantage of SGD over random sampling diminishes with growing training set size in binary networks, as measured by joint density of states over train and test accuracy.
Explicit MSE bounds derived for time-average estimators in adaptive increasingly rare MCMC under simultaneous Wasserstein contraction.
Decentralized diffusion policies trained with importance sampling score matching enhance exploration and performance in cooperative MARL over Gaussian policy baselines.
Training and sampling in static scalar energy generative models are two instances of the same Lyapunov-driven density transport dynamics on Wasserstein space, differing only by initial condition, which yields a finite stopping criterion for Langevin sampling and additive composition rules that keep
CLIC uses set-valued action targets from interactive human corrections instead of pointwise labels to train more robust imitation learning policies.
Edwin integrates dynamic maximum entropy dimensionality reduction with symbolic regression to recover physically interpretable low-dimensional dynamics from high-dimensional observations that generalize to unseen conditions.
Score-difference flow reduces KL divergence between distributions and is formally equivalent to denoising diffusion models and a hidden subproblem in optimal GAN training under stated conditions.
The tutorial synthesizes diffusion model techniques for generative semantic communications to achieve high compression while preserving meaning in wireless transmission.
citing papers explorer
-
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
SDEdit performs guided image synthesis and editing by adding noise to inputs and refining them via denoising with a diffusion model's SDE prior, outperforming GAN methods in human studies without task-specific training.