Meta-Learning for Stochastic Gradient MCMC

Wenbo Gong, Yingzhen Li, José Miguel Hernández-Lobato

classification: stat.ML, cs.LG

keywords: sg-mcmc, bayesian, network, neural, sampler, dynamics, generalizes, gradient

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) has become increasingly popular for simulating posterior samples in large-scale Bayesian modeling. However, existing SG-MCMC schemes are not tailored to any specific probabilistic model, and even a simple modification of the underlying dynamical system requires significant physical intuition. This paper presents the first meta-learning algorithm that allows automated design of the underlying continuous dynamics of an SG-MCMC sampler. The learned sampler generalizes Hamiltonian dynamics with state-dependent drift and diffusion, enabling fast traversal and efficient exploration of neural network energy landscapes. Experiments validate the proposed approach on both Bayesian fully connected neural network and Bayesian recurrent neural network tasks, showing that the learned sampler outperforms generic, hand-designed SG-MCMC algorithms and generalizes to different datasets and larger architectures.
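
For orientation, the kind of generic, hand-designed SG-MCMC baseline the abstract refers to can be sketched with a basic stochastic gradient Hamiltonian Monte Carlo (SGHMC) update. This is not the paper's learned sampler: the friction and noise scale here are fixed constants, whereas the abstract describes state-dependent drift and diffusion. The function names, the 1D Gaussian target, the synthetic gradient noise, and all hyperparameter values are assumptions chosen for illustration.

```python
import math
import random


def sghmc_sample(grad_u, theta0, step_size=0.05, friction=1.0,
                 n_steps=100000, burn_in=5000, seed=0):
    """Basic SGHMC on a scalar state with unit mass (illustrative only)."""
    rng = random.Random(seed)
    theta, momentum = theta0, 0.0
    # Injected noise has variance 2 * friction * step_size per step.
    noise_std = math.sqrt(2.0 * friction * step_size)
    samples = []
    for step in range(n_steps):
        # Position moves along the current momentum.
        theta += step_size * momentum
        # Momentum feels the noisy gradient force, friction, and injected noise.
        momentum += (-step_size * grad_u(theta, rng)
                     - step_size * friction * momentum
                     + noise_std * rng.gauss(0.0, 1.0))
        if step >= burn_in:
            samples.append(theta)
    return samples


def noisy_grad(theta, rng, mean=1.0, grad_noise_std=0.5):
    # Hypothetical target: U(theta) = (theta - mean)^2 / 2, i.e. N(mean, 1).
    # The added Gaussian term stands in for minibatch gradient noise.
    return (theta - mean) + grad_noise_std * rng.gauss(0.0, 1.0)


samples = sghmc_sample(noisy_grad, theta0=0.0)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(f"mean={sample_mean:.3f} var={sample_var:.3f}")
```

The fixed `friction` and `noise_std` are the hand-set parts of the dynamics; under the abstract's description, a meta-learned sampler would replace such constants with learned, state-dependent terms.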

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Adaptive Meta-Learning Stochastic Gradient Hamiltonian Monte Carlo Simulation for Bayesian Updating of Structural Dynamic Models
stat.AP 2026-04 unverdicted novelty 7.0

AM-SGHMC combines adaptive neural networks with SGHMC to produce a reusable MCMC sampler for Bayesian updating of similar structural dynamic models without per-task retraining.

MCMC with Adaptive Principal-Component Transformation: Rotation-Invariant Universal Samplers for Bayesian Structural System Identification
stat.AP 2026-04 unverdicted novelty 7.0

APM-SGHMC achieves zero-shot generalization in MCMC sampling for Bayesian system identification by adaptively aligning with principal components to enforce translation, scale, and rotation invariance.