Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines

Aaron Courville; Guillaume Desjardins; Razvan Pascanu; Yoshua Bengio

arxiv: 1301.3545 · v2 · pith:2SXYCEFTnew · submitted 2013-01-16 · cs.LG · cs.NE · stat.ML

Guillaume Desjardins, Razvan Pascanu, Aaron Courville, Yoshua Bengio

classification cs.LG · cs.NE · stat.ML

keywords boltzmann · gradient · natural · algorithm · function · joint-training · machines · method

This paper introduces the Metric-Free Natural Gradient (MFNG) algorithm for training Boltzmann Machines. Similar in spirit to the Hessian-Free method of Martens [8], our algorithm belongs to the family of truncated Newton methods and exploits an efficient matrix-vector product to avoid explicitly storing the natural gradient metric $L$. This metric is shown to be the expected second derivative of the log-partition function (under the model distribution), or equivalently, the variance of the vector of partial derivatives of the energy function. We evaluate our method on the task of joint-training a 3-layer Deep Boltzmann Machine and show that MFNG does indeed have faster per-epoch convergence compared to Stochastic Maximum Likelihood with centering, though wall-clock performance is currently not competitive.
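
The matrix-vector product described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the metric $L$ is estimated as the sample covariance of the energy gradients evaluated at samples from the model, so that $Lv$ can be computed from those gradients without ever forming $L$. The function names, the damping term, and the choice of plain conjugate gradient as the linear solver are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def metric_vector_product(energy_grads, v, damping=0.0):
    """Return (L + damping * I) @ v without forming the matrix L.

    energy_grads has shape (n_samples, n_params); row i is assumed to hold the
    gradient of the energy with respect to the parameters at model sample i.
    L is estimated as the (biased) sample covariance of those rows.
    """
    centered = energy_grads - energy_grads.mean(axis=0)
    projections = centered @ v  # one scalar per sample
    return centered.T @ projections / energy_grads.shape[0] + damping * v


def conjugate_gradient(matvec, b, max_iters=50, tol=1e-10):
    """Approximately solve A x = b for symmetric positive-definite A,
    where A is only available through the function matvec(v) = A @ v."""
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b.copy()  # residual b - A x, with x = 0 initially
    p = r.copy()  # current search direction
    rs_old = r @ r
    for _ in range(max_iters):
        if rs_old < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def natural_gradient_direction(energy_grads, grad, damping=1e-3, max_iters=50):
    """Approximate (L + damping * I)^{-1} grad using only matrix-vector products."""
    def matvec(v):
        return metric_vector_product(energy_grads, v, damping)
    return conjugate_gradient(matvec, grad, max_iters=max_iters)
```

Capping `max_iters` below the number of parameters yields a truncated solve, which is what places this kind of method in the truncated Newton family; the damping term is an assumption added here to keep the sample-based system well conditioned.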
