Stochastic Bound Majorization
A majorization method for optimizing partition functions of log-linear models was recently proposed alongside a novel quadratic variational upper bound. In the batch setting, it outperformed state-of-the-art first- and second-order optimization methods on various learning tasks. We propose a stochastic version of this bound majorization method, as well as a low-rank modification for high-dimensional datasets. The resulting stochastic second-order method outperforms stochastic gradient descent (across variations and various tunings) in both the number of iterations and the computation time until convergence, while finding a better-quality parameter setting. The proposed method bridges first- and second-order stochastic optimization methods: it maintains a computational complexity that is linear in the data dimension while exploiting second-order information about the pseudo-global curvature of the objective function (as opposed to the local curvature in the Hessian).
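To make the "quadratic variational upper bound" concrete, the following is a minimal sketch of how such a bound on the log-partition function of a log-linear model, Z(theta) = sum_y h(y) exp(theta . f(y)), might be computed around a current estimate theta_tilde. The abstract does not spell out the recursion; the form used here is assumed from the earlier batch work it refers to. The function names (`quadratic_bound`, `bound_value`, `log_partition`) are hypothetical, and neither the stochastic nor the low-rank variant is shown.

```python
import math


def log_partition(theta, feats, h):
    """Exact log Z(theta) = log sum_y h(y) * exp(theta . f(y))."""
    return math.log(sum(hy * math.exp(sum(t * x for t, x in zip(theta, f)))
                        for f, hy in zip(feats, h)))


def quadratic_bound(theta_tilde, feats, h):
    """Return (z, mu, sigma) for the bound
    log Z(theta) <= log z + d.mu + 0.5 * d' sigma d,  with d = theta - theta_tilde.
    Assumed recursion: one pass over the configurations y."""
    dim = len(theta_tilde)
    z = 1e-300  # numerical stand-in for the limit z -> 0+
    mu = [0.0] * dim
    sigma = [[z if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for f, hy in zip(feats, h):
        alpha = hy * math.exp(sum(t * x for t, x in zip(theta_tilde, f)))
        l = [x - m for x, m in zip(f, mu)]
        r = math.log(alpha) - math.log(z)  # log(alpha / z), computed in the log domain
        # Curvature weight tanh(r/2) / (2r); its limit as r -> 0 is 1/4.
        w = 0.25 if abs(r) < 1e-8 else math.tanh(0.5 * r) / (2.0 * r)
        for i in range(dim):
            for j in range(dim):
                sigma[i][j] += w * l[i] * l[j]
        step = alpha / (z + alpha)
        mu = [m + step * li for m, li in zip(mu, l)]
        z += alpha
    return z, mu, sigma


def bound_value(theta, theta_tilde, z, mu, sigma):
    """Evaluate the quadratic upper bound at theta."""
    d = [t - tt for t, tt in zip(theta, theta_tilde)]
    lin = sum(di * mi for di, mi in zip(d, mu))
    quad = 0.5 * sum(d[i] * sigma[i][j] * d[j]
                     for i in range(len(d)) for j in range(len(d)))
    return math.log(z) + lin + quad
```

Under this sketch, the bound touches log Z at theta_tilde, and `mu` equals the expected feature vector there, so the bound shares the objective's gradient at that point. A majorization step would then minimize the resulting quadratic surrogate instead of following the raw gradient, which is where the curvature matrix `sigma` takes the place of the Hessian.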