Stochastic Subspace Descent

Alireza Doostan; David Kozak; Luis Tenorio; Stephen Becker

arxiv: 1904.01145 · v2 · pith:WMG6U4TNnew · submitted 2019-04-01 · 🧮 math.OC

classification 🧮 math.OC

keywords descent · optimization · stochastic · svrg · algorithm · analysis · converge · function

Abstract

We present two stochastic descent algorithms that apply to unconstrained optimization and are particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The basic algorithm projects the gradient onto a random subspace at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. This algorithm is previously known but we provide new analysis. We also extend the popular SVRG method to this framework but without requiring that the objective function be written as a finite sum. We provide proofs of convergence for our methods under various convexity assumptions and show favorable results when compared to gradient descent and BFGS on non-convex problems from the machine learning and shape optimization literature. We also note that our analysis gives a proof that the iterates of SVRG and several other popular first-order stochastic methods, in their original formulation, converge almost surely to the optimum; to our knowledge, prior to this work the iterates of SVRG had only been known to converge in expectation.
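The basic iteration described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions the abstract does not state: the random subspace is drawn by taking the QR factorization of a Gaussian matrix, the basis is scaled by sqrt(d / ell) so that the projected step equals the full gradient step in expectation, directional derivatives are estimated with forward finite differences, and the step size is fixed. The names `random_subspace`, `subspace_descent`, `ell`, `step`, and `fd_eps` are hypothetical.

```python
import numpy as np


def random_subspace(d, ell, rng):
    """Return a d-by-ell matrix P with P.T @ P = (d / ell) * I.

    Assumption: the subspace is drawn via QR of a Gaussian matrix, and the
    sqrt(d / ell) scaling is chosen so that E[P @ P.T] is the identity.
    """
    gaussian = rng.standard_normal((d, ell))
    q, _ = np.linalg.qr(gaussian)  # reduced QR: q has ell orthonormal columns
    return np.sqrt(d / ell) * q


def subspace_descent(f, x0, ell, step, iters, fd_eps=1e-6, seed=0):
    """Descend along the gradient projected onto a fresh random subspace.

    Only function values are used: each iteration costs ell + 1 evaluations
    of f, compared with d + 1 for a full forward-difference gradient.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    for _ in range(iters):
        p = random_subspace(d, ell, rng)
        fx = f(x)
        # Forward-difference estimate of the directional derivatives P^T grad f(x).
        dir_derivs = np.array(
            [(f(x + fd_eps * p[:, j]) - fx) / fd_eps for j in range(ell)]
        )
        # Step along P (P^T grad f(x)), the projected gradient direction.
        x = x - step * (p @ dir_derivs)
    return x
```

For the toy objective f(x) = 0.5 * ||x||^2 in d = 20 dimensions with ell = 5, choosing `step = ell / d` makes each update an orthogonal projection away from the sampled subspace, up to finite-difference error, so the iterates shrink toward the minimizer at the origin.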

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Factor Augmented High-Dimensional SGD
stat.ML · 2026-05 · unverdicted · novelty 6.0

Proposes Factor-Augmented SGD that runs on streaming high-dimensional data and supplies the first convergence analysis explicitly accounting for latent-factor estimation error.