Stochastic Alternating Direction Method of Multipliers with Variance Reduction for Nonconvex Optimization

Feihu Huang; Songcan Chen; Zhaosong Lu

arxiv: 1610.02758 · v5 · pith:QMTO7JXXnew · submitted 2016-10-10 · 🧮 math.OC · stat.ML

Stochastic Alternating Direction Method of Multipliers with Variance Reduction for Nonconvex Optimization

Feihu Huang , Songcan Chen , Zhaosong Lu This is my paper

classification 🧮 math.OC stat.ML

keywords stochasticnonconvexvarianceadmmgradientsiterationmethodmethods

0 comments

read the original abstract

In the paper, we study the stochastic alternating direction method of multipliers (ADMM) for the nonconvex optimizations, and propose three classes of the nonconvex stochastic ADMM with variance reduction, based on different reduced variance stochastic gradients. Specifically, the first class called the nonconvex stochastic variance reduced gradient ADMM (SVRG-ADMM), uses a multi-stage scheme to progressively reduce the variance of stochastic gradients. The second is the nonconvex stochastic average gradient ADMM (SAG-ADMM), which additionally uses the old gradients estimated in the previous iteration. The third called SAGA-ADMM is an extension of the SAG-ADMM method. Moreover, under some mild conditions, we establish the iteration complexity bound of $O(1/\epsilon)$ of the proposed methods to obtain an $\epsilon$-stationary solution of the nonconvex optimizations. In particular, we provide a general framework to analyze the iteration complexity of these nonconvex stochastic ADMM methods with variance reduction. Finally, some numerical experiments demonstrate the effectiveness of our methods.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I
math.OC 2019-07 unverdicted novelty 6.0

The paper develops two variants of a distributed inexact SCA-ADMM algorithm and proves first-order convergence rate guarantees under mild assumptions for non-convex problems with robustness to errors and delays.