A Proximal Block Coordinate Descent Algorithm for Deep Neural Network Training

Baoyuan Wu; Jinshan Zeng; Tim Tsz-Kit Lau; Yuan Yao

arxiv: 1803.09082 · v1 · pith:OIAZHYZMnew · submitted 2018-03-24 · 📊 stat.ML · cs.LG · math.OC




keywords: algorithm, dnns, training, backprop, block, coordinate, deep, descent


Training deep neural networks (DNNs) efficiently is a challenge due to the associated highly nonconvex optimization. The backpropagation (backprop) algorithm has long been the most widely used algorithm for computing gradients of DNN parameters, and it is used together with gradient descent-type algorithms for this optimization task. Recent work has empirically shown the efficiency of block coordinate descent (BCD) type methods for training DNNs. In view of this, we propose a novel algorithm based on the BCD method for training DNNs and provide its global convergence results, built upon the powerful framework of the Kurdyka-Lojasiewicz (KL) property. Numerical experiments on standard datasets demonstrate its competitive efficiency against standard optimizers with backprop.
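To make the idea of a proximal BCD update concrete, here is a minimal sketch on a toy problem. It is not the algorithm proposed in the paper, whose block structure and updates are not given in the abstract. The toy objective, the two scalar blocks `u` and `v`, and the parameters `target`, `lam`, and `gamma` are all assumptions chosen for illustration. The objective is nonconvex in `(u, v)` jointly but convex in each block, so each proximal block update has a closed form.

```python
# Toy illustration only: a generic two-block proximal BCD loop, not the paper's method.
# Objective (assumed for this sketch): 0.5*(u*v - target)^2 + 0.5*lam*(u^2 + v^2)


def objective(u, v, target, lam):
    """Bilinear least-squares objective with a small quadratic regularizer."""
    return 0.5 * (u * v - target) ** 2 + 0.5 * lam * (u * u + v * v)


def prox_block_update(other, old, target, lam, gamma):
    """Minimize the objective over one block, with the other block fixed,
    plus the proximal term 0.5*gamma*(block - old)^2.

    Setting the derivative to zero gives
        other*(block*other - target) + lam*block + gamma*(block - old) = 0,
    which is solved in closed form below.
    """
    return (other * target + gamma * old) / (other * other + lam + gamma)


def proximal_bcd(u, v, target=2.0, lam=0.1, gamma=1.0, iters=50):
    """Alternate proximal updates of u and v, recording the objective value."""
    history = [objective(u, v, target, lam)]
    for _ in range(iters):
        u = prox_block_update(v, u, target, lam, gamma)  # update block u, v fixed
        v = prox_block_update(u, v, target, lam, gamma)  # update block v, new u fixed
        history.append(objective(u, v, target, lam))
    return u, v, history


if __name__ == "__main__":
    u, v, history = proximal_bcd(1.0, 0.5)
    print(f"u={u:.4f}, v={v:.4f}, initial={history[0]:.4f}, final={history[-1]:.4f}")
```

Because each block update exactly minimizes the objective plus a nonnegative proximal term, the recorded objective values never increase from one sweep to the next. No gradient of the full objective is backpropagated in this sketch.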
