Local Convergence Properties of SAGA/Prox-SVRG and Acceleration
Over the past ten years, driven by large-scale optimisation problems arising from machine learning, the development of stochastic optimisation methods has witnessed tremendous growth. However, despite their popularity, the theoretical understanding of these methods is quite limited in contrast to that of deterministic optimisation methods. In this paper, we present a local convergence analysis for a typical class of stochastic optimisation methods, namely proximal variance-reduced stochastic gradient methods, focusing mainly on the SAGA [12] and Prox-SVRG [43] algorithms. Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of the SAGA/Prox-SVRG algorithms: (i) the sequences generated by SAGA/Prox-SVRG identify the smooth manifold in a finite number of iterations; (ii) the sequences then enter a local linear convergence regime. Beyond the local convergence analysis, we also discuss various possibilities for accelerating these algorithms, including adapting to better local parameters and applying higher-order deterministic/stochastic optimisation methods that can achieve super-linear convergence. Concrete examples arising from machine learning are considered to verify the obtained results.
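To make the identification claim in (i) concrete, the following is a minimal sketch of a proximal SAGA iteration on an l1-regularised least-squares (lasso) problem, where the manifold to be identified corresponds to the support of the solution. Everything here is an illustrative assumption rather than the paper's own setup: the synthetic data, the helper names (`soft_threshold`, `objective`, `make_problem`, `prox_saga`), the regularisation weight, and the step size 1/(3L), which is a common choice for SAGA.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t*|.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(A, b, lam, x):
    """(1/n) * sum_i 0.5*(a_i^T x - b_i)^2 + lam * ||x||_1."""
    n = len(A)
    loss = sum(0.5 * (sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b)) / n
    return loss + lam * sum(abs(xi) for xi in x)


def make_problem(n=50, d=10, seed=0):
    """Hypothetical synthetic data: Gaussian rows and a 3-sparse ground truth."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    x_true = [0.0] * d
    x_true[0], x_true[3], x_true[7] = 2.0, -1.5, 1.0
    b = [sum(a * xi for a, xi in zip(row, x_true)) for row in A]
    return A, b


def prox_saga(A, b, lam, n_iters, seed=0):
    """Proximal SAGA for the lasso; returns the last iterate and its support history."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    L = max(sum(a * a for a in row) for row in A)  # Lipschitz constant of each grad f_i
    gamma = 1.0 / (3.0 * L)                        # assumed step size
    x = [0.0] * d
    # grad f_i(x) = (a_i^T x - b_i) * a_i, so one stored scalar residual per sample suffices.
    resid = [-bi for bi in b]                      # residuals at x = 0
    g_avg = [sum(resid[i] * A[i][k] for i in range(n)) / n for k in range(d)]
    supports = []
    for _ in range(n_iters):
        j = rng.randrange(n)
        r_new = sum(a * xi for a, xi in zip(A[j], x)) - b[j]
        delta = r_new - resid[j]
        # Variance-reduced direction: grad_j(x) - stored grad_j + average of stored gradients.
        x = [soft_threshold(x[k] - gamma * (delta * A[j][k] + g_avg[k]), gamma * lam)
             for k in range(d)]
        # Update the gradient table and its running average.
        g_avg = [g_avg[k] + delta * A[j][k] / n for k in range(d)]
        resid[j] = r_new
        supports.append(frozenset(k for k in range(d) if x[k] != 0.0))
    return x, supports


A, b = make_problem()
x, supports = prox_saga(A, b, lam=0.1, n_iters=5000)
print("final support:", sorted(supports[-1]))
print("objective:", objective(A, b, 0.1, x))
```

Inspecting `supports` over the iterations gives an informal view of the two regimes described in the abstract: when identification occurs, the support stops changing after finitely many steps, and past that point the iteration behaves like a smooth method restricted to the identified coordinates.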