pith. sign in

arxiv: 1901.07648 · v2 · pith:GA3DW32Dnew · submitted 2019-01-22 · 🧮 math.OC · cs.LG· stat.ML

Finite-Sum Smooth Optimization with SARAH

classification 🧮 math.OC cs.LGstat.ML
keywords epsiloncomplexityconvergenceconvexoptimizationsarahtotalalgorithm
0
0 comments X
read the original abstract

The total complexity (measured as the total number of gradient computations) of a stochastic first-order optimization algorithm that finds a first-order stationary point of a finite-sum smooth nonconvex objective function $F(w)=\frac{1}{n} \sum_{i=1}^n f_i(w)$ has been proven to be at least $\Omega(\sqrt{n}/\epsilon)$ for $n \leq \mathcal{O}(\epsilon^{-2})$ where $\epsilon$ denotes the attained accuracy $\mathbb{E}[ \|\nabla F(\tilde{w})\|^2] \leq \epsilon$ for the outputted approximation $\tilde{w}$ (Fang et al., 2018). In this paper, we provide a convergence analysis for a slightly modified version of the SARAH algorithm (Nguyen et al., 2017a;b) and achieve total complexity that matches the lower-bound worst case complexity in (Fang et al., 2018) up to a constant factor when $n \leq \mathcal{O}(\epsilon^{-2})$ for nonconvex problems. For convex optimization, we propose SARAH++ with sublinear convergence for general convex and linear convergence for strongly convex problems; and we provide a practical version for which numerical experiments on various datasets show an improved performance.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Stochastic Composite Gradient Method with Incremental Variance Reduction

    math.OC 2019-06 unverdicted novelty 6.0

    Proposes an incremental variance-reduced stochastic gradient method for minimizing smooth nonconvex composite functions that achieves optimal first-order complexity rates.

  2. Accelerating Mini-batch SARAH by Step Size Rules

    cs.LG 2019-06 unverdicted novelty 4.0

    MB-SARAH-RBB uses a random Barzilai-Borwein step size to accelerate mini-batch SARAH, with a linear convergence proof and improved complexity for strongly convex objectives.