A Stochastic GDA Method With Backtracking For Solving Nonconvex Concave Minimax Problems

Mert G\"urb\"uzbalaban; Necdet Serhat Aybat; Qiushui Xu; Xuan Zhang

read the original abstract

We propose a stochastic GDA (gradient descent ascent) method with backtracking (SGDA-B) to solve nonconvex-concave (NCC) minimax problems of the form: $\min_{\mathbf{x}} \max_y \sum_{i=1}^N g_i(x_i)+f(\mathbf{x},y)-h(y)$, where $h$ and $g_i$ for $i=1,\cdots,N$ are closed, convex functions, and for some $L,\mu\geq 0$, $f$ is $L$-smooth and $f(\mathbf{x},\cdot)$ is $\mu$-strongly concave for all $\mathbf{x}$ in the problem domain. We consider the stochastic setting where one only has an access to an unbiased stochastic oracle of $\nabla f$ with a finite variance bound $\sigma^2$. While most of the existing methods assume knowledge of $L$, $\mu$ and/or $\sigma^2$, SGDA-B is agnostic to all of these problem parameters. Moreover, SGDA-B can support random block-coordinate updates. In the deterministic setting, i.e., $\sigma^2=0$ and one can compute $\nabla f$ exactly, SGDA-B can compute an $\epsilon$-stationary point within $\mathcal{O}(L\kappa^2/\epsilon^2)$ and $\mathcal{O}(L^3/\epsilon^4)$ gradient calls when $\mu>0$ and $\mu=0$, respectively, where $\kappa\triangleq L/\mu$. In the stochastic setting, i.e., $\sigma^2>0$, for any $p\in(0,1)$ and $\epsilon>0$, it can compute an $\epsilon$-stationary point with high probability, which requires $\mathcal{O}(L \kappa^3 \epsilon^{-4} \log^2(1/p))$ and $\tilde{\mathcal{O}}(L^4\epsilon^{-7}\log^2(1/p))$ stochastic oracle calls, with probability at least $1-p$, when $\mu>0$ and $\mu=0$, respectively. To our knowledge, SGDA-B is the first GDA-type method with backtracking to solve NCC minimax problems and achieves the best complexity among the methods that are agnostic to $L$, $\mu$ and $\sigma^2$. We also provide numerical results for SGDA-B on a distributionally robust learning problem illustrating the potential performance gains that can be achieved by SGDA-B.

A Stochastic GDA Method With Backtracking For Solving Nonconvex Concave Minimax Problems

discussion (0)