Fast minimization of structured convex quartics

Brian Bullins


We propose faster methods for unconstrained optimization of \emph{structured convex quartics}, which are convex functions of the form \begin{equation*} f(x) = c^\top x + x^\top \mathbf{G} x + \mathbf{T}[x,x,x] + \frac{1}{24} \mathopen\| \mathbf{A} x \mathclose\|_4^4 \end{equation*} for $c \in \mathbb{R}^d$, $\mathbf{G} \in \mathbb{R}^{d \times d}$, $\mathbf{T} \in \mathbb{R}^{d \times d \times d}$, and $\mathbf{A} \in \mathbb{R}^{n \times d}$ such that $\mathbf{A}^\top \mathbf{A} \succ 0$. In particular, we show how to achieve an $\epsilon$-optimal minimizer for such functions with only $O(n^{1/5}\log^{O(1)}(\mathcal{Z}/\epsilon))$ calls to a gradient oracle and linear system solver, where $\mathcal{Z}$ is a problem-dependent parameter. Our work extends recent ideas on efficient tensor methods and higher-order acceleration techniques to develop a descent method for optimizing the relevant quartic functions. As a natural consequence of our method, we achieve an overall cost of $O(n^{1/5}\log^{O(1)}(\mathcal{Z} / \epsilon))$ calls to a gradient oracle and (sparse) linear system solver for the problem of $\ell_4$-regression when $\mathbf{A}^\top \mathbf{A} \succ 0$, providing additional insight into what may be achieved for general $\ell_p$-regression. Our results show the benefit of combining efficient higher-order methods with recent acceleration techniques for improving convergence rates in fundamental convex optimization problems.
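To make the objective concrete, the following is a minimal NumPy sketch of how the function $f$ above and its gradient (the quantity a gradient oracle would return) could be evaluated. It is an illustration only, not the paper's method: the function names `quartic_value` and `quartic_gradient` are hypothetical, dense arrays are assumed, and the code does not check that the chosen $c$, $\mathbf{G}$, $\mathbf{T}$, $\mathbf{A}$ actually make $f$ convex. The gradient follows from differentiating each term: $c + (\mathbf{G} + \mathbf{G}^\top)x$, plus the three contractions of $\mathbf{T}$ with $x$ twice, plus $\frac{1}{6}\mathbf{A}^\top (\mathbf{A}x)^{3}$ with the cube taken entrywise.

```python
import numpy as np


def quartic_value(x, c, G, T, A):
    """Evaluate f(x) = c^T x + x^T G x + T[x,x,x] + (1/24) * ||A x||_4^4."""
    Ax = A @ x
    cubic = np.einsum("ijk,i,j,k->", T, x, x, x)  # T[x, x, x]
    return c @ x + x @ G @ x + cubic + np.sum(Ax ** 4) / 24.0


def quartic_gradient(x, c, G, T, A):
    """Gradient of f at x; T is not assumed symmetric, so all three
    contractions of T with (x, x) are summed."""
    Ax = A @ x
    grad_cubic = (
        np.einsum("ijk,j,k->i", T, x, x)
        + np.einsum("ijk,i,k->j", T, x, x)
        + np.einsum("ijk,i,j->k", T, x, x)
    )
    # d/dx (1/24) * sum_i (a_i^T x)^4 = (1/6) * A^T (A x)^3, cube taken entrywise
    return c + (G + G.T) @ x + grad_cubic + A.T @ (Ax ** 3) / 6.0
```

Setting $c$ and $\mathbf{T}$ to zero and $\mathbf{G}$ to zero leaves only the $\frac{1}{24}\|\mathbf{A}x\|_4^4$ term, which is the homogeneous part of an $\ell_4$-regression objective; a finite-difference comparison is a simple way to sanity-check the gradient on small random instances.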

