Fast minimization of structured convex quartics
We propose faster methods for unconstrained optimization of \emph{structured convex quartics}, which are convex functions of the form \begin{equation*} f(x) = c^\top x + x^\top \mathbf{G} x + \mathbf{T}[x,x,x] + \frac{1}{24} \mathopen\| \mathbf{A} x \mathclose\|_4^4 \end{equation*} for $c \in \mathbb{R}^d$, $\mathbf{G} \in \mathbb{R}^{d \times d}$, $\mathbf{T} \in \mathbb{R}^{d \times d \times d}$, and $\mathbf{A} \in \mathbb{R}^{n \times d}$ such that $\mathbf{A}^\top \mathbf{A} \succ 0$. In particular, we show how to achieve an $\epsilon$-optimal minimizer for such functions with only $O(n^{1/5}\log^{O(1)}(\mathcal{Z}/\epsilon))$ calls to a gradient oracle and linear system solver, where $\mathcal{Z}$ is a problem-dependent parameter. Our work extends recent ideas on efficient tensor methods and higher-order acceleration techniques to develop a descent method for optimizing the relevant quartic functions. As a natural consequence of our method, we achieve an overall cost of $O(n^{1/5}\log^{O(1)}(\mathcal{Z} / \epsilon))$ calls to a gradient oracle and (sparse) linear system solver for the problem of $\ell_4$-regression when $\mathbf{A}^\top \mathbf{A} \succ 0$, providing additional insight into what may be achieved for general $\ell_p$-regression. Our results show the benefit of combining efficient higher-order methods with recent acceleration techniques for improving convergence rates in fundamental convex optimization problems.
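To make the objective above concrete, the following is a minimal NumPy sketch that evaluates $f(x)$ and its gradient directly from the stated formula. It is not the paper's method: the function names `quartic_value`, `quartic_grad`, and `finite_difference_grad` are hypothetical, the tensor $\mathbf{T}$ is stored densely (which is only practical for small $d$), and nothing here checks that the chosen $c$, $\mathbf{G}$, $\mathbf{T}$, $\mathbf{A}$ actually give a convex function.

```python
import numpy as np


def quartic_value(x, c, G, T, A):
    """Evaluate f(x) = c^T x + x^T G x + T[x,x,x] + (1/24) * ||A x||_4^4."""
    Ax = A @ x
    # T[x,x,x] = sum_{i,j,k} T[i,j,k] * x[i] * x[j] * x[k]
    cubic = np.einsum("ijk,i,j,k->", T, x, x, x)
    return c @ x + x @ G @ x + cubic + np.sum(Ax ** 4) / 24.0


def quartic_grad(x, c, G, T, A):
    """Gradient of f at x; T and G are not assumed to be symmetric."""
    Ax = A @ x
    # Differentiate T[x,x,x] with respect to each of its three slots.
    g_cubic = (
        np.einsum("ijk,j,k->i", T, x, x)
        + np.einsum("ijk,i,k->j", T, x, x)
        + np.einsum("ijk,i,j->k", T, x, x)
    )
    # d/dx (1/24) * sum_i (a_i^T x)^4 = (1/6) * A^T (A x)^3 (elementwise cube)
    return c + (G + G.T) @ x + g_cubic + A.T @ (Ax ** 3) / 6.0


def finite_difference_grad(x, c, G, T, A, h=1e-5):
    """Central-difference approximation of the gradient, for sanity checks."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (quartic_value(x + e, c, G, T, A)
                - quartic_value(x - e, c, G, T, A)) / (2.0 * h)
    return g
```

A gradient routine of this kind plays the role of the "gradient oracle" mentioned in the abstract; comparing `quartic_grad` against `finite_difference_grad` on random small instances is a quick way to check the formula before using it in any solver.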