arxiv: 2402.11427 · v2 · pith:AXRJ5PSNnew · submitted 2024-02-18 · 💻 cs.LG · cs.AI · stat.ML

OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations

classification 💻 cs.LG · cs.AI · stat.ML
keywords: optex, gradient, iterations, estimation, first-order, optimization, approximately, efficiency

First-order optimization (FOO) algorithms are pivotal in numerous computational domains such as machine learning and signal denoising. However, their application to complex tasks like neural network training often entails significant inefficiencies due to the need for many sequential iterations for convergence. In response, we introduce first-order optimization expedited with approximately parallelized iterations (OptEx), the first framework that enhances the efficiency of FOO by leveraging parallel computing to mitigate its iterative bottleneck. OptEx employs kernelized gradient estimation to make use of gradient history for future gradient prediction, enabling parallelization of iterations, a strategy once considered impractical because of the inherent iterative dependency in FOO. We provide theoretical guarantees for the reliability of our kernelized gradient estimation and the iteration complexity of SGD-based OptEx, confirming that estimation errors diminish to zero as historical gradients accumulate and that SGD-based OptEx enjoys an effective acceleration rate of $\Omega(\sqrt{N})$ over standard SGD given parallelism of $N$. We also use extensive empirical studies, including synthetic functions, reinforcement learning tasks, and neural network training across various datasets, to underscore the substantial efficiency improvements achieved by OptEx.
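
To make the idea in the abstract concrete, here is a minimal sketch of estimating a gradient from gradient history with a kernel and using those estimates to propose several future iterates at once. It is an illustration under stated assumptions, not the authors' implementation: the RBF kernel, the noise term, the toy objective $f(x) = \tfrac{1}{2}\lVert x \rVert^2$, and the names `rbf_kernel`, `estimate_gradient`, and `proxy_iterates` are all hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between rows of A and rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)

def estimate_gradient(x_query, X_hist, G_hist, lengthscale=1.0, noise=1e-6):
    """Kernel-regression estimate of the gradient at x_query.

    X_hist holds past inputs (T, d) and G_hist the gradients observed there (T, d).
    The kernel and the noise regularizer are assumptions made for this sketch.
    """
    K = rbf_kernel(X_hist, X_hist, lengthscale) + noise * np.eye(len(X_hist))
    k_star = rbf_kernel(x_query[None, :], X_hist, lengthscale)  # shape (1, T)
    weights = np.linalg.solve(K, G_hist)                        # shape (T, d)
    return (k_star @ weights)[0]

def proxy_iterates(x0, X_hist, G_hist, lr=0.1, n_parallel=4):
    """Roll out n_parallel gradient steps using estimated gradients only.

    The returned points are candidates at which true gradients could then be
    evaluated in parallel; this rollout scheme is an assumption of the sketch.
    """
    iterates = []
    x = x0.copy()
    for _ in range(n_parallel):
        x = x - lr * estimate_gradient(x, X_hist, G_hist)
        iterates.append(x.copy())
    return np.stack(iterates)

# Toy history for f(x) = 0.5 * ||x||^2, whose true gradient at x is x itself.
X_hist = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
G_hist = X_hist.copy()
candidates = proxy_iterates(np.array([1.0, 1.0]), X_hist, G_hist)
```

In this sketch the cheap estimated gradients stand in for the sequential dependency between iterations, so the `n_parallel` candidate points are available before any new true gradient is computed. How the true gradients at those points are then combined is left out here.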

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Self-Improvement Can Self-Regress: The Rise-and-Collapse Failure Mode of LLM Self-Training

    cs.AI 2026-06 unverdicted novelty 6.0

    REINFORCE self-training on competitive programming tasks exhibits robust rise-then-collapse in pass@1; CARE, ES, and GRPO mitigate it in model-size-dependent ways across Qwen-2.5-3B/7B and a Gemma pilot.