pith. sign in

arxiv: 1906.10107 · v1 · pith:IFRRVUYRnew · submitted 2019-06-24 · 🧮 math.OC

Primal-dual fast gradient method with a model

Pith reviewed 2026-05-25 17:14 UTC · model grok-4.3

classification 🧮 math.OC
keywords primal-dual methods(δ, L)-modelgradient methodsfast gradient methodsconvergence ratesadaptive optimizationconvex optimization
0
0 comments X

The pith

Primal-dual adaptive and fast gradient methods use (δ, L)-models to recover dual solutions from primal approximations and prove optimal convergence rates for some problem classes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops primal-dual methods that approximate the primal objective with a convex upper bound containing additive noise δ. Solving this easier approximation at each step recovers the dual solution, and an adaptive structure lets the same method apply across different optimization classes. The resulting adaptive gradient method and fast gradient method come with proven convergence rates that match optimal bounds on certain problem families. A sympathetic reader would care because the approach replaces problem-specific primal-dual schemes with one framework built on the model.

Core claim

We developed primal-dual adaptive gradient method and fast gradient method with (δ, L)-model and proved convergence rates of the methods, moreover, for some classes of optimization problems the rates are optimal. The main idea is the following: we find a dual solution to an approximation of a primal problem using the conception of (δ, L)-model. It is much easier to find a solution to an approximated problem, however, we have to do it in each step of our method, thereby the principle of divide and conquer is realized.

What carries the argument

(δ, L)-model: a convex function that upper-bounds the objective with additive noise δ, used to create tractable primal approximations whose solutions yield the dual variables.

If this is right

  • The adaptive gradient method converges at a rate determined by the model parameters δ and L.
  • The fast gradient variant achieves an accelerated rate under the same model.
  • Rates are optimal for some standard classes of convex optimization problems.
  • A single algorithmic template applies to multiple problem classes through the choice of model and adaptive tuning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The divide-and-conquer pattern could be instantiated with concrete models for specific applications such as support-vector machines or network flow.
  • The framework might extend to settings where the model is updated online rather than fixed in advance.
  • Similar model-based recovery could be tested in nonconvex problems if suitable upper bounds can still be formed.

Load-bearing premise

A (δ, L)-model exists that lets each primal approximation be solved accurately enough to recover the dual solution.

What would settle it

An explicit optimization problem where repeated solution of the (δ, L)-model approximation fails to recover the correct dual solution or where the observed convergence rate falls below the stated bound.

read the original abstract

In this work we consider a possibility to use the conception of $(\delta, L)$-model of a function for optimization tasks, whereby solving a primal problem there is a necessity to recover a solution of a dual problem. The conception of $(\delta, L)$-model is based on the conception of $(\delta, L)$-oracle which was proposed by Devolder-Glineur-Nesterov, herewith the authors proposed approximate a function with an upper bound using a convex quadratic function with some additive noise $\delta$. They managed to get convex quadratic upper bounds with noise even for nonsmooth functions. The conception of $(\delta, L)$-model continues this idea by using instead of a convex quadratic function a more complex convex function in an upper bound. Possibility to recover the solution of a dual problem gives great benefits in different problems, for instance, in some cases, it is faster to find a solution in a primal problem than in a dual problem. Note that primal-dual methods are well studied, but usually each class of optimization problems has its own primal-dual method. Our goal is to develop a method which can find solutions in different classes of optimization problems. This is realized through the use of the conception of $(\delta, L)$-model and adaptive structure of our methods. Thereby, we developed primal-dual adaptive gradient method and fast gradient method with $(\delta, L)$-model and proved convergence rates of the methods, moreover, for some classes of optimization problems the rates are optimal. The main idea is the following: we find a dual solution to an approximation of a primal problem using the conception of $(\delta, L)$-model. It is much easier to find a solution to an approximated problem, however, we have to do it in each step of our method, thereby the principle of "divide and conquer" is realized.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript develops primal-dual adaptive gradient and fast gradient methods that employ the (δ, L)-model to approximate the primal objective via a convex upper bound (extending the (δ, L)-oracle of Devolder-Glineur-Nesterov) and recover the dual solution from the approximated primal subproblem at each iteration. It claims to establish convergence rates for the resulting methods and to show optimality of some of those rates for particular problem classes.

Significance. If the convergence analysis is correct, the work supplies a single adaptive framework that applies across multiple optimization classes by leveraging the model and the 'divide-and-conquer' recovery step, which may be advantageous when the primal is easier to solve than the dual. The explicit use of a general convex model rather than a quadratic oracle is a natural extension that could broaden applicability to nonsmooth problems.

minor comments (3)
  1. The abstract contains several grammatical and phrasing problems that reduce readability (e.g., 'herewith the authors proposed approximate a function', 'Possibility to recover the solution of a dual problem gives great benefits'). A careful language edit is needed.
  2. The abstract states that rates are 'optimal for some classes' but does not name the classes or give the explicit rate expressions; adding one sentence with this information would make the contribution easier to assess.
  3. The description of the (δ, L)-model is given only at a high level; a short formal definition (even if repeated from the introduction) would help readers who are not already familiar with the Devolder-Glineur-Nesterov oracle.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments appear in the provided report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper extends the external (δ, L)-oracle of Devolder-Glineur-Nesterov to a (δ, L)-model for constructing primal-dual adaptive and fast gradient methods. Convergence rates are proved for the resulting algorithms, with optimality asserted only for specific problem classes. No equations or steps in the provided abstract reduce by construction to fitted parameters, self-definitions, or self-citation chains; the central claims rest on independent analysis applied to the cited model. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The approach depends on the existence of (δ, L)-models for the target functions and the validity of dual recovery from primal approximations; δ and L are parameters of the model definition rather than fitted values.

axioms (1)
  • domain assumption The objective functions admit (δ, L)-models
    Invoked as the basis for approximating the primal problem in each iteration.
invented entities (1)
  • (δ, L)-model no independent evidence
    purpose: To provide a convex upper bound with additive noise for use in primal-dual optimization
    Introduced as an extension of the (δ, L)-oracle using more complex convex functions.

pith-pipeline@v0.9.0 · 5858 in / 1180 out tokens · 30703 ms · 2026-05-25T17:14:04.981502+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages

  1. [1]

    Двойственные подходы к задачам минимизации сильно выпуклых функционалов простой структуры при аффинных ограничениях // Журнал выч

    Аникин А.\,С., Гасников А.\,В., Двуреченский П.\,Е., Тюрин А.\,И., Чернов А.\,В. Двойственные подходы к задачам минимизации сильно выпуклых функционалов простой структуры при аффинных ограничениях // Журнал выч. математики и мат. физики. --- 2017. --- Т. 57, № 8. --- C. 1270--1284. 0.1cm Anikin A.\,S., Gasnikov A.\,V.,Dvurechensky P.\,E, Tyurin .A.\,I., C...

  2. [2]

    Универсальный метод поиска равновесий и стохастических равновесий в транспортных сетях // Журнал выч

    Баймурзина Д.\,Р., Гасников А.\,В., Гасникова Е.\,В., Двуреченский П.\,Е., Ершов Е.\,И., Кубентаева М.\,Б., Лагуновская А.\,А. Универсальный метод поиска равновесий и стохастических равновесий в транспортных сетях // Журнал выч. математики и мат. физики. --- 2018. --- Т. 58. 0.1cm Baimurzina D.\,R., Gasnikov A.\,V., Gasnikova E.\,V., Dvurechensky P.\,E., ...

  3. [3]

    Методы оптимизации

    Васильев Ф.\,П. Методы оптимизации. Т.2. --- М.: МЦНМО, 2011. --- 433 с. 0.1cm Vasiliev F. P. Metody optimizatsii [Methods of Optimization]. Vol. 2. --- Moscow: MTSNMO, 2011. --- 433 p. (in Russian)

  4. [4]

    Современные численные методы оптимизации

    Гасников А.\,В. Современные численные методы оптимизации. Метод универсального градиентного спуска // e-print, 2019. --- URL: https://arxiv.org/pdf/1711.00394.pdf 0.1cm Gasnikov A.\,V. Sovremenii chislenii methodi optimizacii [Universal gradient descent] // e-print, 2019. --- URL: https://arxiv.org/pdf/1711.00394.pdf (in Russian)

  5. [5]

    Эффективные численные методы поиска равновесий в больших транспортных сетях: диссертация на соискание ученой степени д

    Гасников А.\,В. Эффективные численные методы поиска равновесий в больших транспортных сетях: диссертация на соискание ученой степени д. ф.-м. н. по специальности 05.13.18 --- Математическое моделирование, численные методы, комплексы программ. --- М.: МФТИ, 2016. --- 487 с. 0.1cm Gasnikov A.\,V. Effektivnyye chislennyye metody poiska ravnovesiya v bol’shik...

  6. [6]

    Двойственные методы поиска равновесий в смешанных моделях распределения потоков в больших транспортных сетях // Журнал выч

    Гасников А.\,В., Гасникова Е.\,В., Нестеров Ю.\,Е. Двойственные методы поиска равновесий в смешанных моделях распределения потоков в больших транспортных сетях // Журнал выч. математики и мат. физики. --- 2018. --- Т. 58, № 9. --- C. 1447--1454. 0.1cm Gasnikov A.\,V., Gasnikova E.\,V., Nesterov Yu.\,E . Dvoistvenii metodi poiska ravnovesii v smechanich mo...

  7. [7]

    Быстрый градиентный спуск для задач выпуклой минимизации с оракулом, выдающим ( , L) --модель функции в запрошенной точке // Журнал выч

    Гасников А.\,В., Тюрин А.\,И. Быстрый градиентный спуск для задач выпуклой минимизации с оракулом, выдающим ( , L) --модель функции в запрошенной точке // Журнал выч. математики и мат. физики. --- 2019. --- Т. 59, № 7. --- C. 1137--1150. 0.1cm Gasnikov A.\,V., Tyurin A.\,I . Bistriy gradientniy spusk dli zadachi vipucloy minimizacii s oraculom, viduachim ...

  8. [8]

    Сложность задач и эффективность методов оптимизации // Наука, 1979

    Немировский А.\,С., Юдин Д.\,Б. Сложность задач и эффективность методов оптимизации // Наука, 1979. --- 384 c. 0.1cm Nemirovsky A.\,S., Yudin D.\,B. . Slozhnost’ zadach i effektivnost’ metodov optimizatsii [Problem complexity and method efficiency in optimization]. --- Moscow: Science, 1979. --- 384 p. (in Russian)

  9. [9]

    Введение в выпуклую оптимизацию // М.:МЦНМО, 2010

    Нестеров Ю.\,Е. Введение в выпуклую оптимизацию // М.:МЦНМО, 2010. --- 262 c. 0.1cm Nesterov Yu.\,E. . Vvedeniye v vypukluyu optimizatsiyu [Introductory lectures on convex optimization]. --- Moscow: MCCME, 2010. --- 262 p. (in Russian)

  10. [10]

    Введение в оптимизацию

    Поляк Б.\,Т. Введение в оптимизацию. --- М.: Наука, 1983. --- 384 с. 0.1cm Polyak B\,T. . Vvedeniye v optimizatsiyu [Introductory lectures on convex optimization]. --- Moscow: Science, 1983. --- 384 p. (in Russian)

  11. [11]

    Convex optimization

    Boyd S., Vandenberghe L. Convex optimization. --- Cambridge University Press, 2004

  12. [12]

    First-order methods of smooth convex optimization with inexact oracle // Mathematical Programming

    Devolder O., Glineur F., Nesterov Yu. First-order methods of smooth convex optimization with inexact oracle // Mathematical Programming. --- 2014. --- Vol. 146, No. 1--2. --- P. 37--75

  13. [13]

    Exactness, inexactness and stochasticity in first-order methods for large-scale convex optimization

    Devolder O. Exactness, inexactness and stochasticity in first-order methods for large-scale convex optimization. --- PhD thesis, CORE UCL, 2013

  14. [14]

    First-order methods with inexact oracle: the strongly convex case // CORE Discussion Papers 2013/16 --- 2013

    Devolder O., Glineur F., Nesterov Yu. First-order methods with inexact oracle: the strongly convex case // CORE Discussion Papers 2013/16 --- 2013. --- URL: https://www.uclouvain.be/cps/ucl/doc/core/documents/coredp2013_16web.pdf

  15. [15]

    Intermediate gradient methods for smooth convex problems with inexact oracle // CORE Discussion Papers 2013/17 --- 2013

    Devolder O., Glineur F., Nesterov Yu. Intermediate gradient methods for smooth convex problems with inexact oracle // CORE Discussion Papers 2013/17 --- 2013. --- URL: https://cdn.uclouvain.be/public/Exports

  16. [16]

    Lectures on modern convex optimization analysis, algorithms, and engineering applications

    Nemirovski A. Lectures on modern convex optimization analysis, algorithms, and engineering applications. --- Philadelphia: SIAM, 2015. --- URL: http://www2.isye.gatech.edu/ nemirovs/Lect_ModConvOpt.pdf

  17. [17]

    Complexity bounds for primal-dual methods minimizing the model of objective function // Mathematical Programming

    Nesterov Yu. Complexity bounds for primal-dual methods minimizing the model of objective function // Mathematical Programming. --- 2018. --- Vol. 171, No. 1--2. --- P. 311--330

  18. [18]

    Gradient methods for minimizing composite functions // Mathematical Programming

    Nesterov Yu. Gradient methods for minimizing composite functions // Mathematical Programming. --- 2013. --- Vol. 140, No. 1. --- P. 125--161

  19. [19]

    Primal-dual subgradient methods for convex problems

    Nesterov Yu. Primal-dual subgradient methods for convex problems. // Mathematical Programming --- 2009. --- Vol. 120, No. 1. --- P. 221--259

  20. [20]

    Universal gradient methods for convex optimization problems // Mathematical Programming

    Nesterov Yu. Universal gradient methods for convex optimization problems // Mathematical Programming. --- 2015. --- Vol. 152, No. 1--2. --- P. 381--404

  21. [21]

    Gradient Methods for Problems with Inexact Model of the Objective // Mathematical Optimization Theory and Operations Research

    Stonyakin F., Dvinskikh D., Dvurechensky P., Kroshnin A., Kuznetsova O., Agafonov A., Gasnikov A., Tyurin A., Uribe C.\,A., Pasechnyuk D., Artamonov S. Gradient Methods for Problems with Inexact Model of the Objective // Mathematical Optimization Theory and Operations Research. 18th International Conference. --- Ekaterinburg, Russia. 2019. --- P. 97--114