pith. sign in

arxiv: 2307.00358 · v3 · pith:EESGJGUInew · submitted 2023-07-01 · 🧮 math.OC

The Error in Multivariate Linear Extrapolation with Applications to Derivative-Free Optimization

Pith reviewed 2026-05-24 08:00 UTC · model grok-4.3

classification 🧮 math.OC
keywords multivariate linear extrapolationerror boundssharp boundsderivative-free optimizationLipschitz gradientsimplicial searchaffinely independent pointsinterpolation error
0
0 comments X

The pith

A numerical method computes the sharp error bound for multivariate linear extrapolation under Lipschitz gradient assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a procedure to find the smallest possible upper bound on the error made when a linear function fitted to several points is evaluated outside their convex hull. It supplies both this numerical procedure and closed-form expressions that achieve the same bound for particular point geometries. The resulting estimates are inserted into an iteration-complexity argument for a basic simplicial search algorithm used when gradients are unavailable. Readers should care because linear extrapolation appears routinely in derivative-free routines, and a tight, computable error constant directly controls the number of function evaluations needed for convergence.

Core claim

The authors introduce a method to numerically compute the sharp bound on the error of linear extrapolation for functions with Lipschitz continuous gradients when the interpolation points form an affinely independent set. They also derive analytical bounds that are sharp under certain conditions and apply the bounds to obtain complexity results for a simplicial search method.

What carries the argument

Numerical maximization of the extrapolation residual over all admissible functions whose gradients satisfy a unit Lipschitz condition, subject to the given affinely independent sample set.

Load-bearing premise

The target function has a Lipschitz continuous gradient and the sample points are affinely independent.

What would settle it

A concrete function whose gradient is Lipschitz continuous, together with an affinely independent point set, for which the observed linear extrapolation error exceeds the numerically computed sharp bound.

Figures

Figures reproduced from arXiv: 2307.00358 by Liyuan Cao, Ya-Xiang Yuan, Zaiwen Wen.

Figure 1
Figure 1. Figure 1: An illustration of two DFO algorithm when minimizing a bivariate fu [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The sharp error bound on | ˆf(x) − f(x)| for each x on the 100 × 100 grid that covers the area [−2.5, 2.5] × [−1.5, 2.5] evenly. The sample set and the Lipschitz constant are chosen as Θ = {(−0.3, 1),(−1.1, −0.5),(1, 0)} and ν = 1. In (f-EEP), the point x, its derivative, and its function value are represented by the triplet (x0, g0, y0), whereas {(xi , gi , yi)} n+1 i=1 are used for the interpolation poin… view at source ↗
Figure 3
Figure 3. Figure 3: A visualization of results in Section 4 for bivariate interpolat [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗
Figure 2
Figure 2. Figure 2: However, this does not mean (21) is a formula to the optima [PITH_FULL_IMAGE:figures/full_fig_p012_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: The areas to which if x belongs, (21) is not an upper bound on the function approximation error for bivariate interpolation. The dashed line on the left is perpendicular to the line going through x1 and x2; and the one on the right is perpendicular to the line going through x3 and x1. 12 [PITH_FULL_IMAGE:figures/full_fig_p012_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Two configurations of Θ and x where (21) is an invalid error bound for bivariate extrapolation. The case in Figure 5a can be defined mathematically as ℓ2 > 0, ℓ3 < 0, and ℓ1[x2 −x1]·[x3 −x1]−ℓ3[x2 − x3] · [x1 − x3] < 0. The following lemma shows the point w, as defined in (30), is the intersection of the line going through x1 and x3 and the line going through x and x2. 17 [PITH_FULL_IMAGE:figures/full_fig… view at source ↗
read the original abstract

We study in this paper the function approximation error of multivariate linear extrapolation. The sharp error bound of linear interpolation already exists in the literature. However, linear extrapolation is used far more often in applications such as derivative-free optimization, while its error is not well-studied. We introduce in this paper a method to numerically compute the sharp bound on the error, and then present several analytical bounds along with the conditions under which they are sharp. We also provide a complexity analysis of a basic simplicial search method to illustrate an application of these error bounds in derivative-free optimization. All results are under the assumptions that the function being interpolated has Lipschitz continuous gradient and is interpolated on an affinely independent sample set.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 4 minor

Summary. The paper studies the error of multivariate linear extrapolation for functions with Lipschitz continuous gradients, sampled on affinely independent sets. It introduces a numerical procedure to compute the sharp worst-case error bound, derives several analytical bounds together with conditions under which each is sharp, and applies the bounds to obtain a complexity result for a basic simplicial search method in derivative-free optimization.

Significance. If the derivations hold, the work addresses a practical gap: linear extrapolation appears frequently in DFO yet lacks the sharp error analysis already available for interpolation. The manuscript supplies an explicit numerical formulation for the sharp bound, proves sharpness conditions for the analytical expressions, and carries the bounds through to a falsifiable complexity corollary under standard assumptions, all without circularity or hidden parameters.

minor comments (4)
  1. [§2] §2 (Preliminaries): the definition of the extrapolation point relative to the simplex could be accompanied by a short diagram or explicit coordinate example to aid readers unfamiliar with the geometry.
  2. [§3] The numerical procedure in §3 is described at a high level; adding a short pseudocode listing or explicit statement of the optimization problem solved would improve reproducibility.
  3. [Table 1] Table 1 (numerical sharpness examples): the column headers for the analytical bounds could be aligned with the theorem numbers that state the corresponding conditions.
  4. [§5] The complexity corollary in §5 inherits the O(h²) scaling from the error bound; a one-sentence reminder of the precise constant dependence on the Lipschitz constant L would help readers trace the result.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments appear in the report, so we have no specific points to address point-by-point. We will incorporate any minor editorial or presentation changes in the revised version.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation starts from external, explicitly stated assumptions (Lipschitz continuous gradient with constant L, affinely independent sample set) that are standard in approximation theory and DFO. The sharp error bound is obtained by a numerical procedure that solves an optimization problem over the unit ball in the gradient-Lipschitz seminorm; this is not a fit to data but a direct computation from the assumptions. Analytical bounds are then proved by relaxing or specializing that same optimization problem, with explicit sharpness conditions. The simplicial-search complexity corollary applies the derived O(h²) error directly to a standard DFO convergence argument. No equation reduces to a prior result by definition, no parameter is fitted and then relabeled as a prediction, and no load-bearing step rests on a self-citation whose content is itself unverified. The manuscript is therefore self-contained against the declared external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two domain assumptions stated in the abstract; no free parameters or invented entities are mentioned.

axioms (2)
  • domain assumption The function has Lipschitz continuous gradient
    Explicitly required for all error bounds and complexity results in the abstract.
  • domain assumption The sample set is affinely independent
    Required so that the linear extrapolation model is uniquely defined.

pith-pipeline@v0.9.0 · 5646 in / 1299 out tokens · 38008 ms · 2026-05-24T08:00:57.121710+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

  1. [1]

    Computation of sparse low degree inter- polating polynomials and their application to derivative-free optimizat ion

    Afonso S Bandeira, Katya Scheinberg, and Lu ´ ıs Nunes Vicente. Computation of sparse low degree inter- polating polynomials and their application to derivative-free optimizat ion. Mathematical programming, 134(1):223–257, 2012

  2. [2]

    C onvergence of trust-region methods based on probabilistic models

    Afonso S Bandeira, Katya Scheinberg, and Luis Nunes Vicente. C onvergence of trust-region methods based on probabilistic models. SIAM Journal on Optimization , 24(3):1238–1264, 2014

  3. [3]

    Convergence rate analysis of a stochastic trust-region method via supermartingales

    Jose Blanchet, Coralia Cartis, Matt Menickelly, and Katya Scheinb erg. Convergence rate analysis of a stochastic trust-region method via supermartingales. INFORMS journal on optimization , 1(2):92–119, 2019

  4. [4]

    First-and second-order high probability com- plexity bounds for trust-region methods with noisy oracles

    Liyuan Cao, Albert S Berahas, and Katya Scheinberg. First-and second-order high probability com- plexity bounds for trust-region methods with noisy oracles. Mathematical Programming, pages 1–52, 2023

  5. [5]

    Stochast ic optimization using a trust-region method and random models

    Ruobing Chen, Matt Menickelly, and Katya Scheinberg. Stochast ic optimization using a trust-region method and random models. Mathematical Programming, 169:447–487, 2018

  6. [6]

    General Lagrange a nd Hermite interpolation in Rn with applications to finite element methods

    Philippe G Ciarlet and Pierre-Arnaud Raviart. General Lagrange a nd Hermite interpolation in Rn with applications to finite element methods. Archive for Rational Mechanics and Analysis , 46(3):177–199, 1972

  7. [7]

    On the conver gence of derivative-free methods for unconstrained optimization

    Andrew R Conn, Katya Scheinberg, and Ph L Toint. On the conver gence of derivative-free methods for unconstrained optimization. Approximation theory and optimization: tributes to MJD Pow ell, pages 83–108, 1997

  8. [8]

    Geometr y of interpolation sets in derivative free optimization

    Andrew R Conn, Katya Scheinberg, and Lu ´ ıs N Vicente. Geometr y of interpolation sets in derivative free optimization. Mathematical programming, 111:141–172, 2008. 26

  9. [9]

    Global co nvergence of general derivative- free trust-region algorithms to first-and second-order critical points

    Andrew R Conn, Katya Scheinberg, and Lu ´ ıs N Vicente. Global co nvergence of general derivative- free trust-region algorithms to first-and second-order critical points. SIAM Journal on Optimization , 20(1):387–415, 2009

  10. [10]

    Introduction to derivative-free optimization

    Andrew R Conn, Katya Scheinberg, and Luis N Vicente. Introduction to derivative-free optimization . SIAM, 2009

  11. [11]

    Interpolation and Approximation

    Philip J Davis. Interpolation and Approximation . Courier Corporation, 1975

  12. [12]

    Performance of first-order me thods for smooth convex minimization: a novel approach

    Yoel Drori and Marc Teboulle. Performance of first-order me thods for smooth convex minimization: a novel approach. Mathematical Programming, 145(1-2):451–482, 2014

  13. [13]

    Complexity and global rates of trust-region methods based on probabilistic models

    Serge Gratton, Cl´ ement W Royer, Lu ´ ıs N Vicente, and Zaikun Zhang. Complexity and global rates of trust-region methods based on probabilistic models. IMA Journal of Numerical Analysis , 38(3):1579– 1597, 2018

  14. [14]

    Matrix Analysis

    Roger A Horn and Charles R Johnson. Matrix Analysis . Cambridge university press, 2012

  15. [15]

    Con vergence of the restricted nelder–mead algorithm in two dimensions

    Jeffrey C Lagarias, Bjorn Poonen, and Margaret H Wright. Con vergence of the restricted nelder–mead algorithm in two dimensions. SIAM Journal on Optimization , 22(2):501–532, 2012

  16. [16]

    Derivative-f ree optimization methods

    Jeffrey Larson, Matt Menickelly, and Stefan M Wild. Derivative-f ree optimization methods. Acta Numerica, 28:287–404, 2019

  17. [17]

    Convergence of the nelder–mead simplex meth od to a nonstationary point

    Ken IM McKinnon. Convergence of the nelder–mead simplex meth od to a nonstationary point. SIAM Journal on optimization , 9(1):148–158, 1998

  18. [18]

    A simplex method for function minim ization

    John A Nelder and Roger Mead. A simplex method for function minim ization. The computer journal , 7(4):308–313, 1965

  19. [19]

    Introductory Lectures on Convex Optimization: A Basic Cour se, volume 87

    Y Nesterov. Introductory Lectures on Convex Optimization: A Basic Cour se, volume 87. Springer Science & Business Media, 2013

  20. [20]

    A direct search optimization method that mode ls the objective and constraint functions by linear interpolation

    Michael JD Powell. A direct search optimization method that mode ls the objective and constraint functions by linear interpolation. In Advances in optimization and numerical analysis , pages 51–67. Springer, 1994

  21. [21]

    On the Lagrange functions of quadratic models that are defined by interpolation

    MJD Powell. On the Lagrange functions of quadratic models that are defined by interpolation. Opti- mization Methods and Software , 16(1-4):289–309, 2001

  22. [22]

    The calculus of simplex gradients

    Rommel G Regis. The calculus of simplex gradients. Optimization Letters , 9:845–865, 2015

  23. [23]

    Sequ ential application of simplex designs in optimisation and evolutionary operation

    WGRFR Spendley, George R Hext, and Francis R Himsworth. Sequ ential application of simplex designs in optimisation and evolutionary operation. Technometrics, 4(4):441–461, 1962

  24. [24]

    Optimal estimates for the linear interpolation error on simplices

    Martin St¨ ampfle. Optimal estimates for the linear interpolation error on simplices. Journal of Approx- imation Theory, 103(1):78–90, 2000

  25. [25]

    James Joseph Sylvester. Xix. a demonstration of the theorem that every homogeneous quadratic polyno- mial is reducible by real orthogonal substitutions to the form of a s um of positive and negative squares. The London, Edinburgh, and Dublin Philosophical Magazine a nd Journal of Science , 4(23):138–142, 1852

  26. [26]

    Exact worst-case performance of first-order methods for composite convex optimization

    Adrien B Taylor, Julien M Hendrickx, and Fran¸ cois Glineur. Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization , 27(3):1283–1313, 2017

  27. [27]

    Smoot h strongly convex interpolation and exact worst-case performance of first-order methods

    Adrien B Taylor, Julien M Hendrickx, and Fran¸ cois Glineur. Smoot h strongly convex interpolation and exact worst-case performance of first-order methods. Mathematical Programming, 161:307–345, 2017. 27

  28. [28]

    Fortified-descent simplicial search method: A gen eral approach

    Paul Tseng. Fortified-descent simplicial search method: A gen eral approach. SIAM Journal on Opti- mization, 10(1):269–288, 1999

  29. [29]

    The error in linear interpolation at the vertices of a simplex

    Shayne Waldron. The error in linear interpolation at the vertices of a simplex. SIAM Journal on Numerical Analysis, 35(3):1191–1200, 1998

  30. [30]

    An interactive approach for solving multi-ob jective optimization problems

    Daniel John Woods. An interactive approach for solving multi-ob jective optimization problems. 1985. 28