Progressively Sampled Equality-Constrained Optimization

Daniel P. Robinson; Frank E. Curtis; Lingjun Guo

arxiv: 2510.00417 · v2 · pith:FPB5KQVUnew · submitted 2025-10-01 · 🧮 math.OC · cs.LG· stat.ML

Progressively Sampled Equality-Constrained Optimization

Frank E. Curtis , Lingjun Guo , Daniel P. Robinson This is my paper

Pith reviewed 2026-05-18 11:13 UTC · model grok-4.3

classification 🧮 math.OC cs.LGstat.ML

keywords progressive samplingequality-constrained optimizationsample complexitynonlinear optimizationfinite-sum problemsstochastic optimization

0 comments

The pith

Solving a sequence of progressively sampled problems yields better worst-case sample complexity for equality-constrained optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an algorithm for continuous nonlinear equality-constrained optimization problems where the objective and constraints are defined as expectations or averages over large finite sets of terms. The key idea is to solve a sequence of these problems, each using larger and larger samples of the terms, rather than tackling the full set immediately. Under assumptions on the functions and derivatives that are standard in applications, this progressive approach improves the worst-case bound on the total number of samples required, as long as the starting sample sizes are large enough. Numerical tests on benchmark problems show that the method performs well in practice.

Core claim

For equality-constrained optimization problems whose objective and constraint functions are averages over large numbers of terms, solving a sequence of subproblems with progressively increasing sample sets produces a better worst-case sample complexity bound than directly solving the problem using the complete sets of samples.

What carries the argument

A sequence of optimization problems with progressively growing finite sample sets for the objective and constraint functions.

If this is right

The algorithm can solve such problems with fewer total function evaluations in the worst case.
It preserves the convergence guarantees of standard methods while improving efficiency.
The approach is particularly useful when evaluating all terms at once is costly.
Practical effectiveness is demonstrated through experiments on test problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This technique may generalize to other types of constrained stochastic optimization problems.
It could be integrated with existing solvers for large-scale problems to reduce computational cost.
Further analysis might explore how the rate of sample growth affects the overall complexity.

Load-bearing premise

The objective and constraint functions along with their first- and second-order derivatives satisfy assumptions that are typical and reasonable for real-world problems, and the initial sample sizes are large enough.

What would settle it

A concrete equality-constrained problem satisfying the paper's assumptions where the progressive sampling method requires more total samples than solving with the full set to achieve the same accuracy.

read the original abstract

An algorithm is proposed, analyzed, and tested for solving continuous nonlinear-equality-constrained optimization problems where the objective and constraint functions are defined by expectations or averages over large, finite numbers of terms. The main idea of the algorithm is to solve a sequence of related problems, each involving finite samples of objective- and constraint-function terms, over which the sample sets grow progressively. Under assumptions about the problem functions and their first- and second-order derivatives that are reasonable in real-world settings of interest, it is shown that -- with sufficiently large initial sample sizes -- solving a sequence of problems defined through progressive sampling yields a better worst-case sample complexity bound compared to solving a single problem with the full sets of samples. The results of numerical experiments with a set of test problems demonstrate that the proposed approach can be effective in practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Progressive sampling for equality-constrained problems claims a sample complexity improvement, but the gain depends on whether the initial sample can stay much smaller than the full set while meeting the local assumptions.

read the letter

The main point from this paper is a method for equality-constrained optimization that uses progressively larger samples to define the problems being solved at each stage. Under the right conditions, this yields a better worst-case bound on the total number of samples needed compared to solving the problem once with the complete set of terms. The paper does a good job of describing the algorithm clearly and backing it up with experiments on test problems that show it can work well in practice. The focus on equality constraints is a reasonable extension, as those require careful handling of the constraint satisfaction as the approximation improves. Where it could be stronger is in addressing how small the initial sample can be. The claim hinges on the initial sample being sufficiently large for the second-order assumptions and convergence rates to apply right away. If that threshold is high, close to the full sample size, then the progressive stages add little advantage in terms of total work. The abstract mentions the improvement holds with sufficiently large initial sizes but does not detail the scaling, so the actual savings might be limited in some cases. The assumptions listed are standard and reasonable for the settings mentioned, like in machine learning applications. The analysis appears to avoid circularity by building on established results for such problems. This paper would be of interest to optimization researchers dealing with stochastic or finite-sum models that include equality constraints. Someone looking for new ideas in complexity analysis for these problems could get useful insights from it. I think it should go to peer review for a thorough check on the proofs and to see if the complexity improvement holds up under closer scrutiny.

Referee Report

1 major / 2 minor

Summary. The paper proposes and analyzes an algorithm for nonlinear equality-constrained optimization problems in which the objective and constraints are defined via expectations or averages over large finite sample sets. The method solves a sequence of related subproblems whose sample sets grow progressively from an initial size n0 onward. Under standard smoothness and second-order assumptions on the functions and their derivatives, the analysis claims that—provided the initial sample sizes are sufficiently large—the total worst-case sample complexity is asymptotically better than that of solving a single problem using the full sample sets. Numerical experiments on a collection of test problems are presented to illustrate practical performance.

Significance. If the claimed improvement in worst-case sample complexity can be rigorously established with explicit dependence on n0 relative to the full sample size N, the result would be significant for large-scale stochastic constrained optimization arising in machine learning and engineering design, where full-sample evaluations are expensive. The progressive-sampling idea itself is a natural extension of existing stochastic and incremental methods; credit is due for targeting equality constraints and for providing both complexity analysis and numerical validation.

major comments (1)

[Abstract and main complexity theorem] The central claim of an improved sample-complexity bound rests on the phrase 'with sufficiently large initial sample sizes' (abstract and the statement of the main complexity result). The manuscript does not quantify the minimal n0(N) required for the local second-order assumptions, error bounds, and convergence rates to hold from the first subproblem, nor does it analyze how the progressive growth schedule interacts with the iteration complexity of each subproblem solve. If this minimal n0 scales as Ω(N) or Ω(N/polylog N), the total work cannot be asymptotically better than the single full-sample solve; this point is load-bearing for the claimed advantage.

minor comments (2)

[Introduction and notation] Notation for sample sizes (n0, Nk, N) should be introduced once in a dedicated notation table or paragraph and used consistently thereafter.
[Numerical experiments] The numerical section would benefit from reporting the actual sample sizes used at each progressive stage and the total number of function evaluations across all stages, to allow direct comparison with the single full-sample baseline.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and for highlighting the significance of establishing a rigorous improvement in worst-case sample complexity. The major comment correctly identifies that the current manuscript relies on the qualifier 'sufficiently large initial sample sizes' without an explicit dependence on N. We address this point directly below.

read point-by-point responses

Referee: [Abstract and main complexity theorem] The central claim of an improved sample-complexity bound rests on the phrase 'with sufficiently large initial sample sizes' (abstract and the statement of the main complexity result). The manuscript does not quantify the minimal n0(N) required for the local second-order assumptions, error bounds, and convergence rates to hold from the first subproblem, nor does it analyze how the progressive growth schedule interacts with the iteration complexity of each subproblem solve. If this minimal n0 scales as Ω(N) or Ω(N/polylog N), the total work cannot be asymptotically better than the single full-sample solve; this point is load-bearing for the claimed advantage.

Authors: We agree that the manuscript does not currently provide an explicit expression for the minimal n0 as a function of N, nor a detailed accounting of how the growth schedule interacts with per-subproblem iteration counts; this is a genuine gap in the presentation. Under the stated smoothness and second-order assumptions, the minimal n0 is determined by concentration bounds on the sampled gradients and Hessians to ensure the local error bounds and second-order sufficient conditions hold with high probability. These bounds depend on problem constants (Lipschitz moduli, variance proxies) and the target tolerance but can be chosen independent of N or at worst polylogarithmic in N. We will add a supporting lemma that makes this dependence explicit and shows n0 = o(N). For the progressive schedule (assumed geometric growth), the number of stages is logarithmic in N and each subproblem is solved to a tolerance that permits a uniformly bounded number of iterations once the second-order assumptions are satisfied; we will insert a short paragraph deriving the resulting total sample complexity and confirming it remains asymptotically superior to the single full-sample solve. These changes will be incorporated in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on stated assumptions and standard analysis techniques

full rationale

The paper proposes a progressive sampling algorithm for equality-constrained optimization and derives a sample-complexity bound under explicit assumptions on the objective/constraint functions and their first- and second-order derivatives. The central result compares the worst-case complexity of the progressive sequence against a single full-sample solve, with the improvement conditioned on sufficiently large initial sample sizes. No step in the provided abstract or described analysis reduces a claimed prediction or bound to a fitted parameter, self-definition, or unverified self-citation by construction; the argument proceeds from the stated assumptions and the algorithm's structure to the complexity comparison without tautological equivalence. The derivation is therefore self-contained against the paper's own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on standard smoothness assumptions for the objective and constraint functions; no free parameters or invented entities are mentioned in the abstract.

axioms (1)

domain assumption Problem functions and their first- and second-order derivatives satisfy assumptions reasonable in real-world settings
Invoked to guarantee the complexity result holds with sufficiently large initial samples

pith-pipeline@v0.9.0 · 5669 in / 1039 out tokens · 45898 ms · 2026-05-18T11:13:08.231105+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

solving a sequence of problems defined through progressive sampling yields a better worst-case sample complexity bound compared to solving a single problem with the full sets of samples
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Assumption 3.1(c) … problem (2) is (α,β)-strongly Morse … |dᵀ ∇²ₓₓL(x,y(x))d| ≥ β‖d‖² for all d ∈ null(∇c(x)ᵀ)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.