CLARSTA: A random subspace trust-region algorithm for convex-constrained derivative-free optimization
Pith reviewed 2026-05-19 08:17 UTC · model grok-4.3
The pith
A random subspace trust-region algorithm converges almost surely to first-order stationary points for convex-constrained derivative-free optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The algorithm achieves almost-sure global convergence to a first-order stationary point and admits a worst-case complexity analysis for general convex-constrained derivative-free optimization problems, provided the sampled subspaces preserve a sufficient fraction of the criticality measure with a probability lower bound derived from concentration of measure on the Grassmann manifold and the models satisfy the new accuracy requirement only on the projected constraint set.
What carries the argument
Random subspace sampling on the Grassmann manifold that preserves a positive fraction of the first-order criticality measure with a computable probability lower bound, paired with a new class of models required to be accurate only on the projection of the feasible set onto the current subspace.
If this is right
- The method extends existing random-subspace DFO techniques from unconstrained to convex-constrained problems while retaining almost-sure convergence.
- A new geometry measure simplifies the construction and management of models that are accurate only on projected feasible sets.
- Worst-case complexity bounds become available for high-dimensional convex DFO under the stated subspace and model conditions.
- Numerical tests indicate reliable performance on problems whose dimension reaches at least 10000.
- The framework separates model accuracy from full-space geometry, allowing cheaper local models in each iteration.
Where Pith is reading between the lines
- The same projected-model idea could be tested on non-convex constraints or on problems with stochastic noise to see whether the convergence theory carries over.
- Because each iteration works in a low-dimensional subspace, the approach may lower the per-iteration cost enough to make derivative-free methods practical for engineering design tasks with thousands of variables.
- The Grassmann concentration argument supplies an explicit sampling probability that could be reused to design subspace methods for other derivative-free settings such as multi-objective or constrained least-squares problems.
Load-bearing premise
The randomly sampled subspaces must retain enough of the true first-order criticality measure with a positive probability lower bound, and the models need only be accurate after the constraints are projected into the subspace.
What would settle it
A sequence of independent runs on a simple convex-constrained DFO test problem in which the algorithm fails to reach a first-order stationary point with probability bounded away from zero, or in which the observed number of function evaluations exceeds the predicted worst-case bound by more than a constant factor.
read the original abstract
This paper proposes a random subspace trust-region algorithm for general convex-constrained derivative-free optimization (DFO) problems. Similar to previous random subspace DFO methods, the convergence of our algorithm requires a certain accuracy of models and a certain quality of subspaces. For model accuracy, we define a new class of models that is only required to provide reasonable accuracy on the projection of the constraint set onto the subspace. We provide a new geometry measure to make these models easy to analyze, construct, and manage. For subspace quality, we use the concentration of measure on the Grassmann manifold to provide a method to sample subspaces that preserve the first-order criticality measure by a certain fraction with a certain probability lower bound. Based on all these new theoretical results, we present an almost-sure global convergence and a worst-case complexity analysis of our algorithm. Numerical experiments on problems with dimensions up to 10000 demonstrate the reliable performance of our algorithm in high dimensions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CLARSTA, a random subspace trust-region algorithm for general convex-constrained derivative-free optimization. It defines a new class of models required to be accurate only on the projection of the constraint set onto the sampled subspace, introduces a new geometry measure to facilitate analysis and construction of such models, derives subspace sampling probabilities via concentration of measure on the Grassmann manifold that preserve a fraction of the first-order criticality measure, and establishes almost-sure global convergence to a first-order stationary point together with a worst-case complexity bound. Numerical experiments on problems with dimensions up to 10000 are presented to illustrate performance.
Significance. If the central claims hold, the work meaningfully extends random-subspace DFO methods to convex constraints by replacing full-space model accuracy with a projected-set condition and by supplying an explicit Grassmannian concentration argument for subspace quality. The new geometry measure is a targeted device that renders the projected accuracy condition amenable to standard trust-region decrease arguments, and the almost-sure convergence plus complexity results are obtained via Borel-Cantelli and standard DFO counting arguments once the probabilistic ingredients are in place. The high-dimensional numerical tests provide practical evidence that the approach scales beyond the reach of full-space methods. These elements constitute a solid, self-contained contribution to the derivative-free optimization literature.
major comments (2)
- [§3.2] §3.2, Definition 3.3 and Lemma 3.6: The new geometry measure is defined on the projected constraint set; the proof that this measure implies a uniform Cauchy decrease for the trust-region subproblem on the original feasible set is only sketched and appears to require an additional uniform bound on the distance between the projected and original feasible sets that is not stated explicitly.
- [§5.3] §5.3, Theorem 5.2: The worst-case complexity bound is stated as O(1/ε²) iterations with high probability; the dependence of the hidden constant on the Grassmann concentration probability p and on the subspace dimension k is not displayed, making it impossible to verify whether the bound remains useful when k grows with the ambient dimension.
minor comments (3)
- [§7] The numerical section reports results up to dimension 10000 but does not list the specific test-problem families, the choice of subspace dimension k, or the baseline solvers used for comparison; adding a table with these details would improve reproducibility.
- [§2.1] Notation for the projected feasible set P_C(x) is introduced without an explicit statement of how the projection is computed in practice when C is given only by oracles.
- [§1] A few sentences in the introduction repeat the abstract almost verbatim; a more concise motivation paragraph would improve flow.
Simulated Author's Rebuttal
We thank the referee for the careful reading, positive assessment, and constructive comments. We address each major comment below and will incorporate the suggested clarifications in the revised manuscript.
read point-by-point responses
-
Referee: [§3.2] §3.2, Definition 3.3 and Lemma 3.6: The new geometry measure is defined on the projected constraint set; the proof that this measure implies a uniform Cauchy decrease for the trust-region subproblem on the original feasible set is only sketched and appears to require an additional uniform bound on the distance between the projected and original feasible sets that is not stated explicitly.
Authors: We agree that the sketch in the proof of Lemma 3.6 can be expanded for clarity. The required uniform bound on the distance between the projected and original feasible sets follows directly from the convexity of the constraint set together with the fact that the orthogonal projection onto a subspace is non-expansive. We will insert an explicit auxiliary lemma stating and proving this bound (with a short argument using the definition of convexity and the projection operator) immediately before Lemma 3.6, and we will expand the subsequent decrease argument to reference the new lemma at each step. revision: yes
-
Referee: [§5.3] §5.3, Theorem 5.2: The worst-case complexity bound is stated as O(1/ε²) iterations with high probability; the dependence of the hidden constant on the Grassmann concentration probability p and on the subspace dimension k is not displayed, making it impossible to verify whether the bound remains useful when k grows with the ambient dimension.
Authors: We agree that making the dependence explicit will strengthen the result. The constant hidden in the O(1/ε²) bound depends on p through the Borel–Cantelli summation of the failure probabilities and on k through the Lipschitz constants of the model and the geometry measure in the k-dimensional subspace. We will restate Theorem 5.2 with the explicit factors (including the precise polynomial dependence on 1/p and on k) and will add a short remark after the proof indicating how the bound behaves when k = o(n) or k grows slowly with n. revision: yes
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper introduces an independent new model class whose accuracy is required only on the projection of the constraint set, defines a new geometry measure to support analysis of these models, and derives subspace sampling guarantees from concentration of measure on the Grassmann manifold. These elements are used to establish almost-sure convergence and worst-case complexity bounds. None of the load-bearing steps reduce by construction to fitted inputs, self-definitions, or unverified self-citations; the central claims rest on newly stated and analyzed properties rather than renaming or circular reuse of prior results.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Concentration of measure on the Grassmann manifold yields a probability lower bound that subspaces preserve a fraction of the first-order criticality measure.
invented entities (2)
-
New class of models accurate on the projection of the constraint set onto the subspace
no independent evidence
-
New geometry measure for the projected models
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We define a new class of models that is only required to provide reasonable accuracy on the projection of the constraint set onto the subspace... concentration of measure on the Grassmann manifold
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
almost-sure global convergence... worst-case complexity analysis
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Accuracy and Relationships of Quadratic Models in Derivative-free Optimization
The paper derives fully linear error bounds for minimum norm, minimum Frobenius norm, and quadratic generalized simplex derivative models, establishes directional fully quadratic Hessian accuracy under mild sample set...
Reference graph
Works this paper leans on
-
[1]
M. A. Abramson and C. Audet , Convergence of mesh adaptive direct search to second-order stationary points, SIAM Journal on Optimization, 17 (2006), pp. 606–619
work page 2006
- [2]
-
[3]
C. Audet and J. E. Dennis Jr , Mesh adaptive direct search algorithms for constrained optimization , SIAM Journal on optimization, 17 (2006), pp. 188–217
work page 2006
- [4]
-
[5]
C. Audet and W. Hare , Derivative-free and blackbox optimization , Springer, Cham, 2017
work page 2017
-
[6]
Beck , First-order methods in optimization , SIAM, 2017
A. Beck , First-order methods in optimization , SIAM, 2017
work page 2017
-
[7]
J. P. Boyle and R. L. Dykstra , A method for finding projections onto the intersection of convex sets in Hilbert spaces, in Advances in Order Restricted Statistical Inference, Springer, 1986, pp. 28–47
work page 1986
- [8]
-
[9]
C. Cartis and L. Roberts , Scalable subspace methods for derivative-free nonlinear least-squares optimiza- tion, Mathematical Programming, 199 (2023), pp. 461–524
work page 2023
-
[10]
, Randomized subspace derivative-free optimization with quadratic models and second-order conver- gence, 2024. arXiv:2412.14431
-
[11]
Y. Chen, W. Hare, and A. Wiebe , Q-fully quadratic modeling and its application in a random subspace derivative-free method, Computational Optimization and Applications, (2024)
work page 2024
-
[12]
Chikuse , Statistics on special manifolds , vol
Y. Chikuse , Statistics on special manifolds , vol. 174, Springer, 2012
work page 2012
-
[13]
P. D. Conejo, E. W. Karas, L. G. Pedroso, A. A. Ribeiro, and M. Sachine , Global convergence of trust-region algorithms for convex constrained minimization without derivatives , Applied Mathematics and Computation, 220 (2013), pp. 324–330
work page 2013
-
[14]
A. R. Conn, N. I. M. Gould, and P. L. Toint , Trust region methods, SIAM, 2000
work page 2000
-
[15]
A. R. Conn, K. Scheinberg, and L. N. Vicente , Geometry of interpolation sets in derivative free opti- mization, Mathematical programming, 111 (2008), pp. 141–172. CLARSTA: A random subspace trust-region algorithm for convex-constrained derivative-free optimization 25
work page 2008
-
[16]
, Global convergence of general derivative-free trust-region algorithms to first-and second-order critical points, SIAM Journal on Optimization, 20 (2009), pp. 387–415
work page 2009
-
[17]
, Introduction to derivative-free optimization, SIAM, 2009
work page 2009
-
[18]
K. J. Dzahini and S. M. Wild , Stochastic trust-region algorithm in random subspaces with convergence and expected complexity analyses, SIAM Journal on Optimization, 34 (2024), pp. 2671–2699
work page 2024
-
[19]
M. Feurer and F. Hutter , Hyperparameter optimization, Springer, Cham, 2019, pp. 3–33
work page 2019
-
[20]
Black-Box Optimization in Machine Learning with Trust Region Based Derivative Free Algorithm
H. Ghanbari and K. Scheinberg , Black-box optimization in machine learning with trust region based deriva- tive free algorithm, 2017. arXiv:1703.06925
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[21]
F. G ¨otze and H. Sambale , Higher order concentration on Stiefel and Grassmann manifolds , Electronic Journal of Probability, 28 (2023), pp. 1 – 30
work page 2023
-
[22]
S. Gratton, C. W. Royer, L. N. Vicente, and Z. Zhang , Direct search based on probabilistic descent , SIAM Journal on Optimization, 25 (2015), pp. 1515–1541
work page 2015
-
[23]
, Direct search based on probabilistic feasible descent for bound and linearly constrained problems , Computational Optimization and Applications, 72 (2019), pp. 525–559
work page 2019
-
[24]
S. Gratton, P. L. Toint, and A. Tr ¨oltzsch, An active-set trust-region method for derivative-free nonlinear bound-constrained optimization, Optimization Methods and Software, 26 (2011), pp. 873–894
work page 2011
-
[25]
G. Grimmett and D. Stirzaker , Probability and random processes, Oxford university press, 2020
work page 2020
-
[26]
W. Hare and G. Jarry-Bolduc , Calculus identities for generalized simplex gradients: rules and applications, SIAM Journal on Optimization, 30 (2020), pp. 853–884
work page 2020
-
[27]
M. Hough and L. Roberts , Model-based derivative-free methods for convex-constrained optimization, SIAM Journal on Optimization, 32 (2022), pp. 2552–2579
work page 2022
- [28]
-
[29]
Ledoux , Isoperimetry and Gaussian analysis , Springer, Berlin, Heidelberg, 1996, pp
M. Ledoux , Isoperimetry and Gaussian analysis , Springer, Berlin, Heidelberg, 1996, pp. 165–294
work page 1996
-
[30]
89, American Mathematical Society, 2001
, The concentration of measure phenomenon , no. 89, American Mathematical Society, 2001
work page 2001
-
[31]
R. M. Lewis and V. Torczon , Pattern search algorithms for bound constrained minimization, SIAM Journal on optimization, 9 (1999), pp. 1082–1099
work page 1999
-
[32]
, Pattern search methods for linearly constrained minimization , SIAM Journal on Optimization, 10 (2000), pp. 917–941
work page 2000
- [33]
-
[34]
Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou , A survey of convolutional neural networks: analysis, applica- tions, and prospects, IEEE Transactions on Neural Networks and Learning Systems, 33 (2021), pp. 6999–7019
work page 2021
-
[35]
Menickelly, Avoiding geometry improvement in derivative-free model-based methods via randomization,
M. Menickelly, Avoiding geometry improvement in derivative-free model-based methods via randomization,
-
[36]
R. J. Muirhead , Aspects of multivariate statistical theory , John Wiley & Sons, 2009
work page 2009
-
[37]
R. Penrose, A generalized inverse for matrices, in Mathematical Proceedings of the Cambridge Philosophical Society, vol. 51(3), Cambridge University Press, 1955, pp. 406–413
work page 1955
-
[38]
M. J. D. Powell , A direct search optimization method that models the objective and constraint functions by linear interpolation, Springer, Dordrecht, 1994, pp. 51–67
work page 1994
-
[39]
, On fast trust region methods for quadratic models with linear constraints, Mathematical Programming Computation, 7 (2015), pp. 237–267
work page 2015
-
[40]
T. M. Ragonneau and Z. Zhang , PDFO: a cross-platform package for Powell’s derivative-free optimization solvers, Mathematical Programming Computation, 16 (2024), pp. 535–559
work page 2024
-
[41]
L. Roberts, Model construction for convex-constrained derivative-free optimization, SIAM Journal on Opti- mization, 35 (2025), pp. 622–650
work page 2025
-
[42]
L. Roberts and C. W. Royer , Direct search based on probabilistic descent in reduced spaces, SIAM Journal on Optimization, 33 (2023), pp. 3057–3082
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.