A simple Newton method for local nonsmooth optimization
Pith reviewed 2026-05-24 15:16 UTC · model grok-4.3
The pith
A bundle Newton method achieves local quadratic convergence for nonsmooth optimization given the optimal active set cardinality.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The bundle Newton method, which augments the linear approximation oracle with second-order objective information, attains local quadratic convergence on the representative problem class of maxima of smooth functions when the cardinality of the optimal active set is supplied as input.
What carries the argument
The bundle Newton method that integrates second-order information with the linear approximation oracle, using the cardinality of the optimal active set to achieve quadratic convergence.
If this is right
- The method converges quadratically locally for maxima of smooth functions with known active set size.
- It performs well on more general nonsmooth functions, both convex and nonconvex.
- It opens the door to first-order analogues of the approach.
Where Pith is reading between the lines
- If the active set cardinality can be estimated reliably, the method could extend to broader problem classes without explicit input.
- The semi-structured approach might apply to other optimization settings where partial structure is known.
- Similar ideas could improve convergence in nonconvex nonsmooth problems beyond the tested cases.
Load-bearing premise
The objective function is a maximum of several smooth functions that the oracle cannot access individually.
What would settle it
A counterexample where the method does not exhibit quadratic convergence on a max of smooth functions despite knowing the active set cardinality.
Figures
read the original abstract
Superlinear convergence has been an elusive goal for black-box nonsmooth optimization. Even in the convex case, the subgradient method is very slow, and while some cutting plane algorithms, including traditional bundle methods, are popular in practice, local convergence is still sluggish. Faster variants depend either on problem structure or on analyses that elide sequences of "null" steps. Motivated by a semi-structured approach to optimization and the sequential quadratic programming philosophy, we describe a new bundle Newton method that incorporates second-order objective information with the usual linear approximation oracle. One representative problem class consists of maxima of several smooth functions, individually inaccessible to the oracle. Given as additional input just the cardinality of the optimal active set, we prove local quadratic convergence. A simple implementation shows promise on more general functions, both convex and nonconvex, and suggests first-order analogues.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a bundle Newton method for nonsmooth optimization that augments the standard linear approximation oracle with second-order information. For the representative class of objective functions that are pointwise maxima of several smooth functions (where the individual smooth functions are inaccessible to the oracle), and given the cardinality of the optimal active set as explicit additional input, the authors prove local quadratic convergence. Numerical results on a simple implementation are reported for both convex and nonconvex test problems, with a suggestion that first-order analogues may be viable.
Significance. If the local quadratic convergence claim holds under the stated assumptions, the result would be significant for black-box nonsmooth optimization, a setting in which superlinear rates have remained elusive even for convex problems. The approach is motivated by semi-structured problem classes and SQP ideas, and the explicit cardinality input allows a precise scoping that avoids eliding null steps. The empirical section indicates practical promise beyond the analyzed class.
minor comments (3)
- The abstract states that the method 'incorporates second-order objective information with the usual linear approximation oracle,' but the precise form of the second-order oracle (e.g., whether it returns a Hessian of an active function or an approximation) should be stated explicitly in §2 or §3 to clarify the information model.
- In the description of the representative problem class, the phrase 'individually inaccessible to the oracle' is used; a short clarifying sentence on what information about the component functions is nevertheless available (if any) would aid readability.
- The numerical experiments section would benefit from a table or plot that isolates the effect of supplying the correct active-set cardinality versus an incorrect value, to illustrate robustness.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of the manuscript, the recognition of its potential significance for black-box nonsmooth optimization, and the recommendation of minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity in derivation chain
full rationale
The paper's central result is a proof of local quadratic convergence for the proposed bundle Newton method on the representative class of max-of-smooth-functions objectives, conditioned on receiving the cardinality of the optimal active set as explicit additional input. No step reduces by construction to a fitted parameter, self-definition, or load-bearing self-citation; the proof is presented as depending on the algorithm and the supplied cardinality datum. The derivation is therefore self-contained against the stated assumptions and does not match any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 1 Pith paper
-
Technical results on the convergence of quasi-Newton methods for nonsmooth optimization
Sufficient conditions on eigenvalue vanishing in quasi-Newton updates, observed numerically, are shown to imply convergence to criticality for piecewise differentiable nonsmooth functions, along with the method's abil...
Reference graph
Works this paper leans on
-
[1]
D.S. Atkinson and P.M. Vaidya. A cutting plane algorithm for convex pro- gramming that uses analytic centers. Math. Programming, 69:1–43, 1995
work page 1995
-
[2]
S. Bubeck. Convex optimization: algorithms and complexity. Foundations and Trends in Machine Learning, 8:231–357, 2015
work page 2015
-
[3]
J.V. Burke, A.S. Lewis, and M.L. Overton. A robust gradient sampling algo- rithm for nonsmooth, nonconvex optimization. SIAM J. Optim. , 15:751–779, 2005
work page 2005
-
[4]
F.H. Clarke. Optimization and Nonsmooth Analysis . Wiley Interscience, New York, 1983. 31
work page 1983
-
[5]
M. Coste. An Introduction to Semialgebraic Geometry. RAAG Notes, 78 pages, Institut de Recherche Math´ ematiques de Rennes, October 2002
work page 2002
- [6]
-
[7]
J.-L. Goffin and J.-P. Vial. On the computation of weighted analytic centers and dual ellipsoids with the projective algorithm. Math. Programming, 60:81–92, 1993
work page 1993
-
[8]
G.H. Golub and C.F. Van Loan. Matrix Computations. Johns Hopkins Univer- sity Press, Baltimore, MD, fourth edition, 2013
work page 2013
-
[9]
K. C. Kiwiel. Efficiency of proximal bundle methods. J. Optim. Theory Appl. , 104:589–603, 2000
work page 2000
-
[10]
K.C. Kiwiel. Methods of Descent for Nondifferentiable Optimization , volume 1133 of Lecture Notes in Mathematics . Springer-Verlag, Berlin, 1985
work page 1985
-
[11]
Sidford, and Sam Chiu-wai Wong
Yin Tat Lee, A. Sidford, and Sam Chiu-wai Wong. A faster cutting plane method and its implications for combinatorial and convex optimization. In FOCS 2015, pages 1049–1065. IEEE Computer Soc., Los Alamitos, CA, 2015
work page 2015
-
[12]
C. Lemar´ echal. An extension of Davidon methods to non differentiable prob- lems. Math. Programming Stud., 3:95–109, 1975
work page 1975
-
[13]
C. Lemar´ echal, A. Nemirovskii, and Y. Nesterov. New variants of bundle meth- ods. Math. Programming, 69:111–147, 1995
work page 1995
-
[14]
C. Lemar´ echal, F. Oustry, and C. Sagastiz´ abal. The U-lagrangian of a convex function. Transactions of the American Mathematical Society , 352:711–729, 2000
work page 2000
-
[15]
C. Lemar´ echal and C. Sagastiz´ abal. Practical aspects of the Moreau-Yosida regularization: theoretical preliminaries. SIAM J. Optim. , 7:367—385, 1997
work page 1997
-
[16]
A.S. Lewis. Active sets, nonsmoothness, and sensitivity. SIAM J. Optim. , 13:702–725, 2002
work page 2002
-
[17]
A.S. Lewis and M.L. Overton. Nonsmooth optimization via quasi-Newton meth- ods. Math. Program., 141:135–163, 2013
work page 2013
-
[18]
L. Lukˇ san and J. Vlˇ cek. A bundle-Newton method for nonsmooth uncon- strained minimization. Math. Programming, 83:373–391, 1998
work page 1998
-
[19]
R. Mifflin. An algorithm for constrained optimization with semismooth func- tions. Math. Oper. Res., 2:191–207, 1977. 32
work page 1977
-
[20]
R. Mifflin and C. Sagastiz´ abal. Proximal points are on the fast track. Journal of Convex Analysis , 9:563—579, 2002
work page 2002
-
[21]
R. Mifflin and C. Sagastiz´ abal. AVU -algorithm for convex minimization. Math. Program., 104:583–608, 2005
work page 2005
-
[22]
S.A. Miller and J. Malick. Newton methods for nonsmooth convex minimiza- tion: connections among U-Lagrangian, Riemannian Newton and SQP meth- ods. Math. Program., 104:609–633, 2005
work page 2005
-
[23]
A. S. Nemirovsky and D. B. Yudin. Problem Complexity and Method Efficiency in Optimization . John Wiley & Sons, Inc., New York, 1983. Translated from the Russian and with a preface by E. R. Dawson
work page 1983
- [24]
- [25]
- [26]
-
[27]
Y. Nesterov and A. Nemirovskii. Interior-Point Polynomial Algorithms in Con- vex Programming. SIAM, Philadelphia, PA, 1994
work page 1994
-
[28]
J. Nocedal and S.J. Wright. Numerical Optimization . Springer, New York, second edition, 2006
work page 2006
-
[29]
F. Oustry. The U-Lagrangian of the maximum eigenvalue function. SIAM J. Optim., 9:526–549, 1999
work page 1999
-
[30]
M.L. Overton. On minimizing the maximum eigenvalue of a symmetric ma- trix. In Linear algebra in signals, systems, and control , pages 150–169. SIAM, Philadelphia, PA, 1988
work page 1988
-
[31]
N. Parikh and S. Boyd. Proximal algorithms. Foundations and Trends in Optimization, 1:123–231, 2013
work page 2013
- [32]
- [33]
-
[34]
C. Sagastiz´ abal. AVU -point of view of nonsmooth optimization. In Proceedings of the International Congress of Mathematicians, Rio de Janeiro , volume 3, pages 3785–3806, 2018
work page 2018
-
[35]
A. Shapiro and M.K.H. Fan. On eigenvalue optimization. SIAM J. Optim. , 5:552–569, 1995
work page 1995
-
[36]
P. M. Vaidya. A new algorithm for minimizing convex functions over convex sets. Math. Programming, 73:291–341, 1996. 34
work page 1996
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.