Inexact Limited Memory Bundle Method
Pith reviewed 2026-05-10 17:34 UTC · model grok-4.3
The pith
An inexact limited memory bundle method converges globally to an approximate stationary point for large-scale nonsmooth nonconvex problems even when function values and subgradients contain noise.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed inexact limited memory bundle method maintains global convergence to an approximate stationary point while tolerating bounded noise in both function evaluations and subgradient information, and it demonstrates practical performance on large-scale nonsmooth nonconvex test problems.
What carries the argument
The inexact limited memory bundle method, which maintains a limited history of inexact subgradients to form a bundle model and uses updates that approximate second-order information without storing the full history.
If this is right
- The method applies directly to settings where subgradients are available only through noisy oracles, such as differential privacy or stochastic approximations.
- Global convergence holds to an approximate stationary point rather than an exact one, with the tolerance controlled by the noise bounds.
- Limited memory updates keep memory usage low, enabling the method to scale to problems with thousands of variables.
- Numerical tests confirm competitiveness with exact bundle methods when noise is absent and continued reliability when noise is present.
Where Pith is reading between the lines
- The same inexact framework could be adapted to other bundle or proximal algorithms for nonsmooth problems without major changes to the convergence analysis.
- In applications like machine learning, the noise tolerance might allow cheaper gradient estimators or privacy mechanisms while retaining convergence guarantees.
- Performance on very high-dimensional problems may depend on how the limited memory length interacts with the specific noise structure.
Load-bearing premise
The noise affecting function values and subgradients must remain bounded or satisfy regularity conditions that preserve stability of the inexact bundle framework.
What would settle it
A concrete counter-example or numerical run in which the noise level violates the boundedness conditions yet the method still fails to reach any approximate stationary point on a problem known to possess one.
Figures
read the original abstract
Large-scale nonsmooth optimization problems arise in many real-world applications, but obtaining exact function and subgradient values for these problems may be computationally expensive or even infeasible. In many practical settings, only inexact information is available due to measurement or modeling errors, privacy-preserving computations, or stochastic approximations, making inexact optimization methods particularly relevant. In this paper, we propose a novel inexact limited memory bundle method for large-scale nonsmooth nonconvex optimization. The method tolerates noise in both function values and subgradients. We prove the global convergence of the proposed method to an approximate stationary point. Numerical experiments with different levels of noise in function and/or subgradient values show that the method performs well with both exact and noisy data. In particular, the results demonstrate competitiveness in large-scale nonsmooth optimization and highlight the suitability of the method for applications where noise is unavoidable, such as differential privacy in machine learning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a novel inexact limited memory bundle method for large-scale nonsmooth nonconvex optimization. The method is designed to tolerate noise in both function values and subgradients. The authors claim to prove global convergence to an approximate stationary point and report numerical experiments demonstrating competitiveness with both exact and noisy data, including suitability for applications such as differential privacy in machine learning.
Significance. If the convergence result holds under clearly stated and verifiable noise assumptions, the method would address a practically relevant gap in scalable nonsmooth optimization where exact oracles are unavailable. The limited-memory feature supports applicability to large-scale instances, and the numerical results suggest robustness across noise levels. However, the significance is currently difficult to assess without explicit statements of the noise model, the precise definition of approximate stationarity, and the key steps in the convergence argument.
major comments (1)
- The abstract asserts a global convergence proof to an approximate stationary point under noise in function and subgradient values, yet provides no details on the noise regularity conditions, the definition of the stationarity measure, or the descent and bundle-update conditions used in the argument. Without these, it is not possible to verify whether the proof is load-bearing or relies on unstated assumptions.
minor comments (1)
- The abstract refers to 'different levels of noise' in the experiments but does not specify the noise model (e.g., bounded additive noise, relative noise, or stochastic approximation) or the test problems used, which limits reproducibility assessment.
Simulated Author's Rebuttal
We thank the referee for their careful review and constructive feedback on our manuscript. We address the major comment below and will revise the abstract to improve accessibility while preserving its conciseness.
read point-by-point responses
-
Referee: The abstract asserts a global convergence proof to an approximate stationary point under noise in function and subgradient values, yet provides no details on the noise regularity conditions, the definition of the stationarity measure, or the descent and bundle-update conditions used in the argument. Without these, it is not possible to verify whether the proof is load-bearing or relies on unstated assumptions.
Authors: We appreciate the referee pointing out the abstract's lack of detail on these elements. The noise regularity conditions (bounded noise in function values and subgradients) are formalized in Assumption 3.1. The notion of approximate stationarity is defined in Definition 2.3 as an approximate Goldstein ε-stationary point. The descent condition is enforced via the inexact line search in Step 4 of Algorithm 1, while bundle-update rules (including aggregation and deletion) appear in Steps 5–6, with the full convergence argument given in Theorem 4.1 and its proof in Section 4. All conditions used in the proof are explicitly stated in the manuscript; no unstated assumptions are employed. To address the referee's concern, we will revise the abstract to include a brief sentence referencing the noise model and the stationarity measure. revision: yes
Circularity Check
No significant circularity identified
full rationale
The abstract and description present a novel inexact limited memory bundle method with a claimed independent global convergence proof to an approximate stationary point under noise in function and subgradient values. No equations, definitions, or load-bearing steps are available that reduce the result to fitted parameters, self-referential definitions, or self-citation chains by construction. The proof is described as a separate argument resting on regularity conditions for the inexact bundle framework, with no evidence of renaming known results or smuggling ansatzes via citation in the provided content. The derivation therefore appears self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1]
-
[2]
A. Astorino and A. Fuduli,Nonsmooth optimization techniques for semi-supervised clas- sification, IEEE Trans. Pattern Anal. Mach. Intell. 29 (2007), pp. 2135–2142
work page 2007
-
[3]
A. Astorino, A. Fuduli, and E. Gorgone,Nonsmoothness in classification problems, Optim. Methods Softw. 23 (2008), pp. 675–688
work page 2008
-
[4]
A.M. Bagirov, M. Gaudioso, N. Karmitsa, M.M. Mäkelä, and S. Taheri,Numerical Non- smooth Optimization: State of the Art Algorithms, Springer Cham, 2020
work page 2020
-
[5]
A.M. Bagirov, N. Karmitsa, and M.M. Mäkelä,Introduction to Nonsmooth Optimization: Theory, Practice and Software, Springer, 2014
work page 2014
-
[6]
A.M. Bagirov, N. Karmitsa, and S. Taheri,Partitional Clustering via Nonsmooth Opti- mization: Clustering via Optimization, 2nd ed., Springer Cham, 2025
work page 2025
-
[7]
A.M. Bagirov, S. Taheri, N. Karmitsa, N. Sultanova, and S. Asadi,Robust piecewise linear L1-regression via nonsmooth DC optimization, Optim. Methods Softw. 37 (2022), pp. 1289–1309
work page 2022
-
[8]
A.M. Bagirov, S. Taheri, and J. Ugon,Nonsmooth DC programming approach to the minimum sum-of-squares clustering problems, Pattern Recognit. 53 (2016), pp. 12–24
work page 2016
-
[9]
A.M. Bagirov and J. Ugon,Nonsmooth DC programming approach to clusterwise linear regression: Optimality conditions and algorithms, Optim. Methods Softw. 33 (2018), pp. 194–219
work page 2018
-
[10]
Bihain,Optimization of upper semidifferentiable functions, J
A. Bihain,Optimization of upper semidifferentiable functions, J. Optim. Theory Appl. 4 (1984), pp. 545–568
work page 1984
- [11]
-
[12]
R.H. Byrd, J. Nocedal, and R.B. Schnabel,Representations of quasi-Newton matrices and their use in limited memory methods, Math. Program. 63 (1994), pp. 129–156
work page 1994
-
[13]
Clarke,Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983
F.H. Clarke,Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983
work page 1983
-
[14]
W. de Oliveira and M. Solodov,Bundle methods for inexact data, inNumerical nonsmooth optimization: State of the art algorithms, Springer, 2020, pp. 417–459
work page 2020
-
[15]
V.F. Demyanov, G.E. Stavroulakis, L.N. Polyakova, and P.D. Panagiotopoulos,Qua- sidifferentiability and Nonsmooth Modelling in Mechanics, Engineering and Economics, Springer Science & Business Media, 2013
work page 2013
- [16]
-
[17]
C. Dwork and A. Roth,The algorithmic foundations of differential privacy, Found. Trends Theor. Comput. Sci. 9 (2014), pp. 211–407
work page 2014
-
[18]
M. Haarala, K. Miettinen, and M.M. Mäkelä,New limited memory bundle method for large-scale nonsmooth optimization, Optim. Methods Softw. 19 (2004), pp. 673–692
work page 2004
-
[19]
N. Haarala, K. Miettinen, and M.M. Mäkelä,Globally convergent limited memory bundle method for large-scale nonsmooth optimization, Math. Program. 109 (2007), pp. 181–205
work page 2007
-
[20]
A. Halkola, K. Joki, T. Mirtti, M.M. Mäkelä, T. Aittokallio, and T.D. Laajala,OSCAR: Optimal subset cardinality regression using the L0-pseudonorm with applications to prog- nostic modelling of prostate cancer, PLOS Comput. Biol. 19 (2023), p. e1010333
work page 2023
-
[21]
W. Hare and C. Sagastizábal,A redistributed proximal bundle method for nonconvex 27 optimization, SIAM J. Optim. 20 (2010), pp. 2442–2473
work page 2010
-
[22]
W. Hare, C. Sagastizábal, and S. M.A proximal bundle method for nonsmooth nonconvex functions with inexact information, Comput. Optim. Appl. 63 (2016), pp. 1–28
work page 2016
-
[23]
L. Hertlein and M. Ulbrich,An inexact bundle algorithm for nonconvex nonsmooth min- imization in Hilbert space, SIAM J. Control Optim. 57 (2019), pp. 3137–3165
work page 2019
-
[24]
J.B. Hiriart-Urruty and C. Lemaréchal,Convex Analysis and Minimization Algorithms I, Springer-Verlag, Berlin, 1993
work page 1993
-
[25]
N. Hoseini Monjezi and S. Nobakhtian,New proximal bundle algorithm based on the gradient sampling method for nonsmooth nonconvex optimization with exact and inexact information, Numer. Algorithms 94 (2023), pp. 765–787
work page 2023
-
[26]
T. Kärkkäinen, K. Majava, and M.M. Mäkelä,Comparison of formulations and solution methods for image restoration problems, Inverse Probl. 17 (2001), pp. 1977–1995
work page 2001
-
[27]
A Comprehensive Guide to Differential Privacy: From Theory to User Expectations
N. Karmitsa, A. Airola, T. Pahikkala, and T. Pitkämäki,A comprehensive guide to differential privacy: From theory to user expectations, preprint (2025). Available at https://arxiv.org/abs/2509.03294
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[28]
N. Karmitsa and A.M. Bagirov,Limited memory discrete gradient bundle method for nonsmooth derivative free optimization, Optimization 61 (2012), pp. 1491–1509
work page 2012
-
[29]
N. Karmitsa, A.M. Bagirov, and M.M. Mäkelä,Comparing different nonsmooth optimiza- tion methods and software, Optim. Methods Softw. 27 (2012), pp. 131–153
work page 2012
-
[30]
N. Karmitsa, A.M. Bagirov, and S. Taheri,Clustering in large data sets with the limited memory bundle method, Pattern Recognit. 83 (2018), pp. 245–259
work page 2018
-
[31]
N. Karmitsa, V.P. Eronen, M.M. Mäkelä, T. Pahikkala, and A. Airola,Stochastic limited memory bundle algorithm for clustering in big data, Pattern Recognit. 165 (2025), p. 111654
work page 2025
-
[32]
N. Karmitsa, K. Joki, A. Airola, and T. Pahikkala,Limited memory bundle DC algorithm for sparse pairwise kernel learning, J. Glob. Optim. 92 (2025), pp. 55–85
work page 2025
-
[33]
N. Karmitsa and M.M. Mäkelä,Adaptive limited memory bundle method for bound con- strained large-scale nonsmooth optimization, Optimization 59 (2010), pp. 945–962
work page 2010
-
[34]
N. Karmitsa, S. Taheri, K. Joki, P. Paasivirta, A.M. Bagirov, and M.M. Mäkelä,Nons- mooth optimization-based hyperparameter-free neural networks for large-scale regression, Algorithms 16 (2023)
work page 2023
-
[35]
P.D. Khanh, B.S. Mordukhovich, and D.B. Tran,A new inexact gradient descent method with applications to nonsmooth convex optimization, Optim. Methods Softw. (2024), pp. 1–29
work page 2024
-
[36]
Kiwiel,Methods of Descent for Nondifferentiable Optimization, Lect
K.C. Kiwiel,Methods of Descent for Nondifferentiable Optimization, Lect. Notes Math., Springer-Verlag, Berlin, 1985
work page 1985
- [37]
-
[38]
L. Lukšan,Dual method for solving a special problem of quadratic programming as a sub- problem at linearly constrained nonlinear minmax approximation, Kybernetika 20 (1984), pp. 445–457
work page 1984
-
[39]
L. Lukšan and J. Vlček,Globally convergent variable metric method for convex nonsmooth unconstrained minimization, J. Optim. Theory Appl. 102 (1999), pp. 593–613
work page 1999
-
[40]
J. Lv, L.P. Pang, and F.Y. Meng,A proximal bundle method for constrained nonsmooth nonconvex optimization with inexact information, J. Glob. Optim. 70 (2018), pp. 517–549
work page 2018
-
[41]
Mäkelä,Issues of implementing a Fortran subroutine package NSOLIB for nons- mooth optimization, Tech
M.M. Mäkelä,Issues of implementing a Fortran subroutine package NSOLIB for nons- mooth optimization, Tech. Rep. 5/1993, Dept. Math., Lab. Sci. Comput., Univ. Jyväskylä (1993)
work page 1993
-
[42]
M.M. Mäkelä,Multiobjective proximal bundle method for nonconvex nonsmooth optimiza- tion: Fortran subroutine MPBNGC 2.0, Rep. Dept. Math. Inf. Technol., Ser. B. Sci. Com- put., B. 13/2003, Univ. Jyväskylä, Jyväskylä (2003)
work page 2003
-
[43]
M.M. Mäkelä and P. Neittaanmäki,Nonsmooth Optimization: Analysis and Algorithms with Applications to Optimal Control, World Scientific Publishing Co., Singapore, 1992
work page 1992
-
[44]
E.S. Mistakidis and G.E. Stavroulakis,Nonconvex Optimization in Mechanics. Smooth 28 and Nonsmooth Algorithms, Heuristics and Engineering Applications by the F.E.M., Kluwert Academic Publishers, Dordrecht, 1998
work page 1998
-
[45]
J.J. Moreau, P.D. Panagiotopoulos, and G. Strang,Topics in Nonsmooth Mechanics, Birkhäuser Verlag, Basel, 1988
work page 1988
-
[46]
N. Movahedian and S. Nobakhtian,Necessary and sufficient conditions for nonsmooth mathematical programs with equilibrium constraints, Nonlinear Anal.: Theory Methods Appl. 72 (2010), pp. 2694–2705
work page 2010
-
[47]
M. Nikolova, M.K. Ng, and C.P. Tam,Fast nonconvex nonsmooth minimization methods for image restoration and reconstruction, IEEE Trans. Image Process. 19 (2010), pp. 3073–3088
work page 2010
-
[48]
D. Noll,Bundle method for non-convex minimization with inexact subgradients and func- tion values, inComput. Anal. Math., Springer, 2013, pp. 555–592
work page 2013
-
[49]
Theory, Applications and Numerical Results, Kluwert Academic Publisher, Dordrecht, 1998
J.Outrata,M.Kočvara,andJ.Zowe,Nonsmooth Approach to Optimization Problems with Equilibrium Constraints. Theory, Applications and Numerical Results, Kluwert Academic Publisher, Dordrecht, 1998
work page 1998
-
[50]
P. Paasivirta, R. Numminen, A. Airola, N. Karmitsa, and T. Pahikkala,Predicting pair- wise interaction affinities withℓ0-penalized least squares – a nonsmooth bi-objective opti- mization based approach, Optim. Methods Softw. (2024), pp. 1–28
work page 2024
-
[51]
C. Ramirez and M. Argaez,An minimization algorithm for non-smooth regularization in image processing, Signal Image Video Process. 9 (2015), pp. 373–386
work page 2015
-
[52]
V. Singh and S.K. Mishra,Duality results for quasidifferentiable mathematical programs with equilibrium constraints, Comput. Manag. Sci. 23 (2026), p. 13
work page 2026
-
[53]
J. Vlček and L. Lukšan,Globally convergent variable metric method for nonconvex nondif- ferentiable unconstrained minimization, J. Optim. Theory Appl. 111 (2001), pp. 407–430
work page 2001
-
[54]
W. Wang and Y. Chen,An accelerated smoothing gradient method for nonconvex nons- mooth minimization in image processing, J. Sci. Comput. 90 (2022), p. 31
work page 2022
-
[55]
J.M. Yang and C.C. Chen,GEMDOCK: A generic evolutionary method for molecular docking, Proteins: Struct. Funct. Bioinf. 55 (2004), pp. 288–304
work page 2004
-
[56]
M.C. Yue, Z. Zhou, and A.M.C. So,A family of inexact SQA methods for non-smooth convex minimization with provable convergence guarantees based on the Luo-Tseng error bound property, Math. Program. 174 (2019), pp. 327–358. 29 Appendix A. Test problems In our numerical experiments we usedfivenonsmooth nonconvex academic minimiza- tion problems from [18] and...
work page 2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.