A operatorname{prox}-Based Semi-Smooth Newton Method for Convex Variational Problems
Pith reviewed 2026-06-25 19:11 UTC · model grok-4.3
The pith
A proximity operator reformulation turns discrete optimality conditions for convex variational problems into a semi-smooth Newton method with global well-posedness and local superlinear convergence.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
On the basis of the proximity operator, the discrete primal-dual optimality conditions are reformulated as nonlinear operator equations with Newton-differentiable structure. Under suitable assumptions on the energy densities, this yields a semi-smooth Newton method that is globally well-posed and locally superlinearly convergent. The approach applies to a broad class of problems, coincides with established methods for obstacle-type problems, satisfies primal-dual invariance, and admits global well-posedness in the infinite-dimensional setting under additional assumptions.
What carries the argument
The proximity operator, used to recast the discrete primal-dual optimality conditions as a Newton-differentiable nonlinear operator equation that the semi-smooth Newton iteration solves.
If this is right
- The method applies directly to total variation minimization, the p-Dirichlet problem, the obstacle problem, and the elasto-plastic torsion problem after finite-element discretization.
- The iteration coincides with existing semi-smooth Newton schemes for obstacle-type problems.
- The scheme satisfies a primal-dual invariance property that is preserved at every step.
- Under further assumptions the same reformulation yields global well-posedness already in the infinite-dimensional continuous setting.
Where Pith is reading between the lines
- The invariance property may supply a natural way to monitor both primal and dual residuals without extra computation.
- The same reformulation strategy could be tested on time-dependent or stochastic extensions of the listed variational problems.
- If the infinite-dimensional well-posedness holds more generally, it would justify applying the method directly on adaptive meshes without first proving discrete well-posedness separately.
Load-bearing premise
The energy densities satisfy conditions that permit the discrete primal-dual optimality conditions to be rewritten as a Newton-differentiable nonlinear operator equation via the proximity operator.
What would settle it
A concrete energy density satisfying the stated assumptions for which the resulting semi-smooth Newton iteration either fails to be globally well-posed or loses local superlinear convergence on a sequence of finite-element meshes.
Figures
read the original abstract
In this paper, we devise a $\operatorname{prox}$-based semi-smooth Newton method that is applicable to a finite element discretization of a broad class of nonsmooth convex variational problems, including the TV-minimization problem, the $p$-Dirichlet problem, the obstacle problem, and the elasto-plastic torsion problem. To this end, on the basis of the proximity operator, the discrete primal-dual optimality conditions are reformulated as nonlinear operator equations with Newton-differentiable structure. Under suitable assumptions on the energy densities, we establish the global well-posedness and local super-linear convergence of the resulting semi-smooth Newton method. The proposed approach coincides with established semi-smooth Newton methods for obstacle-type problems, satisfies a primal-dual invariance, and, under suitable additional assumptions, is globally well-posed in the infinite-dimensional setting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a prox-based semi-smooth Newton method for finite-element discretizations of nonsmooth convex variational problems (TV minimization, p-Dirichlet, obstacle, elasto-plastic torsion). Discrete primal-dual optimality conditions are rewritten, via the proximity operator, as Newton-differentiable nonlinear operator equations. Under suitable assumptions on the energy densities the method is shown to be globally well-posed and locally superlinearly convergent; it recovers known schemes for obstacle problems, preserves primal-dual invariance, and extends to the infinite-dimensional setting under further hypotheses.
Significance. If the stated convergence theory holds, the work supplies a unified, structure-preserving solver for a practically important class of nonsmooth convex problems with a superlinear rate that is attractive for high-accuracy computations. The explicit recovery of existing obstacle-problem schemes and the primal-dual invariance property are concrete strengths that increase the result’s immediate utility.
major comments (2)
- [Abstract] Abstract and opening paragraphs: the central claims of global well-posedness and local superlinear convergence are asserted to follow from Newton-differentiability of the prox-based reformulation, yet no explicit statement of the required assumptions on the energy densities, no derivation of the Newton derivative, and no sketch of the convergence argument appear in the visible text; without these the load-bearing analytic results cannot be assessed.
- [Reformulation paragraph] Reformulation paragraph: the claim that the discrete primal-dual optimality conditions become a Newton-differentiable operator equation via the proximity operator is presented as routine, but the precise conditions under which the proximity operator yields a Newton derivative (and the resulting operator remains well-defined on the finite-element space) are not supplied; this step is load-bearing for both well-posedness and the superlinear rate.
minor comments (2)
- Add a dedicated assumptions section that lists the precise conditions on the energy densities (convexity, growth, smoothness away from kinks, etc.) needed for Newton-differentiability.
- Include at least one numerical example that reports observed convergence rates (e.g., iteration counts versus mesh size) to corroborate the local superlinear claim.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback. The comments correctly note that the abstract and reformulation paragraph would benefit from greater explicitness regarding assumptions and technical conditions. We address each point below and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Abstract] Abstract and opening paragraphs: the central claims of global well-posedness and local superlinear convergence are asserted to follow from Newton-differentiability of the prox-based reformulation, yet no explicit statement of the required assumptions on the energy densities, no derivation of the Newton derivative, and no sketch of the convergence argument appear in the visible text; without these the load-bearing analytic results cannot be assessed.
Authors: We agree that the abstract and introduction would be clearer with a concise statement of the assumptions. In the revision we will add one sentence to the abstract: 'Under the assumptions that the energy densities are convex, proper, lower semicontinuous and satisfy standard growth and coercivity conditions (Assumption 2.1), the resulting operator is Newton differentiable.' A one-paragraph sketch of the Newton derivative (via the chain rule for the prox) and the convergence argument (semismooth Newton theory in finite dimensions) will be inserted at the end of the introduction, with forward references to Sections 3 and 4 where the full proofs appear. revision: yes
-
Referee: [Reformulation paragraph] Reformulation paragraph: the claim that the discrete primal-dual optimality conditions become a Newton-differentiable operator equation via the proximity operator is presented as routine, but the precise conditions under which the proximity operator yields a Newton derivative (and the resulting operator remains well-defined on the finite-element space) are not supplied; this step is load-bearing for both well-posedness and the superlinear rate.
Authors: The precise conditions are stated in Proposition 3.2: the prox operator of a convex lsc function with quadratic growth is Newton differentiable on the finite-element space, and the composite operator inherits this property by the chain rule for semismooth functions. Well-definedness follows immediately from the finite-dimensional setting and the fact that the discrete duality map is linear and continuous. We will revise the reformulation paragraph to include an explicit forward reference to Proposition 3.2 and a one-sentence reminder of the semismoothness hypothesis on the energy density. revision: yes
Circularity Check
No significant circularity; derivation self-contained
full rationale
The paper derives global well-posedness and local superlinear convergence for a prox-based semi-smooth Newton method from standard reformulations of discrete primal-dual optimality conditions into Newton-differentiable nonlinear equations, using proximity operators under stated assumptions on energy densities. These steps rely on established properties of proximal mappings and variational inequalities rather than any self-referential definition, fitted input renamed as prediction, or load-bearing self-citation chain. The method's coincidence with known schemes for obstacle problems is presented as a consistency check, not a foundational reduction. No quoted equation or assumption collapses the claimed convergence result to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Energy densities admit a proximity operator that yields a Newton-differentiable nonlinear operator equation from the discrete primal-dual conditions.
Reference graph
Works this paper leans on
-
[1]
H. Antil,S. Bartels,A. Kaltenbach, andR. Khandelwal, Variational problems with gradient constraints: A priori and a posteriori error identities,Math. Comput.(2025). doi:10.1090/mcom/4146
-
[2]
BartelsandA
S. BartelsandA. Kaltenbach,A prox-Based Semi-Smooth Newton Method for TV-Minimization,
-
[3]
doi:10.48550/arXiv.2605.22728
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2605.22728
-
[4]
Bartels,Numerical methods for nonlinear partial differential equations,Springer Ser
S. Bartels,Numerical methods for nonlinear partial differential equations,Springer Ser. Comput. Math.47, Cham: Springer, 2015. doi:10.1007/978-3-319-13797-1 1
-
[5]
S. Bartels, Nonconforming discretizations of convex minimization problems and precise relations to mixed methods,Comput. Math. Appl.93(2021), 214–229. doi:10.1016/j.camwa.2021.04.014
-
[6]
S. Bartels,L. Diening, andR. H. Nochetto, Unconditional stability of semi-implicit discretiza- tions of singular flows,SIAM J. Numer. Anal.56no. 3 (2018), 1896–1914. doi:10.1137/17M1159166
-
[7]
S. Bartels,T. Gudi, andA. Kaltenbach, A priori and a posteriori error identities for the scalar Signorini problem,SIAM J. Numer. Anal.63no. 5 (2025), 2155–2186. doi:10.1137/24M1677691
-
[8]
S. BartelsandA. Kaltenbach, Error estimates for total-variation regularized minimization problems with singular dual solutions,Numer. Math.152no. 4 (2022), 881–906. doi:10.1007/s00211- 022-01324-w
-
[9]
S. BartelsandP. Tscherner, Necessary and sufficient conditions for avoiding Babuˇ ska’s paradox on simplicial meshes,IMA J. Numer. Anal.45no. 3 (2025), 1300–1319. doi:10.1093/imanum/drae050
-
[10]
S. BartelsandA. Kaltenbach, Chapter seven - exact a posteriori error control for variational problems via convex duality and explicit flux reconstruction, inError Control, Adaptive Discretiza- tions, and Applications, Part 1(F. Chouly,S. P. Bordas,R. Becker, andP. Omnes, eds.), Advances in Applied Mechanics58, Elsevier, 2024, pp. 295–375. doi:10.1016/bs.a...
-
[11]
H. H. BauschkeandP. L. Combettes,Convex analysis and monotone operator theory in Hilbert spaces, 2nd ed.,CMS Books Math./Ouvrages Math. SMC, Cham: Springer, 2017. doi:10.1007/978-3- 319-48311-5
-
[12]
Beck,First-order methods in optimization,MOS-SIAM Ser
A. Beck,First-order methods in optimization,MOS-SIAM Ser. Optim.25, SIAM; Philadelphia, MOS, 2017. doi:10.1137/1.9781611974997
-
[13]
A. ChambolleandT. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging,J. Math. Imaging Vis.40no. 1 (2011), 120–145. doi:10.1007/s10851-010- 0251-1
-
[14]
A. ChambolleandT. Pock, Crouzeix–Raviart approximation of the total variation on simplicial meshes,J. Math. Imaging Vis.62no. 6-7 (2020), 872–899. doi:10.1007/s10851-019-00939-3
-
[15]
X. Chen,Z. Nashed, andL. Qi, Smoothing methods and semismooth methods for nondifferentiable operator equations,SIAM J. Numer. Anal.38no. 4 (2000), 1200–1216. doi:10.1137/S0036142999356719
-
[16]
Y. Chen,T. A. Davis,W. W. Hager, andS. Rajamanickam, Algorithm 887: Cholmod, supernodal sparse cholesky factorization and update/downdate,ACM Transactions on Mathematical Software35no. 3 (2008), 22:1–22:14. doi:10.1145/1391989.1391995
-
[17]
L. DieningandJ. Storn, Guaranteed upper bounds for iteration errors and modified Kaˇ canov schemes via discrete duality,Comput. Methods Appl. Math.25no. 3 (2025), 587–600. doi:10.1515/cmam-2025-0017
-
[18]
DuvautandJ
G. DuvautandJ. L. Lions,Les in´ equations en m´ ecanique et en physique,Trav. Rech. Math.21, Dunod, Paris, 1972
1972
-
[19]
I. EkelandandR. T ´emam,Convex analysis and variational problems,Class. Appl. Math.28, SIAM, 1999. doi:10.1137/1.9781611971088
-
[20]
A. ErnandJ.-L. Guermond,Finite elements I. Approximation and interpolation,Texts Appl. Math.72, Cham: Springer, 2020. doi:10.1007/978-3-030-56341-7
-
[21]
A. ErnandJ.-L. Guermond,Finite elements II. Galerkin approximation, elliptic and mixed PDEs,Texts Appl. Math.73, Cham: Springer, 2021. doi:10.1007/978-3-030-56923-5
-
[22]
Glowinski,J.-L
R. Glowinski,J.-L. Lions, andR. Tremolieres,Numerical analysis of variational inequalities. Transl. and rev. ed,Stud. Math. Appl.8, Elsevier, Amsterdam, 1981
1981
-
[23]
R. GriesseandD. A. Lorenz, A semismooth Newton method for Tikhonov functionals with sparsity S. Bartels and A. Kaltenbach40 constraints,Inverse Probl.24no. 3 (2008), 19, Id/No 035007. doi:10.1088/0266-5611/24/3/035007
-
[24]
T. H. HildebrandtandL. M. Graves, Implicit functions and their differentials in general analysis, Trans. Am. Math. Soc.29(1927), 127–153. doi:10.2307/1989282
-
[25]
M. Hinterm¨uller,K. Ito, andK. Kunisch, The primal-dual active set strategy as a semismooth Newton method,SIAM J. Optim.13no. 3 (2003), 865–888. doi:10.1137/S1052623401383558
-
[26]
M. Hinterm ¨ullerandK. Kunisch, Total bounded variation regularization as a bilater- ally constrained optimization problem,SIAM J. Appl. Math.64no. 4 (2004), 1311–1333. doi:10.1137/S0036139903422784
-
[27]
P. J. Huber, Robust estimation of a location parameter,Ann. Math. Stat.35(1964), 73–101. doi:10.1214/aoms/1177703732
-
[28]
J. D. Hunter, Matplotlib: A 2D graphics environment,Computing in Science & Engineering9 no. 3 (2007), 90–95. doi:10.1109/MCSE.2007.55
-
[29]
A. D. Ioffe, On lower semicontinuity of integral functionals. I,SIAM J. Control Optim.15(1977), 521–538. doi:10.1137/0315035
-
[30]
Kawohl, On a family of torsional creep problems,J
B. Kawohl, On a family of torsional creep problems,J. Reine Angew. Math.410(1990), 1–22. doi:10.1515/crll.1990.410.1
-
[31]
M. A. Krasnosel’ski˘ı, Topological methods in the theory of nonlinear integral equations. Translated by A. H. Armstrong, International Series of Monographs on Pure and Applied Mathematics. 45. Oxford etc.: Pergamon Press. xi, 395 p., 1964
1964
-
[32]
H. W. Kuhn, Some combinatorial lemmas in topology,IBM J. Res. Dev.4(1960), 518–524. doi:10.1147/rd.45.0518
-
[33]
O. A. LadyzhenskayaandN. N. Ural’tseva,Linear and quasilinear elliptic equations,Math. Sci. Eng.46, Elsevier, Amsterdam, 1968
1968
-
[34]
J. D. Lee,Y. Sun, andM. A. Saunders, Proximal Newton-type methods for minimizing composite functions,SIAM J. Optim.24no. 3 (2014), 1420–1443. doi:10.1137/130921428
-
[35]
P. Lindqvist,Notes on the stationary p-Laplace equation,SpringerBriefs Math., Cham: Springer; Bilbao: BCAM – Basque Center for Applied Mathematics, 2019. doi:10.1007/978-3-030-14501-9
-
[36]
Logg,K.-A
A. Logg,K.-A. Mardal, andG. Wells(eds.),Automated solution of differential equations by the finite element method. The FEniCS book,Lect. Notes Comput. Sci. Eng.84, Berlin: Springer,
-
[37]
doi:10.1007/978-3-642-23099-8
-
[38]
T.-T. LuandS.-H. Shiou, Inverses of 2 ×2 block matrices,Comput. Math. Appl.43no. 1-2 (2002), 119–129. doi:10.1016/S0898-1221(01)00278-4
-
[39]
Ostwald, Ueber die Geschwindigkeitsfunktion der Viskosit¨ at disperser Systeme
W. Ostwald, Ueber die Geschwindigkeitsfunktion der Viskosit¨ at disperser Systeme. I,Kolloid- Zeitschrift36no. 2 (1925), 99–117. doi:10.1007/BF01431449
-
[40]
N. ParikhandS. Boyd, Proximal algorithms,Found. Trends Optim.1no. 3 (2014), 127–239. doi:10.1561/2400000003
-
[41]
B. P¨otzl,A. Schiela, andP. Jaap, Second order semi-smooth proximal Newton methods in Hilbert spaces,Comput. Optim. Appl.82no. 2 (2022), 465–498. doi:10.1007/s10589-022-00369-9
-
[42]
B. P¨otzl,A. Schiela, andP. Jaap, Inexact proximal Newton methods in Hilbert spaces,Comput. Optim. Appl.87no. 1 (2024), 1–37. doi:10.1007/s10589-023-00515-x. [41]L. Prandtl, Zur Torsion von prismatischen St¨ aben,Phys. Z.4(1903), 758–759
-
[43]
R. T. Rockafellar, Integrals which are convex functionals,Pac. J. Math.24(1968), 525–539. doi:10.2140/pjm.1968.24.525
-
[44]
R. T. Rockafellar, Convex integral functionals and duality, inContributions to Nonlinear Functional Analysis, Academic Press, 1971, pp. 215–236. doi:10.1016/B978-0-12-775850-3.50012-1
-
[45]
R. T. RockafellarandR. J.-B. Wets,Variational analysis,Grundlehren Math. Wiss.317, Berlin: Springer, 1998. doi:10.1007/978-3-642-02431-3
-
[46]
L. I. Rudin,S. Osher, andE. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D60no. 1-4 (1992), 259–268. doi:10.1016/0167-2789(92)90242-F
-
[47]
Available athttps://pypi.org/project/scikit-sparse/
scikit-sparse developers, scikit-sparse: Sparse matrix tools extending scipy.sparse, 2026. Available athttps://pypi.org/project/scikit-sparse/
2026
-
[48]
Toland,The dual of L∞(X,L, λ ), finitely additive measures and weak convergence
J. Toland,The dual of L∞(X,L, λ ), finitely additive measures and weak convergence. A primer, SpringerBriefs Math., Cham: Springer, 2020. doi:10.1007/978-3-030-34732-1
-
[49]
Ulbrich, Semismooth Newton methods for operator equations in function spaces,SIAM J
M. Ulbrich, Semismooth Newton methods for operator equations in function spaces,SIAM J. Optim.13no. 3 (2003), 805–841. doi:10.1137/S1052623400371569. A prox-based Semi-Smooth Newton Method41
-
[50]
D. Wachsmuth, A globalized inexact semismooth Newton method for strongly convex optimal control problems,J. Nonsmooth Anal. Optim.6(2026), 15574. doi:10.46298/jnsao-2026-15574. [50]A. de Waele,Viscometry and Plastometry,6, Oil and Colour Chemists’ Association, 1923. [51]E. Zeidler,Nonlinear functional analysis and its applications. III: Variational method...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.