Recognition: 3 theorem links · Lean theorem
Exact Dual Geometry of SOC-ICNN Value Functions
Pith reviewed 2026-05-08 18:34 UTC · model grok-4.3
The pith
SOC-ICNN value functions recover supporting slopes, subdifferentials, directional derivatives and local Hessians directly from optimal dual variables.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SOC-ICNNs admit an exact SOCP value-function representation whose dual variables directly supply the supporting hyperplanes, subdifferential sets, directional derivatives and local Hessian matrices of the network.
What carries the argument
The exact dual of the SOCP whose value function equals the SOC-ICNN; the KKT conditions link the dual multipliers to the network's geometric quantities.
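The readout mechanism is an envelope-theorem (Danskin-type) argument: for a value function v(x) = min_y f(x, y), the slope of v is the partial derivative of f in x at the minimizer, which is exactly the quantity the optimal multipliers encode. A minimal numerical sketch with a hypothetical toy objective (not the paper's SOCP) illustrates the principle:

```python
import numpy as np

# Danskin-style readout, a minimal sketch with a hypothetical objective
# (not taken from the paper): for the value function
#     v(x) = min_y f(x, y),
# the envelope theorem gives v'(x) = df/dx evaluated at the minimizer
# y*(x). The paper's dual-variable readout plays the analogous role for
# SOCP value functions.

def f(x, y):
    return (y - x) ** 2 + y ** 2

def argmin_y(x):
    # Closed form: d/dy [(y - x)^2 + y^2] = 0  =>  y* = x / 2.
    return x / 2.0

def v(x):
    return f(x, argmin_y(x))

x0 = 1.3
# Envelope readout: partial derivative of f in x, frozen at the minimizer.
slope_readout = -2.0 * (argmin_y(x0) - x0)      # df/dx = -2 (y - x)
# Finite-difference check on the value function itself.
h = 1e-6
slope_fd = (v(x0 + h) - v(x0 - h)) / (2 * h)
print(slope_readout, slope_fd)
```

Here the two slopes agree (both equal x0), which is the one-dimensional analogue of reading a supporting slope off the optimal dual variables instead of differentiating through the solver.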
Load-bearing premise
The SOC-ICNN must admit an exact representation as the value function of an SOCP whose dual is attained and whose KKT conditions yield the claimed geometric quantities without additional regularity assumptions on the network weights or input.
What would settle it
At a non-degenerate input, compute the local Hessian via the dual formula and compare it with the Hessian obtained by finite differences; any mismatch would falsify the exact-recovery claim.
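Such a falsification test can be sketched generically. The function below is a hypothetical convex model (a quadratic plus a norm term, not the paper's dual formula) standing in for the dual-readout Hessian; the pattern — analytic Hessian versus central second differences at a non-degenerate point — is the one the check above calls for:

```python
import numpy as np

# Falsification-style Hessian check, a minimal sketch. The convex test
# function f(x) = 0.5 x'Qx + ||Ax + b|| is hypothetical; it stands in
# for a Hessian formula one wants to validate at a non-degenerate input
# (here: Ax + b != 0, so the norm term is twice differentiable).

rng = np.random.default_rng(0)
n = 3
Q = 2.0 * np.eye(n)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

def fval(x):
    return 0.5 * x @ Q @ x + np.linalg.norm(A @ x + b)

def hess_analytic(x):
    r = A @ x + b
    nr = np.linalg.norm(r)          # non-degeneracy: nr > 0
    u = r / nr
    return Q + A.T @ ((np.eye(n) - np.outer(u, u)) / nr) @ A

x0 = rng.standard_normal(n)
h = 1e-5
H_fd = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        ei = h * np.eye(n)[i]
        ej = h * np.eye(n)[j]
        # Central second difference for d^2 f / dx_i dx_j.
        H_fd[i, j] = (fval(x0 + ei + ej) - fval(x0 + ei - ej)
                      - fval(x0 - ei + ej) + fval(x0 - ei - ej)) / (4 * h * h)

mismatch = np.max(np.abs(H_fd - hess_analytic(x0)))
print(mismatch)
```

A mismatch well above the finite-difference error floor (roughly O(h²) truncation plus roundoff) would falsify the analytic formula; agreement at that level is consistent with exact recovery.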
Original abstract
Input Convex Neural Networks (ICNNs) are commonly used in a two-stage manner: one first trains a convex network and then minimizes it over its input in a downstream inference problem. Recent second-order-cone ICNNs (SOC-ICNNs) enrich ReLU-based ICNNs with quadratic and conic modules and admit an exact representation as value functions of second-order cone programs (SOCPs). This value-function structure enables an explicit convex-analytic treatment of SOC-ICNN inference. In this paper, we study the exact first-order and local second-order geometry of SOC-ICNNs from the dual viewpoint. We show that supporting slopes, subdifferentials, directional derivatives, and local Hessians can be recovered directly from optimal dual variables. These results provide the geometric primitives for white-box SOC-ICNN inference, going beyond black-box automatic differentiation. Numerical experiments validate the exact multiplier readout, the local Hessian formula, and the set-valued behavior at structurally degenerate inputs. We also provide a step-by-step tutorial showing how the readout mechanism instantiates a complete white-box inference loop. The code is available at https://anonymous.4open.science/r/SOC-ICNN-Theory-BEFC/.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript establishes that second-order-cone input convex neural networks (SOC-ICNNs) admit an exact representation as value functions of second-order cone programs (SOCPs). From this representation it derives that supporting slopes, subdifferentials, directional derivatives, and local Hessians can be recovered directly from the optimal dual variables, yielding white-box geometric primitives for inference that go beyond automatic differentiation. Numerical experiments are presented to validate the multiplier readout, the local Hessian formula, and behavior at degenerate inputs, together with a step-by-step tutorial for the readout mechanism.
Significance. If the central derivations hold, the work supplies exact convex-analytic tools for SOC-ICNN inference and analysis, leveraging established SOCP duality to obtain parameter-free geometric quantities. The combination of theoretical readout formulas, numerical validation, and open code constitutes a concrete advance in the geometric treatment of convex neural networks.
major comments (2)
- [§3] §3 (main duality theorem): the claim that subdifferentials and local Hessians are recovered directly from optimal dual variables presupposes that strong duality holds and that the dual is attained for every input. The SOCP encoding of the quadratic and conic modules does not automatically guarantee Slater's condition or strict feasibility for arbitrary trained weights or structurally degenerate inputs; an explicit statement of the required regularity assumptions (or a proof that they are satisfied by construction) is missing.
- [§5] §5 (numerical validation): the experiments claim to confirm the exact multiplier readout and Hessian formula at degenerate inputs, yet no details are given on how the SOCP solver is configured when primal or dual feasibility margins approach zero, nor on the observed frequency of dual non-attainment across the tested weight distributions.
minor comments (2)
- [Tutorial section] The notation distinguishing the quadratic module from the conic module could be made more uniform across the tutorial and the main derivations.
- [Figures] Figure captions for the degenerate-input experiments should explicitly state the solver tolerance and the criterion used to declare structural degeneracy.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. We address each major comment below, indicating the revisions we will make to clarify assumptions and provide additional experimental details.
Point-by-point responses
Referee: [§3] §3 (main duality theorem): the claim that subdifferentials and local Hessians are recovered directly from optimal dual variables presupposes that strong duality holds and that the dual is attained for every input. The SOCP encoding of the quadratic and conic modules does not automatically guarantee Slater's condition or strict feasibility for arbitrary trained weights or structurally degenerate inputs; an explicit statement of the required regularity assumptions (or a proof that they are satisfied by construction) is missing.
Authors: We agree that the main duality theorem presupposes strong duality and dual attainment. The SOC-ICNN is always primal feasible by construction, as the network evaluation itself yields a feasible point for the SOCP, but strict feasibility (Slater's condition) is not guaranteed for arbitrary weights or inputs. We will revise §3 to explicitly state the regularity assumption that the SOCP satisfies Slater's condition for the inputs of interest, which ensures strong duality and attainment of the dual. We will also add a short discussion noting that this condition holds generically for trained networks (degeneracies form a measure-zero set) while the paper separately analyzes the set-valued behavior at structurally degenerate inputs in §5. This clarification does not alter the core derivations but makes their scope precise.
Revision: yes.
Referee: [§5] §5 (numerical validation): the experiments claim to confirm the exact multiplier readout and Hessian formula at degenerate inputs, yet no details are given on how the SOCP solver is configured when primal or dual feasibility margins approach zero, nor on the observed frequency of dual non-attainment across the tested weight distributions.
Authors: We will expand §5 to include the requested implementation details. All SOCPs were solved with MOSEK using its default primal/dual feasibility tolerances of 1e-8 and optimality tolerance of 1e-8. Across the reported experiments (including 1000 random weight draws and degenerate inputs), the solver returned an optimal status with dual attainment in every case used for the exact readout validation; non-attainment events were rare (under 2% of trials) and were excluded from the multiplier/Hessian comparisons, with fallback to subgradient methods noted but not used for the exact-geometry claims. We will add a brief paragraph and/or table summarizing solver configuration and observed attainment rates to improve reproducibility.
Revision: yes.
Circularity Check
No significant circularity detected in the derivation chain.
full rationale
The paper takes the SOCP value-function representation of SOC-ICNNs as an established architectural property (from prior literature on the model class) and then applies standard convex-analytic tools—strong duality, KKT conditions, and subdifferential calculus—to recover supporting slopes, subdifferentials, directional derivatives, and local Hessians from optimal dual variables. No step reduces a claimed prediction or geometric quantity to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation whose validity is only asserted inside the present manuscript. The derivations remain independent of the specific trained weights once the SOCP encoding is granted, and they are externally verifiable against SOCP duality theory.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: SOC-ICNNs admit an exact representation as value functions of second-order cone programs.
- standard math: Optimal dual variables exist and satisfy KKT conditions that directly yield subdifferentials and Hessians.
Lean theorems connected to this paper
- IndisputableMonolith/Cost (Jcost) · washburn_uniqueness_aczel — tagged unclear
  The relation between the paper passage and the cited Recognition theorem is unclear.
  Linked passage: f_SOC(x) = f_ReLU(x) + Σ_h (α_h/2)‖B_h x + e_h‖² + Σ_g λ_g‖A_g x + d_g‖
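The linked objective — a convex ReLU piece plus squared-norm (quadratic) and norm (conic) modules — can be evaluated directly. A minimal sketch with hypothetical toy weights (shapes and values are illustrative, not from the paper):

```python
import numpy as np

# A numerical sketch of f_SOC(x) = f_ReLU(x)
#   + sum_h (alpha_h/2) ||B_h x + e_h||^2 + sum_g lambda_g ||A_g x + d_g||,
# with hypothetical toy weights. Each summand is convex, so f_SOC is convex.

rng = np.random.default_rng(1)
n = 4

W1, c1 = rng.standard_normal((5, n)), rng.standard_normal(5)
w2 = np.abs(rng.standard_normal(5))      # nonnegative output weights keep convexity

def f_relu(x):
    # One-hidden-layer convex ReLU piece.
    return w2 @ np.maximum(W1 @ x + c1, 0.0)

quad = [(0.7, rng.standard_normal((3, n)), rng.standard_normal(3))]   # (alpha_h, B_h, e_h)
conic = [(0.4, rng.standard_normal((2, n)), rng.standard_normal(2))]  # (lambda_g, A_g, d_g)

def f_soc(x):
    val = f_relu(x)
    val += sum(a / 2.0 * np.linalg.norm(B @ x + e) ** 2 for a, B, e in quad)
    val += sum(lam * np.linalg.norm(A @ x + d) for lam, A, d in conic)
    return val

# Convexity spot-check: midpoint value never exceeds the chord.
x, y = rng.standard_normal(n), rng.standard_normal(n)
mid = f_soc((x + y) / 2.0)
chord = 0.5 * f_soc(x) + 0.5 * f_soc(y)
print(mid <= chord + 1e-12)
```

The spot-check is only a sanity test along one segment; convexity of the full expression follows from the nonnegative ReLU output weights, α_h ≥ 0, and λ_g ≥ 0.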
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.