The Cost of Relaxation: Evaluating the Error in Convex Neural Network Verification
Pith reviewed 2026-05-10 04:32 UTC · model grok-4.3
The pith
Convex relaxations of neural networks diverge from true outputs by an amount that grows exponentially with depth.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The space of convex relaxations of the integer activation constraints forms a lattice ordered by inclusion, with the original network at the bottom and the full linearization of every neuron at the top. Analytical upper and lower bounds establish that the ℓ∞ distance between the fully relaxed and original outputs grows exponentially in network depth and linearly in input radius, while misclassification probability exhibits step-like behavior as a function of that radius.
What carries the argument
The lattice of convex relaxations of integer constraints, with full linearization of every neuron as the top element.
If this is right
- Deeper networks will exhibit rapidly growing verification errors under full relaxation, limiting its practical use for deep architectures.
- Larger input radii in robustness queries will scale the relaxation error linearly, requiring tighter bounds or partial relaxations.
- Misclassification rates under relaxation will change abruptly once the input radius crosses certain thresholds.
- Verification procedures must incorporate depth-dependent error corrections to maintain reliable soundness guarantees.
Where Pith is reading between the lines
- Intermediate points in the relaxation lattice could be selected to trade off tightness against computational cost in a depth-aware manner.
- The exponential scaling may explain why complete verification remains difficult for deep networks even when relaxations are employed.
- Analogous divergence analysis could be performed for relaxations used in other settings such as optimization or control.
- The step-like misclassification pattern could be tested on architectures beyond image classifiers to assess its generality.
Load-bearing premise
The analysis assumes that the worst-case divergence is captured exactly by the top element of the relaxation lattice without additional restrictions from particular activation functions or network structures.
What would settle it
Direct computation of the ℓ∞ output difference on a network of increasing depth with fixed input radius, checking whether the observed gap follows the claimed exponential scaling.
Figures
read the original abstract
Many neural network (NN) verification systems represent the network's input-output relation as a constraint program. Sound and complete, representations involve integer constraints, for simulating the activations. Recent works convexly relax the integer constraints, improving performance, at the cost of soundness. Convex relaxations consider outputs that are unreachable by the original network. We study the worst case divergence between the original network and its convex relaxations; both qualitatively and quantitatively. The relaxations' space forms a lattice, where the top element corresponds to a full relaxation, with every neuron linearized. The bottom element corresponds to the original network. We provide analytical upper and lower bounds for the $\ell_\infty$-distance between the fully relaxed and original outputs. This distance grows exponentially, w.r.t. the network's depth, and linearly w.r.t. the input's radius. The misclassification probability exhibits a step-like behavior, w.r.t. input radius. Our results are supported by experiments on MNIST, Fashion MNIST and random networks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript studies the divergence between neural networks and their convex relaxations in verification settings. It frames the relaxations as a lattice ordered by tightness, with the original network (integer constraints for activations) as the bottom element and the fully linearized relaxation (every neuron replaced by its convex envelope) as the top element. Analytical upper and lower bounds are derived for the ℓ∞ distance between the original and fully-relaxed outputs; these bounds are claimed to grow exponentially in network depth and linearly in input radius. The misclassification probability is shown to exhibit step-like behavior as a function of input radius. The theoretical results are supported by experiments on MNIST, Fashion MNIST, and random networks.
Significance. If the derivations hold, the work supplies a clean lattice-theoretic account of relaxation error together with explicit scaling laws. The exponential-in-depth characterization is particularly useful for understanding when convex relaxations remain practically viable in deep networks, and the step-function behavior of misclassification probability offers a concrete, testable prediction for certified robustness. The provision of closed-form bounds rather than purely empirical observations strengthens the contribution to the verification literature.
major comments (2)
- [Theoretical development (around the lattice construction)] The central analytical bounds rely on the lattice being generated by successive relaxations of ReLU (or similar piecewise-linear) neurons; the manuscript should explicitly state the precise class of activation functions for which the top element is the full linearization and verify that the worst-case divergence is indeed attained there (rather than at some intermediate relaxation).
- [Derivation of the ℓ∞ bounds] The claimed exponential growth in depth is load-bearing for the main thesis; the proof should be checked for hidden constants or base that depend on the input radius or weight norms, as these would alter the practical interpretation of the bound.
minor comments (2)
- [Experiments] The experimental section should report the exact network depths, widths, and weight initialization ranges used for the random networks, together with quantitative error values (not only qualitative plots) to allow direct comparison with the analytical bounds.
- [Preliminaries] Notation for the lattice meet and join operations, as well as the precise definition of the input ball radius, should be introduced with a small diagram or table for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and positive evaluation of our work. We address each major comment below and will revise the manuscript to incorporate the requested clarifications.
read point-by-point responses
-
Referee: The central analytical bounds rely on the lattice being generated by successive relaxations of ReLU (or similar piecewise-linear) neurons; the manuscript should explicitly state the precise class of activation functions for which the top element is the full linearization and verify that the worst-case divergence is indeed attained there (rather than at some intermediate relaxation).
Authors: We agree that the activation class requires explicit statement. The lattice and bounds are derived for feedforward networks using ReLU activations, where each relaxation replaces the ReLU with its convex envelope (the line segment connecting the lower and upper pre-activation bounds). We will add a dedicated paragraph in Section 3 stating this restriction and include a short lemma showing that, for any fixed input box, the output set of the fully relaxed network contains the output set of every intermediate relaxation; consequently the ℓ∞ divergence from the original network is maximized at the top element of the lattice. revision: yes
-
Referee: The claimed exponential growth in depth is load-bearing for the main thesis; the proof should be checked for hidden constants or base that depend on the input radius or weight norms, as these would alter the practical interpretation of the bound.
Authors: We have re-examined the inductive proof of the ℓ∞ bounds (Theorem 4.1). The upper bound is of the form C · r · β^d, where r is the input radius, d the depth, C is a small numerical constant independent of r and the weights, and β ≤ max(1, ||W||_∞) depends only on the maximum absolute weight value across layers. The base β therefore does not depend on the radius; the radius enters solely as a linear factor. We will rewrite the theorem statement and the proof sketch to display this dependence explicitly, thereby removing any ambiguity about the scaling laws. revision: yes
Circularity Check
No significant circularity in analytical bounds derivation
full rationale
The paper defines a lattice of convex relaxations ordered by inclusion, with the original integer-constrained network as the bottom element and the full linearization of every neuron as the top element by construction of the relaxation ordering. It then derives analytical upper and lower bounds on the ℓ∞ output distance between these extremes, establishing the exponential dependence on depth and linear dependence on input radius through direct analysis of the worst-case divergence under this ordering. The step-like misclassification behavior follows as a consequence of the distance bounds crossing decision thresholds. No load-bearing step reduces to a fitted parameter, self-citation chain, or tautological renaming; the central results are self-contained mathematical consequences of the lattice model and network properties.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Convex relaxations of integer constraints for neural network activations form a lattice ordered by inclusion of feasible sets.
- domain assumption The top element of the lattice corresponds to linearizing every neuron independently.
Reference graph
Works this paper leans on
-
[1]
Abduction-Based Explanations for Machine Learning Models , year =
Alexey Ignatiev and Nina Narodytska and Jo. Abduction-Based Explanations for Machine Learning Models , year =. AAAI , optbooktitle =
-
[2]
Yacine Izza and Alexey Ignatiev and Peter J. Stuckey and Jo. Delivering Inflated Explanations , year =. AAAI , optbooktitle =
-
[3]
Computing Reseacrh Repository , title =
Yacine Izza and Xuanxiang Huang and Ant. Computing Reseacrh Repository , title =
-
[4]
Computing Reseacrh Repository , title =
Xuanxiang Huang and Jo. Computing Reseacrh Repository , title =
-
[5]
Min Wu and Haoze Wu and Clark W. Barrett , booktitle =. VeriX: Towards Verified Explainability of Deep Neural Networks , year =
-
[6]
Logic-Based Explainability in Machine Learning , year =
Jo. Logic-Based Explainability in Machine Learning , year =. Reasoning Web , optbooktitle =
- [7]
-
[8]
International Journal of Approximate Reasoning , title =
Xuanxiang Huang and Jo. International Journal of Approximate Reasoning , title =
-
[9]
On Tackling Explanation Redundancy in Decision Trees (Extended Abstract) , year =
Yacine Izza and Alexey Ignatiev and Jo. On Tackling Explanation Redundancy in Decision Trees (Extended Abstract) , year =. IJCAI , optbooktitle =
-
[10]
Theory of Mind Abilities of Large Language Models in Human-Robot Interaction: An Illusion? , year =
Mudit Verma and Siddhant Bhambri and Subbarao Kambhampati , booktitle =. Theory of Mind Abilities of Large Language Models in Human-Robot Interaction: An Illusion? , year =
- [11]
-
[12]
Survey of Hallucination in Natural Language Generation , year =
Ziwei Ji and Nayeon Lee and Rita Frieske and Tiezheng Yu and Dan Su and Yan Xu and Etsuko Ishii and Ye Jin Bang and Andrea Madotto and Pascale Fung , journal =. Survey of Hallucination in Natural Language Generation , year =
- [13]
-
[14]
International Journal on Software Tools for Technology Transfer , title =
Christopher Brix and Mark Niklas M. International Journal on Software Tools for Technology Transfer , title =
-
[15]
Proceedings of the ACM on Programming Languages6(POPL), 1–33 (2022)
Mark Niklas M. doi:10.1145/3498704 , journal =
-
[16]
Rudy Bunel and Jingyue Lu and Ilker Turkaslan and Philip H. S. Torr and Pushmeet Kohli and M. Pawan Kumar , bibsource =. Branch and Bound for Piecewise Linear Neural Network Verification , url =. Journal of Machine Learning Research , pages =
-
[17]
Towards Fast Computation of Certified Robustness for ReLU Networks , publisher =
Tsui. Towards Fast Computation of Certified Robustness for ReLU Networks , publisher =
-
[18]
Efficient Neural Network Robustness Certification with General Activation Functions , booktitle =
Huan Zhang and Tsui. Efficient Neural Network Robustness Certification with General Activation Functions , booktitle =
-
[19]
Kaidi Xu and Huan Zhang and Shiqi Wang and Yihan Wang and Suman Jana and Xue Lin and Cho. ICLR , title =
-
[20]
Shiqi Wang and Huan Zhang and Kaidi Xu and Xue Lin and Suman Jana and Cho. NeurIPS , title =
-
[21]
Guy Katz and Derek A. Huang and Duligur Ibeling and Kyle Julian and Christopher Lazarus and Rachel Lim and Parth Shah and Shantanu Thakoor and Haoze Wu and Aleksandar Zeljic and David L. Dill and Mykel J. Kochenderfer and Clark W. Barrett , booktitle =. The Marabou Framework for Verification and Analysis of Deep Neural Networks , year =
-
[22]
Guy Katz and Clark W. Barrett and David L. Dill and Kyle Julian and Mykel J. Kochenderfer , title =. Formal Methods Syst. Des. , year =
-
[23]
Haoze Wu and Omri Isac and Aleksandar Zeljic and Teruhiro Tagomori and Matthew L. Daggitt and Wen Kokke and Idan Refaeli and Guy Amir and Kyle Julian and Shahaf Bassan and Pei Huang and Ori Lahav and Min Wu and Min Zhang and Ekaterina Komendantskaya and Guy Katz and Clark W. Barrett , booktitle =. Marabou 2.0:
-
[24]
Algorithms for Verifying Deep Neural Networks , author=. Found. Trends Optim. , year=
-
[25]
Timon Gehr and Matthew Mirman and Dana Drachsler. 2018
work page 2018
-
[26]
Aleksander Madry and Aleksandar Makelov and Ludwig Schmidt and Dimitris Tsipras and Adrian Vladu , title =. ICLR , year =
-
[27]
Xing and Laurent El Ghaoui and Michael I
Hongyang Zhang and Yaodong Yu and Jiantao Jiao and Eric P. Xing and Laurent El Ghaoui and Michael I. Jordan , editor =. Theoretically Principled Trade-off between Robustness and Accuracy , booktitle =
-
[28]
Dimitris Tsipras and Shibani Santurkar and Logan Engstrom and Alexander Turner and Aleksander Madry , title =. ICLR , year =
-
[29]
Duchi and Percy Liang , title =
Aditi Raghunathan and Sang Michael Xie and Fanny Yang and John C. Duchi and Percy Liang , title =. ICML , series =
-
[30]
Computing Reseacrh Repository , title =
Joey Huchette and Gonzalo Mu. Computing Reseacrh Repository , title =
-
[31]
Deep neural networks and mixed integer linear optimization , year =
Matteo Fischetti and Jason Jo , journal =. Deep neural networks and mixed integer linear optimization , year =
-
[32]
Integer programming , author =
-
[33]
Stinchcombe and Halbert White , journal =
Kurt Hornik and Maxwell B. Stinchcombe and Halbert White , journal =. Multilayer feedforward networks are universal approximators , year =
- [34]
- [35]
-
[36]
Guy Katz and Clark W. Barrett and David L. Dill and Kyle Julian and Mykel J. Kochenderfer , booktitle =. Reluplex: An Efficient
- [37]
- [38]
-
[39]
Mittelstadt and Chris Russell , journal =
Sandra Wachter and Brent D. Mittelstadt and Chris Russell , journal =. Counterfactual Explanations without Opening the Black Box: Automated Decisions and the
-
[40]
Sharp Minima Can Generalize For Deep Nets , year =
Laurent Dinh and Razvan Pascanu and Samy Bengio and Yoshua Bengio , booktitle =. Sharp Minima Can Generalize For Deep Nets , year =
-
[41]
Fatemeh Alizadeh and Gunnar Stevens and Margarita Esau , journal =. I Don
-
[42]
Lilian Edwards and Michael Veale , journal =. Enslaving the Algorithm: From a "Right to an Explanation" to a "Right to Better Decisions"? , year =
-
[43]
The mnist database of handwritten digit images for machine learning research , year =
Li Deng , journal =. The mnist database of handwritten digit images for machine learning research , year =
- [44]
-
[45]
Andrew Cropper and Sebastijan Dumancic and Richard Evans and Stephen H. Muggleton , journal =. Inductive logic programming at 30 , year =
- [46]
-
[47]
Duarte and Jochen Garcke , journal =
Ribana Roscher and Bastian Bohn and Marco F. Duarte and Jochen Garcke , journal =. Explainable Machine Learning for Scientific Insights and Discoveries , year =
-
[48]
On the Decision Boundaries of Neural Networks:
Motasem Alfarra and Adel Bibi and Hasan Hammoud and Mohamed Gaafar and Bernard Ghanem , journal =. On the Decision Boundaries of Neural Networks:
-
[49]
Geometry and Topology of Deep Neural Networks' Decision Boundaries , year =
Bo Liu , journal =. Geometry and Topology of Deep Neural Networks' Decision Boundaries , year =
-
[50]
Cynthia Rudin and Caroline Wang and Beau Coker , journal =. The
-
[51]
The perceptron: A probabilistic model for information storage and organization in the brain
Frank Rosenblatt , journal =. The perceptron: A probabilistic model for information storage and organization in the brain. , year =
- [52]
-
[53]
Alston S. Householder , title =. Bulletin of Mathematical Biophysics , year =
-
[54]
A Stochastic Approximation Method , year =
Herbert Robbins and Sutton Monro , journal =. A Stochastic Approximation Method , year =
-
[55]
Paul J. Werbos , booktitle =. Backpropagation: past and future , year =
-
[56]
Rajat Raina and Anand Madhavan and Andrew Y. Ng , booktitle =. Large-scale deep unsupervised learning using graphics processors , year =
-
[57]
Alex Krizhevsky and Ilya Sutskever and Geoffrey E. Hinton , booktitle =. ImageNet Classification with Deep Convolutional Neural Networks , year =
-
[58]
Regression Shrinkage and Selection via the Lasso , year =
Robert Tibshirani , journal =. Regression Shrinkage and Selection via the Lasso , year =
-
[59]
One Pixel Attack for Fooling Deep Neural Networks , volume =
Jiawei Su and Danilo Vasconcellos Vargas and Kouichi Sakurai , journal =. One Pixel Attack for Fooling Deep Neural Networks , volume =
-
[60]
Goodfellow and Jonathon Shlens and Christian Szegedy , booktitle =
Ian J. Goodfellow and Jonathon Shlens and Christian Szegedy , booktitle =. Explaining and Harnessing Adversarial Examples , year =
-
[61]
Goodfellow and Rob Fergus , booktitle =
Christian Szegedy and Wojciech Zaremba and Ilya Sutskever and Joan Bruna and Dumitru Erhan and Ian J. Goodfellow and Rob Fergus , booktitle =. Intriguing properties of neural networks , year =
-
[62]
Aleksander Madry and Aleksandar Makelov and Ludwig Schmidt and Dimitris Tsipras and Adrian Vladu , title =. ICLR , publisher =
-
[63]
Maksym Andriushchenko and Francesco Croce and Nicolas Flammarion and Matthias Hein , booktitle =. Square Attack:
-
[64]
How Deep Learning Sees the World:
Joana Cabral Costa and Tiago Roxo and Hugo Proen. How Deep Learning Sees the World:
-
[65]
Nicholas Carlini and Guy Katz and Clark W. Barrett and David L. Dill , journal =
-
[66]
Nicolas Papernot and Patrick D. McDaniel and Ian J. Goodfellow and Somesh Jha and Z. Berkay Celik and Ananthram Swami , booktitle =. Practical Black-Box Attacks against Machine Learning , year =
-
[67]
McDaniel and Xi Wu and Somesh Jha and Ananthram Swami , booktitle =
Nicolas Papernot and Patrick D. McDaniel and Xi Wu and Somesh Jha and Ananthram Swami , booktitle =. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks , year =
-
[68]
Ensemble Adversarial Training: Attacks and Defenses , year =
Florian Tram. Ensemble Adversarial Training: Attacks and Defenses , year =. ICLR , optbooktitle =
-
[69]
Anish Athalye and Nicholas Carlini and David A. Wagner , booktitle =. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples , year =
-
[70]
Xiaowei Huang and Daniel Kroening and Wenjie Ruan and James Sharp and Youcheng Sun and Emese Thamo and Min Wu and Xinping Yi , journal =. A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability , year =
-
[71]
Mark Huasong Meng and Guangdong Bai and Sin Gee Teo and Zhe Hou and Yan Xiao and Yun Lin and Jin Song Dong , journal =. Adversarial robustness of deep neural networks: A survey from a formal verification perspective , year =
-
[72]
Towards Formal XAI: Formally Approximate Minimal Explanations of Neural Networks , year =
Shahaf Bassan and Guy Katz , booktitle =. Towards Formal XAI: Formally Approximate Minimal Explanations of Neural Networks , year =
-
[73]
A comprehensive survey of robust deep learning in computer vision , journal =. 2023 , author =
work page 2023
-
[74]
Circuit Complexity and Neural Networks , author =
- [75]
-
[76]
The MIT Press, Cambridge, expanded edition , volume=
Perceptron: an introduction to computational geometry , author=. The MIT Press, Cambridge, expanded edition , volume=
-
[77]
Huan Zhang and Hongge Chen and Chaowei Xiao and Sven Gowal and Robert Stanforth and Bo Li and Duane S. Boning and Cho. ICLR , title =
-
[78]
Mann and Pushmeet Kohli , title =
Sven Gowal and Krishnamurthy Dvijotham and Robert Stanforth and Rudy Bunel and Chongli Qin and Jonathan Uesato and Relja Arandjelovic and Timothy A. Mann and Pushmeet Kohli , title =. CoRR , year =
- [79]
-
[80]
Chen Liu and Ryota Tomioka and Volkan Cevher , title =. ICML , publisher =
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.