Parameterized Hardness of Zonotope Containment and Neural Network Verification
Pith reviewed 2026-05-21 21:58 UTC · model grok-4.3
The pith
Deciding positivity of a two-layer ReLU network is W[1]-hard when parameterized only by input dimension.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We prove that deciding positivity (and thus surjectivity) of a function f:R^d to R computed by a 2-layer ReLU network is W[1]-hard when parameterized by d. This result also implies that zonotope (non-)containment is W[1]-hard with respect to d. Moreover, we show that approximating the maximum within any multiplicative factor in 2-layer ReLU networks, computing the L_p-Lipschitz constant for p in (0, infinity] in 2-layer networks, and approximating the L_p-Lipschitz constant in 3-layer networks are NP-hard and W[1]-hard with respect to d. Our hardness results are the strongest known so far and imply that the naive enumeration-based methods for solving these fundamental problems are all essen
What carries the argument
Parameterized reduction from multicolored clique that preserves the dimension d as the sole parameter while mapping to 2-layer ReLU positivity and to zonotope containment.
Load-bearing premise
The reductions from multicolored clique to ReLU positivity and zonotope containment preserve d as the parameter and run in time f(d) times a polynomial in the input size.
What would settle it
An algorithm that decides 2-layer ReLU positivity correctly in time f(d) times a polynomial in the network size would falsify the W[1]-hardness claim.
Figures
read the original abstract
Neural networks with ReLU activations are a widely used model in machine learning. It is thus important to have a profound understanding of the properties of the functions computed by such networks. Recently, there has been increasing interest in the (parameterized) computational complexity of determining these properties. In this work, we close several gaps and resolve an open problem posted by Froese et al. [COLT '25] regarding the parameterized complexity of various problems related to network verification. In particular, we prove that deciding positivity (and thus surjectivity) of a function $f\colon\mathbb{R}^d\to\mathbb{R}$ computed by a 2-layer ReLU network is W[1]-hard when parameterized by $d$. This result also implies that zonotope (non-)containment is W[1]-hard with respect to $d$, a problem that is of independent interest in computational geometry, control theory, and robotics. Moreover, we show that approximating the maximum within any multiplicative factor in 2-layer ReLU networks, computing the $L_p$-Lipschitz constant for $p\in(0,\infty]$ in 2-layer networks, and approximating the $L_p$-Lipschitz constant in 3-layer networks are NP-hard and W[1]-hard with respect to $d$. Notably, our hardness results are the strongest known so far and imply that the naive enumeration-based methods for solving these fundamental problems are all essentially optimal under the Exponential Time Hypothesis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves W[1]-hardness (parameterized by input dimension d) for deciding positivity of functions computed by 2-layer ReLU networks, which implies the same hardness for zonotope non-containment. It further establishes NP-hardness and W[1]-hardness for approximating the maximum value, computing L_p-Lipschitz constants (p in (0,infty]) in 2-layer networks, and approximating L_p-Lipschitz constants in 3-layer networks. These are claimed to be the strongest known results and to resolve an open problem from Froese et al. [COLT '25], with implications for the optimality of enumeration algorithms under the Exponential Time Hypothesis.
Significance. If the parameterized reductions are correct, the results provide tight complexity bounds for fundamental verification and geometric problems, showing that naive f(d) n^{O(1)} algorithms are essentially optimal and closing gaps in the parameterized complexity landscape for ReLU networks and zonotopes.
major comments (3)
- [Proof sketch for 2-layer ReLU positivity hardness] The proof sketch for W[1]-hardness of 2-layer ReLU positivity (reducing from multicolored clique) must explicitly verify that the constructed network dimension d satisfies d ≤ g(k) for a computable g depending only on the clique parameter k, with the reduction running in f(k) poly(n) time. Any encoding step that ties d to the graph size n would invalidate the parameterized claim.
- [Implication to zonotope containment] For the zonotope non-containment implication, the reduction from the ReLU positivity instance must preserve the parameterization by d without introducing additional factors that grow with the original input size.
- [3-layer Lipschitz approximation hardness] The W[1]-hardness claim for approximating the L_p-Lipschitz constant in 3-layer networks requires a detailed check that the reduction from the source W[1]-hard problem keeps the network depth and dimension parameterized solely by the original parameter.
minor comments (2)
- [Abstract] Clarify in the abstract and introduction whether 'positivity' refers to strict positivity or non-negativity, and how surjectivity follows directly.
- [All hardness proof sections] Ensure all proof sketches include explicit statements on the parameter dependence to allow verification without expanding to full proofs.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. We address each major comment below, providing clarifications on the parameterization in our reductions and indicating where we will expand the presentation for clarity.
read point-by-point responses
-
Referee: [Proof sketch for 2-layer ReLU positivity hardness] The proof sketch for W[1]-hardness of 2-layer ReLU positivity (reducing from multicolored clique) must explicitly verify that the constructed network dimension d satisfies d ≤ g(k) for a computable g depending only on the clique parameter k, with the reduction running in f(k) poly(n) time. Any encoding step that ties d to the graph size n would invalidate the parameterized claim.
Authors: We agree that an explicit verification strengthens the presentation. In the full reduction from Multicolored Clique (parameterized by k) given in Section 3, the input dimension d of the constructed 2-layer ReLU network is exactly 2k: one coordinate per color class for vertex selection and one auxiliary coordinate per class for the edge-consistency check. No part of the construction encodes the graph size n into d; the number of hidden neurons is polynomial in n while d depends only on k. The reduction itself runs in time O(k^2 n^2), which is f(k)·poly(n). We will insert a dedicated paragraph immediately after the proof sketch that states the bound d ≤ 2k and the runtime explicitly. revision: yes
-
Referee: [Implication to zonotope containment] For the zonotope non-containment implication, the reduction from the ReLU positivity instance must preserve the parameterization by d without introducing additional factors that grow with the original input size.
Authors: The reduction from 2-layer ReLU positivity to zonotope non-containment (Section 4) is a direct, dimension-preserving equivalence: given a network computing f : ℝ^d → ℝ, we build a zonotope Z in exactly the same dimension d whose containment query is equivalent to positivity of f. No new coordinates or parameters that depend on the original graph size n are added. Consequently the parameterization by d is inherited verbatim. We will add one clarifying sentence to the statement of the implication to make this preservation explicit, but no substantive change to the argument is required. revision: partial
-
Referee: [3-layer Lipschitz approximation hardness] The W[1]-hardness claim for approximating the L_p-Lipschitz constant in 3-layer networks requires a detailed check that the reduction from the source W[1]-hard problem keeps the network depth and dimension parameterized solely by the original parameter.
Authors: We accept the request for an explicit check. The W[1]-hardness result for approximating L_p-Lipschitz constants of 3-layer networks (Section 5) is obtained by a reduction from the already-established 2-layer case. The source instance has dimension d = O(k); the target 3-layer network is built by adding a single fixed-depth layer whose width is polynomial in the source size but whose input dimension remains exactly the same d. Thus both depth (exactly 3) and dimension stay functions of the original parameter k only. We will append a short verification paragraph detailing these bounds in the revised manuscript. revision: yes
Circularity Check
No significant circularity; hardness via standard parameterized reductions from external W[1]-hard problems
full rationale
The paper proves W[1]-hardness for 2-layer ReLU positivity (and derived zonotope non-containment) by explicit parameterized reductions from established problems such as multicolored clique, with the constructed network dimension d bounded as a function of the source parameter k and the reduction running in f(k)·poly time. These steps are described in proof sketches that construct the network weights and biases directly from the graph instance without redefining the target quantities in terms of themselves. The citation to Froese et al. [COLT '25] merely identifies the resolved open problem and does not serve as load-bearing justification for the new reductions or claims. No self-definitional equations, fitted inputs renamed as predictions, or ansatzes smuggled via self-citation appear; the derivation is self-contained against external complexity benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math W[1]-hardness of multicolored clique or similar problems when parameterized by clique size
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We prove that deciding positivity ... of a function f:R^d→R computed by a 2-layer ReLU network is W[1]-hard when parameterized by d. This result also implies that zonotope (non-)containment is W[1]-hard with respect to d.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Nearly-Tight Bounds for Zonotope Containment and Beyond
Nearly-tight O(sqrt(d)) approximation for zonotope containment in the oracle model, with a proof of Talagrand's conjecture for constant Delta-modular zonotopes and a tight Theta(d/log d) bound for general convex bodies.
Reference graph
Works this paper leans on
-
[1]
Understanding deep neural networks with rectified linear units
Raman Arora, Amitabh Basu, Poorya Mianjy, and Anirbit Mukherjee. Understanding deep neural networks with rectified linear units. In International Conference on Learning Representations, 2018
work page 2018
-
[2]
A. E. Baburin and A. V. Pyatkin. Polynomial algorithms for solving the vector sum problem. Journal of Applied and Industrial Mathematics, 1 0 (3): 0 268--272, 2007
work page 2007
-
[3]
Bodlaender, Peter Gritzmann, Victor Klee, and Jan van Leeuwen
Hans L. Bodlaender, Peter Gritzmann, Victor Klee, and Jan van Leeuwen. Computational complexity of norm-maximization. Combinatorica, 10 0 (2): 0 203--225, 1990
work page 1990
-
[4]
End to End Learning for Self-Driving Cars
Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, Xin Zhang, Jake Zhao, and Karol Zieba. End to end learning for self-driving cars. CoRR, abs/1604.07316, 2016
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[5]
The vector balancing constant for zonotopes
Rainie Bozzai, Victor Reis, and Thomas Rothvoss. The vector balancing constant for zonotopes. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pp.\ 1292--1300. IEEE, 2023
work page 2023
-
[6]
New complexity-theoretic frontiers of tractability for neural network training
Cornelius Brand, Robert Ganian, and Mathis Rocton. New complexity-theoretic frontiers of tractability for neural network training. Advances in Neural Information Processing Systems, 36, 2023
work page 2023
-
[7]
Michael B. Cohen and Richard Peng. _p row sampling by lewis weights. In Proceedings of the 47th Annual ACM on Symposium on Theory of Computing (STOC '15) , pp.\ 183--192. ACM , 2015
work page 2015
-
[8]
Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, D \' a niel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015
work page 2015
-
[9]
Rodney G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Springer, 2013
work page 2013
-
[10]
H. Edelsbrunner, J. O’Rourke, and R. Seidel. Constructing arrangements of lines and hyperplanes with applications. SIAM Journal on Computing, 15 0 (2): 0 341--363, 1986
work page 1986
-
[11]
Efficient and accurate estimation of lipschitz constants for deep neural networks
Mahyar Fazlyab, Alexander Robey, Hamed Hassani, Manfred Morari, and George Pappas. Efficient and accurate estimation of lipschitz constants for deep neural networks. Advances in Neural Information Processing Systems, 32, 2019
work page 2019
-
[12]
Complete verification via multi-neuron relaxation guided branch-and-bound
Claudio Ferrari, Mark Niklas M \"u ller, Nikola Jovanovi \'c , and Martin Vechev. Complete verification via multi-neuron relaxation guided branch-and-bound. In International Conference on Learning Representations, 2022
work page 2022
-
[13]
J. - A. Ferrez, Komei Fukuda, and Thomas M. Liebling. Solving the fixed rank convex quadratic maximization in binary variables by a parallel zonotope construction algorithm. European Journal of Operations Research, 166 0 (1): 0 35--50, 2005
work page 2005
-
[14]
Training neural networks is np-hard in fixed dimension
Vincent Froese and Christoph Hertrich. Training neural networks is np-hard in fixed dimension. Advances in Neural Information Processing Systems, 36, 2023
work page 2023
-
[15]
The computational complexity of ReLU network training parameterized by data dimensionality
Vincent Froese, Christoph Hertrich, and Rolf Niedermeier. The computational complexity of ReLU network training parameterized by data dimensionality. Journal of Artificial Intelligence Research, 74: 0 1775--1790, 2022
work page 2022
-
[16]
Open problem: Fixed-parameter tractability of zonotope problems
Vincent Froese, Moritz Grillo, Christoph Hertrich, and Martin Skutella. Open problem: Fixed-parameter tractability of zonotope problems. In Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pp.\ 6210--6214. PMLR, 2025 a
work page 2025
-
[17]
Complexity of injectivity and verification of relu neural networks (extended abstract)
Vincent Froese, Moritz Grillo, and Martin Skutella. Complexity of injectivity and verification of relu neural networks (extended abstract). In Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pp.\ 2188--2189. PMLR, 2025 b
work page 2025
-
[18]
On the effectiveness of interval bound propagation for training verifiably robust models
Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy A. Mann, and Pushmeet Kohli. On the effectiveness of interval bound propagation for training verifiably robust models. CoRR, abs/1810.12715, 2018
-
[19]
Computing safe sets of linear sampled-data systems
Felix Gruber and Matthias Althoff. Computing safe sets of linear sampled-data systems. IEEE Control Systems Letters, 5 0 (2): 0 385--390, 2020
work page 2020
-
[20]
Scalable robust output feedback mpc of linear sampled-data systems
Felix Gruber and Matthias Althoff. Scalable robust output feedback mpc of linear sampled-data systems. In 2021 60th IEEE Conference on Decision and Control (CDC), pp.\ 2563--2570. IEEE, 2021
work page 2021
-
[21]
Deep ReLU networks have surprisingly few activation patterns
Boris Hanin and David Rolnick. Deep ReLU networks have surprisingly few activation patterns. Advances in Neural Information Processing Systems, 32, 2019
work page 2019
-
[22]
Neural networks and (virtual) extended formulations
Christoph Hertrich and Georg Loho. Neural networks and (virtual) extended formulations. CoRR, abs/2411.03006, 2024
-
[23]
Russell Impagliazzo and Ramamohan Paturi. On the complexity of k -SAT . Journal of Computer and System Sciences, 62 0 (2): 0 367--375, 2001
work page 2001
-
[24]
Exactly computing the local lipschitz constant of relu networks
Matt Jordan and Alexandros G Dimakis. Exactly computing the local lipschitz constant of relu networks. Advances in Neural Information Processing Systems, 33, 2020
work page 2020
-
[25]
Zonotope domains for lagrangian neural network verification
Matt Jordan, Jonathan Hayase, Alex Dimakis, and Sewoong Oh. Zonotope domains for lagrangian neural network verification. Advances in Neural Information Processing Systems, 35, 2022
work page 2022
-
[26]
Guy Katz, Clark W. Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Reluplex: a calculus for reasoning about deep neural networks. Formal Methods in System Design, 60 0 (1): 0 87--116, 2022
work page 2022
-
[27]
Towards scalable complete verification of ReLU neural networks via dependency-based branching
Panagiotis Kouvaros and Alessio Lomuscio. Towards scalable complete verification of ReLU neural networks via dependency-based branching. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI '21), pp.\ 2643--2650, 2021
work page 2021
-
[28]
On the co-NP -completeness of the zonotope containment problem
Adrian Kulmburg and Matthias Althoff. On the co-NP -completeness of the zonotope containment problem. European Journal of Control, 62: 0 84--91, 2021
work page 2021
-
[29]
Approximability of the containment problem for zonotopes and ellipsotopes
Adrian Kulmburg, Lukas Schafer, and Matthias Althoff. Approximability of the containment problem for zonotopes and ellipsotopes. IEEE Transactions on Automatic Control, 2025
work page 2025
-
[30]
Abdul Majid Mian and S Chowla. On the b2 sequences of sidon. Proc. Nat. Acad. Sci. India. Sect. A, 14: 0 3--4, 1944
work page 1944
-
[31]
On the number of linear regions of deep neural networks
Guido Mont \'u far, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. On the number of linear regions of deep neural networks. Advances in Neural Information Processing Systems, 27, 2014
work page 2014
-
[32]
A complete annotated bibliography of work related to sidon sequences
Kevin O'Bryant. A complete annotated bibliography of work related to sidon sequences. The Electronic Journal of Combinatorics [electronic only], 11, 2004
work page 2004
-
[33]
Advances in verification of ReLU neural networks
Ansgar R \" o ssig and Milena Petkovic. Advances in verification of ReLU neural networks. Journal of Global Optimization, 81 0 (1): 0 109--152, 2021
work page 2021
-
[34]
Linear encodings for polytope containment problems
Sadra Sadraddini and Russ Tedrake. Linear encodings for polytope containment problems. In 2019 IEEE 58th conference on decision and control (CDC), pp.\ 4367--4372. IEEE, 2019
work page 2019
-
[35]
Reachability in simple neural networks
Marco S \" a lzer and Martin Lange. Reachability in simple neural networks. Fundamenta Informaticae, 189 0 (3-4): 0 241--259, 2022
work page 2022
- [36]
-
[37]
Complexity and approximation of finding the longest vector sum
Vladimir Shenmaier. Complexity and approximation of finding the longest vector sum. Computational Mathematics and Mathematical Physics, 58 0 (6): 0 850--857, 2018
work page 2018
-
[38]
Complexity and algorithms for finding a subset of vectors with the longest sum
Vladimir Shenmaier. Complexity and algorithms for finding a subset of vectors with the longest sum. Theoretical Computer Science, 818: 0 60--73, 2020
work page 2020
-
[39]
An abstract domain for certifying neural networks
Gagandeep Singh, Timon Gehr, Markus P\" u schel, and Martin Vechev. An abstract domain for certifying neural networks. Proceedings of the ACM on Programming Languages, 3 0 (POPL), 2019
work page 2019
-
[40]
The computational complexity of counting linear regions in ReLU neural networks
Moritz Stargalla, Christoph Hertrich, and Daniel Reichman. The computational complexity of counting linear regions in ReLU neural networks. arXiv preprint arXiv:2505.16716, 2025. Accepted for publication at NeurIPS 2025
-
[41]
Alfred St \"o hr. Gel \"o ste und ungel \"o ste fragen \"u ber basen der nat \"u rlichen zahlenreihe. ii. Journal f \"u r die reine und angewandte Mathematik , 194: 0 111--140, 1955
work page 1955
-
[42]
Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014
work page 2014
-
[43]
Lipschitz regularity of deep neural networks: analysis and efficient estimation
Aladin Virmaux and Kevin Scaman. Lipschitz regularity of deep neural networks: analysis and efficient estimation. Advances in Neural Information Processing Systems, 31, 2018
work page 2018
-
[44]
Towards fast computation of certified robustness for relu networks
Lily Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane Boning, and Inderjit Dhillon. Towards fast computation of certified robustness for relu networks. In International Conference on Machine Learning, 2018 a
work page 2018
-
[45]
Evaluating the robustness of neural networks: An extreme value theory approach
Tsui-Wei Weng, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh, and Luca Daniel. Evaluating the robustness of neural networks: An extreme value theory approach. In International Conference on Learning Representations, 2018 b
work page 2018
-
[46]
Scaling provable adversarial defenses
Eric Wong, Frank Schmidt, Jan Hendrik Metzen, and J Zico Kolter. Scaling provable adversarial defenses. Advances in Neural Information Processing Systems, 31, 2018
work page 2018
-
[47]
Efficient backward reachability using the minkowski difference of constrained zonotopes
Liren Yang, Hang Zhang, Jean-Baptiste Jeannin, and Necmiye Ozay. Efficient backward reachability using the minkowski difference of constrained zonotopes. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41 0 (11): 0 3969--3980, 2022
work page 2022
-
[48]
Facing up to arrangements: face-count formulas for partitions of space by hyperplanes
Thomas Zaslavsky. Facing up to arrangements: face-count formulas for partitions of space by hyperplanes. Memoirs of the American Mathematical Society, 1 0 (154), 1975
work page 1975
-
[49]
General cutting planes for bound-propagation-based neural network verification
Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, and J Zico Kolter. General cutting planes for bound-propagation-based neural network verification. Advances in Neural Information Processing Systems, 35, 2022
work page 2022
-
[50]
" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...
-
[51]
\@ifxundefined[1] #1\@undefined \@firstoftwo \@secondoftwo \@ifnum[1] #1 \@firstoftwo \@secondoftwo \@ifx[1] #1 \@firstoftwo \@secondoftwo [2] @ #1 \@temptokena #2 #1 @ \@temptokena \@ifclassloaded agu2001 natbib The agu2001 class already includes natbib coding, so you should not add it explicitly Type <Return> for now, but then later remove the command n...
-
[52]
\@lbibitem[] @bibitem@first@sw\@secondoftwo \@lbibitem[#1]#2 \@extra@b@citeb \@ifundefined br@#2\@extra@b@citeb \@namedef br@#2 \@nameuse br@#2\@extra@b@citeb \@ifundefined b@#2\@extra@b@citeb @num @parse #2 @tmp #1 NAT@b@open@#2 NAT@b@shut@#2 \@ifnum @merge>\@ne @bibitem@first@sw \@firstoftwo \@ifundefined NAT@b*@#2 \@firstoftwo @num @NAT@ctr \@secondoft...
-
[53]
@open @close @open @close and [1] URL: #1 \@ifundefined chapter * \@mkboth \@ifxundefined @sectionbib * \@mkboth * \@mkboth\@gobbletwo \@ifclassloaded amsart * \@ifclassloaded amsbook * \@ifxundefined @heading @heading NAT@ctr thebibliography [1] @ \@biblabel @NAT@ctr \@bibsetup #1 @NAT@ctr @ @openbib .11em \@plus.33em \@minus.07em 4000 4000 `\.\@m @bibit...
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.