pith. machine review for the scientific record.

arxiv: 2605.03601 · v1 · submitted 2026-05-05 · 💻 cs.LG · cs.DM · math.CO

Recognition: unknown

Most ReLU Networks Admit Identifiable Parameters

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 04:00 UTC · model grok-4.3

classification 💻 cs.LG · cs.DM · math.CO
keywords ReLU networks · parameter identifiability · functional dimension · polyhedral complexes · realization map · scaling and permutation symmetries · depth hierarchy

The pith

ReLU networks with input and hidden widths at least two admit an open set of identifiable parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that for any deep ReLU architecture in which the input dimension and every hidden layer have width at least two, there is an open set of parameter vectors such that the realized function determines those parameters uniquely up to the usual scaling and permutation of hidden neurons. This pins the functional dimension of the network exactly to the total number of parameters minus the number of hidden neurons, showing there are no further continuous redundancies. A sympathetic reader would care because the result clarifies the precise amount of overparameterization in these models and proves that deeper networks generically realize functions that cannot be matched by shallower ones. The argument proceeds by introducing weighted polyhedral complexes to track the linear regions of the network and detect any hidden dependencies in the realization map.
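
For readers who want the bookkeeping explicit, the following minimal Python sketch (the helper names and example width lists are ours, not the paper's) tallies the parameter count of a fully connected ReLU architecture and the functional dimension the paper predicts for it.

```python
# Minimal sketch: predicted functional dimension of a fully connected ReLU
# architecture, using the paper's formula (#parameters - #hidden neurons).
# The width lists and helper names are illustrative, not from the paper.

def param_count(widths):
    """Total number of weights and biases for widths = [n_0, n_1, ..., n_L]."""
    return sum(widths[i] * widths[i + 1] + widths[i + 1] for i in range(len(widths) - 1))

def hidden_neurons(widths):
    """Number of hidden units (all layers except input and output)."""
    return sum(widths[1:-1])

def predicted_functional_dimension(widths):
    """#parameters minus #hidden neurons: the value the paper proves is attained
    on an open set when the input and every hidden layer have width >= 2."""
    return param_count(widths) - hidden_neurons(widths)

if __name__ == "__main__":
    for widths in [[2, 2, 1], [2, 3, 3, 1], [3, 4, 4, 2]]:
        print(widths, param_count(widths), predicted_functional_dimension(widths))
```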

Core claim

For every ReLU network architecture whose input and hidden layers have width at least two, there exists an open set of identifiable parameters. This implies that the functional dimension of every such architecture is exactly the number of parameters minus the number of hidden neurons. The authors reach the conclusion by analyzing the realization map through a framework of weighted polyhedral complexes that capture the arrangement of linear pieces and any additional redundancies beyond scaling and permutation symmetries. They also show that even minimal functional representations can retain non-trivial parameter redundancies and that, for an open set of parameters, the realized function cannot be represented generically by any shallower network.
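
To make "the usual scaling and permutation" concrete, the sketch below (NumPy, with an illustrative one-hidden-layer, width-3 network) checks numerically that positively rescaling a hidden neuron's incoming weights and bias while inversely rescaling its outgoing weight, or permuting hidden neurons, leaves the realized function unchanged; on the identifiable open set these are the only redundancies the result allows.

```python
# Minimal sketch of the scaling/permutation symmetries of a one-hidden-layer
# ReLU network; sizes and variable names are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)   # hidden layer (width 3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)   # output layer

def f(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x = rng.normal(size=(2,))

# Positive rescaling of neuron 0: multiply its incoming row and bias by lam,
# divide its outgoing column by lam. ReLU is positively homogeneous, so f is unchanged.
lam = 2.7
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[0], b1s[0] = lam * W1s[0], lam * b1s[0]
W2s[:, 0] = W2s[:, 0] / lam
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1s, b1s, W2s, b2))

# Permuting hidden neurons (rows of layer 1, columns of layer 2) also leaves f unchanged.
perm = [2, 0, 1]
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1[perm], b1[perm], W2[:, perm], b2))
print("scaling and permutation symmetries verified at a sample point")
```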

What carries the argument

The weighted polyhedral complex associated with a ReLU network, which records the linear regions together with weights on their facets to isolate redundancies beyond scaling and permutation.
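
The unweighted skeleton of this object is the familiar subdivision of input space into linear regions. A crude way to probe that subdivision is to sample inputs and record which ReLUs are active, as in the sketch below (NumPy, illustrative sizes); the paper's weighted complex additionally attaches weights to facets of the subdivision, which pure sampling does not capture.

```python
# Minimal sketch: estimate the linear regions of a small ReLU network by
# sampling activation patterns. Sizes and sampling box are illustrative.
import numpy as np

rng = np.random.default_rng(1)
widths = [2, 3, 3, 1]
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(widths[:-1], widths[1:])]

def activation_pattern(x):
    """Concatenated on/off pattern of every hidden ReLU at input x."""
    pattern, h = [], x
    for i, (W, b) in enumerate(params):
        z = W @ h + b
        if i < len(params) - 1:          # hidden layers only
            pattern.append(tuple(z > 0))
            h = np.maximum(z, 0.0)
        else:
            h = z
    return tuple(pattern)

samples = rng.uniform(-3, 3, size=(20000, 2))
patterns = {activation_pattern(x) for x in samples}
print(f"distinct activation patterns found: {len(patterns)}")
# Each pattern corresponds to a polyhedral cell on which the network is affine;
# together these cells form the canonical polyhedral complex of the network.
```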

If this is right

  • The functional dimension equals the total number of parameters minus the total number of hidden neurons.
  • Minimal functional representations can still possess non-trivial parameter redundancies.
  • For an open set of parameters the realized function cannot be represented generically by any shallower network.
  • Identifiability holds on a nonempty open subset of parameter space for every qualifying architecture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The result suggests that the non-identifiable parameters form a lower-dimensional subset that can be avoided by generic initialization or optimization paths.
  • The generic depth hierarchy implies that increasing depth strictly enlarges the set of realizable functions in a measure-theoretic sense.
  • One could test the prediction by sampling random parameters in small qualifying networks and verifying that the local dimension of the image matches the stated formula.
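
A minimal version of that test, under our own choices of architecture, sample points, and tolerances: estimate the local functional dimension as the rank of a finite-difference Jacobian of the evaluation map and compare it with the predicted value. Since the theorem guarantees the maximal rank only on an open set, a random parameter draw may or may not land in it.

```python
# Minimal sketch of the suggested test: estimate the local functional dimension
# of a small ReLU network as the rank of the Jacobian of the map
# theta -> (f_theta(x_1), ..., f_theta(x_m)), and compare it with
# #parameters - #hidden neurons. Architecture, sample size, and tolerances
# are illustrative choices, not taken from the paper.
import numpy as np

rng = np.random.default_rng(2)
widths = [2, 2, 2, 1]                       # input and hidden widths >= 2

def unpack(theta):
    layers, k = [], 0
    for n, m in zip(widths[:-1], widths[1:]):
        W = theta[k:k + m * n].reshape(m, n); k += m * n
        b = theta[k:k + m]; k += m
        layers.append((W, b))
    return layers

def forward(theta, X):
    H = X
    for i, (W, b) in enumerate(unpack(theta)):
        Z = H @ W.T + b
        H = Z if i == len(widths) - 2 else np.maximum(Z, 0.0)
    return H.ravel()

n_params = sum(m * n + m for n, m in zip(widths[:-1], widths[1:]))
n_hidden = sum(widths[1:-1])
theta = rng.normal(size=n_params)
X = rng.normal(size=(3 * n_params, widths[0]))   # plenty of generic sample inputs

# Finite-difference Jacobian of the evaluation map with respect to theta.
eps = 1e-6
J = np.stack([(forward(theta + eps * e, X) - forward(theta - eps * e, X)) / (2 * eps)
              for e in np.eye(n_params)], axis=1)
rank = np.linalg.matrix_rank(J, tol=1e-4)
print(f"numerical rank {rank}, predicted {n_params - n_hidden} "
      f"(= {n_params} params - {n_hidden} hidden neurons)")
```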

Load-bearing premise

The input dimension and all hidden layer widths must be at least two, and the identifiability result is stated only for an open set of parameters rather than for every parameter vector.

What would settle it

An explicit small ReLU network architecture with input and all hidden widths at least two whose functional dimension is strictly smaller than the parameter count minus the number of hidden neurons, or a direct calculation showing that, beyond the scaling and permutation orbits, the realization map has additional positive-dimensional fibers at every parameter vector.

Figures

Figures reproduced from arXiv:2605.03601 by Guido Montúfar and Moritz Grillo.

Figure 1: Illustration of weighted polyhedral complexes, the canonical polyhedral complex and the bent hyperplane arrangement. view at source ↗
Figure 2: Illustration of how transparency implies LRA, and how supertransversality can fail. view at source ↗
Figure 3: Illustration of the inductive construction in Theorem 4.10 that satisfies TPIC and LRA. view at source ↗
Figure 4: Illustration of bending and non-bending ridges. view at source ↗
Figure 5: Illustration of Lemma 5.21. view at source ↗
Figure 6: Canonical complexes for Example 5.26. Dashed lines indicate breakpoints that are not visible from […]. view at source ↗
Figure 7: Illustration of the inductive construction in (Grigsby et al., 2023) for the architecture (2, …). view at source ↗
read the original abstract

We study the realization map of deep ReLU networks, focusing on when a function determines its parameters up to scaling and permutation. To analyze hidden redundancies beyond these standard symmetries, we introduce a framework based on weighted polyhedral complexes. Our main result shows that for every architecture whose input and hidden layers have width at least two, there exists an open set of identifiable parameters. This implies that the functional dimension of every such architecture is exactly the number of parameters minus the number of hidden neurons. We further show that minimal functional representations can still have non-trivial parameter redundancies. Finally, we establish a generic depth hierarchy, whereby for an open set of parameters the realized function cannot be represented generically by any shallower network.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces a framework of weighted polyhedral complexes to analyze the realization map of deep ReLU networks. It proves that, for every architecture in which the input dimension and all hidden-layer widths are at least 2, there exists an open set of parameters that are identifiable up to the standard scaling and permutation symmetries. This yields the exact functional dimension (number of parameters minus number of hidden neurons) for such architectures. The paper further shows that even minimal functional representations can retain non-trivial parameter redundancies and establishes a generic depth hierarchy: for an open set of parameters the realized function cannot be represented by any shallower network.

Significance. If the central claims hold, the work supplies a clean geometric resolution to the question of functional dimension for ReLU networks and demonstrates that, generically, no extra local redundancies exist beyond the obvious one-dimensional scaling symmetry per neuron. The weighted-polyhedral-complex machinery is a new technical tool that directly produces both the open-set identifiability statement and the depth-separation corollary; it is likely to be reusable for related questions about piecewise-linear networks. The explicit width-≥2 hypothesis and the open-set qualifier are stated clearly, and the argument avoids circularity by relying on standard properties of polyhedral complexes rather than on fitted quantities.

minor comments (2)
  1. [§3.2] Definition 3.4: the construction of the weighted polyhedral complex is technically correct but would be easier to follow if a low-dimensional (e.g., 1-hidden-layer, width-2) example were worked out explicitly before the general case (a hedged numerical sketch in this spirit appears after this list).
  2. [§5.4] The statement of the depth-hierarchy result (Theorem 5.3) is clear, yet the proof sketch in §5.4 could usefully include a one-sentence reminder of why the open-set condition on the deeper network automatically excludes generic representations by shallower networks.
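
A rough numerical sketch in the spirit of the example requested in minor comment 1, for an illustrative one-hidden-layer, width-2 network on R^2: each neuron contributes one breakpoint line (a facet of the canonical polyhedral complex), and the jump of the output gradient across that line is one natural candidate for the facet's weight. Reading it this way is our illustration, not the paper's Definition 3.4.

```python
# Hedged sketch of a one-hidden-layer, width-2 ReLU network on R^2. We list each
# neuron's breakpoint line (a facet of the canonical polyhedral complex) and the
# jump of the output gradient across it; interpreting that jump as the facet
# "weight" is our illustrative reading, not the paper's definition.
import numpy as np

W1 = np.array([[1.0, 0.5], [-0.3, 1.2]])   # illustrative parameters
b1 = np.array([0.2, -0.4])
W2 = np.array([[0.8, -1.1]])

for i in range(2):
    # Neuron i switches on/off along the line W1[i] . x + b1[i] = 0.
    normal, offset = W1[i], b1[i]
    # On its active side the neuron contributes W2[0, i] * W1[i] to the gradient
    # of the realized function; on the inactive side it contributes nothing.
    gradient_jump = W2[0, i] * normal
    print(f"neuron {i}: facet {{x : {normal} . x + {offset} = 0}}, "
          f"gradient jump across facet = {gradient_jump}")
```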

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and detailed summary of our contributions. No major comments were raised; we will fold the two minor expository suggestions (a worked low-dimensional example for Definition 3.4 in §3.2 and a one-sentence reminder in the §5.4 proof sketch of the depth-hierarchy result) into the revision.

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The paper introduces a weighted polyhedral complex framework to study the realization map of deep ReLU networks and proves that for input and hidden widths at least 2 there exists an open set of parameters identifiable up to scaling and permutation. This directly implies the functional dimension equals the parameter count minus the hidden neuron count. The argument proceeds from geometric properties of piecewise-linear functions and the standard one-dimensional scaling symmetry per neuron, without any self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations. The width hypothesis is an explicit assumption that rules out degenerate cases, and the open-set qualifier is maintained throughout; no step equates the claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on standard properties of ReLU networks as continuous piecewise-linear functions and on the combinatorial structure of polyhedral complexes; the weighted variant is introduced as a new modeling tool without independent external validation.

axioms (2)
  • domain assumption: ReLU networks realize continuous piecewise-linear functions whose linear regions form a polyhedral complex
    Invoked throughout the analysis of the realization map and hidden redundancies.
  • standard math: Standard facts from combinatorial geometry about polyhedral complexes and their weighted refinements
    Used to define the framework and prove openness of the identifiable set.
invented entities (1)
  • weighted polyhedral complex: no independent evidence
    purpose: To encode both the linear regions and the parameter-dependent weights that determine how regions map across layers
    New modeling device introduced to detect redundancies beyond scaling and permutation; no external falsifiable prediction is supplied.

pith-pipeline@v0.9.0 · 5412 in / 1529 out tokens · 63009 ms · 2026-05-07T04:00:26.307635+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

58 extracted references · 25 canonical work pages · 1 internal anchor

  1. [1] Mary Phuong and Christoph H. Lampert. Functional vs. parametric equivalence of ReLU networks. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Bylx-TNKvH
  2. [2] Elisenda Grigsby, Kathryn Lindsey, and David Rolnick. Hidden symmetries of ReLU networks. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 11734--11760. PMLR, 23--29 Jul 2023. URL https://proceedings.mlr.press/v202/grigsby23a.html
  3. [3] David Rolnick and Konrad Kording. Reverse-engineering deep ReLU networks. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 8178--8187. PMLR, 13--18 Jul 2020. URL https://proceedings.mlr.press/v119/rolnick20a.html
  4. [4] Héctor J. Sussmann. Uniqueness of the weights for minimal feedforward nets with a given input-output map. Neural Networks, 5(4):589--593, 1992. ISSN 0893-6080. doi:10.1016/S0893-6080(05)80037-1. URL https://www.sciencedirect.com/science/article/pii/S0893608005800371
  5. [6] Henning Petzka, Martin Trimmel, and Cristian Sminchisescu. Notes on the symmetries of 2-layer ReLU-networks. In NLDL, pages 1--6, 2020. URL https://doi.org/10.7557/18.5150
  6. [7] Vahid Shahverdi, Giovanni Luca Marchetti, Georg Bökman, and Kathlén Kohn. Identifiable equivariant networks are layerwise equivariant, 2026a. URL https://arxiv.org/abs/2601.21645
  7. [8] Egor Bakaev, Florestan Brunck, Christoph Hertrich, Jack Stade, and Amir Yehudayoff. Better neural network expressivity: Subdividing the simplex, 2026. URL https://arxiv.org/abs/2505.14338
  8. [9] Věra Kůrková and Paul C. Kainen. Functionally equivalent feedforward neural networks. Neural Computation, 6(3):543--558, 1994
  9. [10] Charles Fefferman. Reconstructing a neural net from its output. Revista Matematica Iberoamericana, 10:507--555, 1994. URL https://api.semanticscholar.org/CorpusID:121350232
  10. [11] Verner Vlačić and Helmut Bölcskei. Affine symmetries and neural network identifiability. Advances in Mathematics, 376:107485, 2021. ISSN 0001-8708. doi:10.1016/j.aim.2020.107485. URL https://www.sciencedirect.com/science/article/pii/S0001870820305132
  11. [12] K. Fukumizu and S. Amari. Local minima and plateaus in hierarchical structures of multilayer perceptrons. Neural Networks, 13(3):317--327, 2000. ISSN 0893-6080. doi:10.1016/S0893-6080(00)00009-5. URL https://www.sciencedirect.com/science/article/pii/S0893608000000095
  12. [13] Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clément Hongler, Wulfram Gerstner, and Johanni Brea. Geometry of the loss landscape in overparameterized neural networks: Symmetries and invariances. In International Conference on Machine Learning, pages 9722--9732. PMLR, 2021
  13. [14] Marco Nurisso, Pierrick Leroy, Giovanni Petri, and Francesco Vaccarino. Topology and geometry of the learning space of ReLU networks: connectivity and singularities. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=O4Oy7NsSwG
  14. [15] Bo Zhao, Robin Walters, and Rose Yu. Symmetry in neural network parameter spaces, 2025. URL https://arxiv.org/abs/2506.13018
  15. [16] J. Alexander and A. Hirschowitz. Polynomial interpolation in several variables. Journal of Algebraic Geometry, 4(2):201--222, 1995. MR 1311347 (96f:14065)
  16. [17] Vahid Shahverdi, Giovanni Luca Marchetti, and Kathlén Kohn. Learning on a razor's edge: Identifiability and singularity of polynomial neural networks. In The Fourteenth International Conference on Learning Representations, 2026b. URL https://openreview.net/forum?id=L5jYWeycAx
  17. [18] Konstantin Usevich, Ricardo Augusto Borsoi, Clara Dérand, and Marianne Clausel. Identifiability of deep polynomial neural networks. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=MrUsZfQ9pC
  18. [19] Bella Finkel, Jose Israel Rodriguez, Chenxi Wu, and Thomas Yahl. Activation degree thresholds and expressiveness of polynomial neural networks. 2025. URL https://arxiv.org/abs/2408.04569
  19. [20] Hoang V. Tran, Thieu Vo, An Nguyen The, Tho Tran Huu, Minh-Khoi Nguyen-Nhat, Thanh Tran, Duy-Tung Pham, and Tan Minh Nguyen. Equivariant neural functional networks for transformers. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=uBai0ukstY
  20. [21] Guido Montúfar. Restricted Boltzmann machines: Introduction and review. In Information Geometry and Its Applications, pages 75--115, Cham, 2018. Springer International Publishing. ISBN 978-3-319-97798-0
  21. [22] Maria Angelica Cueto, Jason Morton, and Bernd Sturmfels. Geometry of the Restricted Boltzmann Machine. Algebraic Methods in Statistics and Probability II, 31:135--153, 2010. doi:10.1090/conm/516/10134. URL https://arxiv.org/abs/0908.4425
  22. [23] Guido Montúfar and Jason Morton. Dimension of marginals of Kronecker product models. SIAM Journal on Applied Algebra and Geometry, 1(1):126--151, 2017. doi:10.1137/16M1077489. URL https://doi.org/10.1137/16M1077489
  23. [24] Pranavkrishnan Ramakrishnan. A complete symmetry classification of shallow ReLU networks, 2026. URL https://arxiv.org/abs/2604.14037
  24. [25] Joachim Bona-Pellissier, François Bachoc, and François Malgouyres. Parameter identifiability of a deep feedforward ReLU neural network. Machine Learning, 112(11):4431--4493, 2023. doi:10.1007/s10994-023-06355-4. URL https://doi.org/10.1007/s10994-023-06355-4
  25. [26] Pierre Stock and Rémi Gribonval. An embedding of ReLU networks and an analysis of their identifiability. Constructive Approximation, 57(2):853--899, 2023. doi:10.1007/s00365-022-09578-1. URL https://doi.org/10.1007/s00365-022-09578-1
  26. [27] J. Elisenda Grigsby, Kathryn Lindsey, Robert Meyerhoff, and Chenxi Wu. Functional dimension of feedforward ReLU neural networks. Advances in Mathematics, 482:110636, 2025. ISSN 0001-8708. doi:10.1016/j.aim.2025.110636. URL https://www.sciencedirect.com/science/article/pii/S0001870825005341
  27. [28] J. Elisenda Grigsby and Kathryn Lindsey. On functional dimension and persistent pseudodimension, 2024. URL https://arxiv.org/abs/2410.17191
  28. [29] Yulia Alexandr and Guido Montúfar. Constraining the outputs of ReLU neural networks, 2025. URL https://arxiv.org/abs/2508.03867
  29. [30] Quynh Nguyen, Marco Mondelli, and Guido Montúfar. Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 8119--8129. PMLR, 18--24 Jul 2021. URL https://proceedings.mlr.press/v139/ngu...
  30. [31] Simone Bombari, Mohammad Hossein Amani, and Marco Mondelli. Memorization and optimization in deep neural networks with minimum over-parameterization. In Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=x8DNliTBSYY
  31. [32] Andrea Montanari and Yiqiao Zhong. The interpolation phase transition in neural networks: Memorization and generalization under lazy training. The Annals of Statistics, 50(5):2816--2847, 2022. doi:10.1214/22-AOS2211
  32. [33] Kedar Karhadkar, Michael Murray, and Guido Montúfar. Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024a. URL https://openreview.net/forum?id=mHVmsy9len
  33. [34] Kedar Karhadkar, Michael Murray, Hanna Tseran, and Guido Montúfar. Mildly overparameterized ReLU networks have a favorable loss landscape. Transactions on Machine Learning Research, 2024b. ISSN 2835-8856. URL https://openreview.net/forum?id=10WARaIwFn
  34. [35] Ronen Eldan and Ohad Shamir. The power of depth for feedforward neural networks. Volume 49 of Proceedings of Machine Learning Research, pages 907--940, Columbia University, New York, New York, USA, 2016. PMLR. URL http://proceedings.mlr.press/v49/eldan16.html
  35. [36] Matus Telgarsky. Benefits of depth in neural networks. In 29th Annual Conference on Learning Theory, volume 49 of Proceedings of Machine Learning Research, pages 1517--1539, Columbia University, New York, New York, USA, 23--26 Jun 2016. PMLR. URL https://proceedings.mlr.press/v49/telgarsky16.html
  36. [37] Hrushikesh Mhaskar, Qianli Liao, and Tomaso Poggio. When and why are deep networks better than shallow ones? In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI'17, pages 2343--2349. AAAI Press, 2017
  37. [38] Guido Montúfar, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. On the number of linear regions of deep neural networks. In Advances in Neural Information Processing Systems, volume 27. Curran Associates, Inc., 2014. URL https://proceedings.neurips.cc/paper_files/paper/2014/file/fa6f2a469cc4d61a92d96e74617c3d2a-Paper.pdf
  38. [39] Razvan Pascanu, Guido Montúfar, and Yoshua Bengio. On the number of response regions of deep feed forward networks with piece-wise linear activations. In International Conference on Learning Representations, 2014. URL https://openreview.net/forum?id=bSaT4mmQt84Lx
  39. [40] Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, and Jascha Sohl-Dickstein. On the expressive power of deep neural networks. In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 2847--2854. PMLR, August 2017. URL https://proceedings.mlr.press/v70/raghu17a.html
  40. [41] Thiago Serra, Christian Tjandraatmadja, and Srikumar Ramalingam. Bounding and counting linear regions of deep neural networks. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 4558--4566. PMLR, July 2018. URL https://proceedings.mlr.press/v80/serra18b.html
  41. [42] Randall Balestriero, Romain Cosentino, Behnaam Aazhang, and Richard Baraniuk. The geometry of deep networks: Power diagram subdivision. In Advances in Neural Information Processing Systems 32, pages 15832--15841. Curran Associates, Inc., 2019. URL http://papers.nips.cc/paper/9712-the-geometry-of-deep-networks-power-diagram-subdivision.pdf
  42. [43] Ekin Ergen and Moritz Grillo. Topological expressivity of ReLU neural networks. In Proceedings of Thirty Seventh Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 1599--1642. PMLR, 30 Jun--03 Jul 2024. URL https://proceedings.mlr.press/v247/ergen24a.html
  43. [44] Guido Montúfar, Yue Ren, and Leon Zhang. Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums. SIAM Journal on Applied Algebra and Geometry, 6(4):618--649, 2022. URL https://doi.org/10.1137/21M1413699
  44. [45] Andrei Balakin, Shelby Cox, Georg Loho, and Bernd Sturmfels. Maxout polytopes, 2025. URL https://arxiv.org/abs/2509.21286
  45. [46] Christoph Hertrich, Amitabh Basu, Marco Di Summa, and Martin Skutella. Towards lower bounds on the depth of ReLU neural networks. SIAM Journal on Discrete Mathematics, 37(2):997--1029, 2023. doi:10.1137/22M1489332. URL https://doi.org/10.1137/22M1489332
  46. [47] Christian Alexander Haase, Christoph Hertrich, and Georg Loho. Lower bounds on the depth of integral ReLU neural networks via lattice polytopes. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=2mvALOAWaxY
  47. [48] Gennadiy Averkov, Christopher Hojny, and Maximilian Merkert. On the expressiveness of rational ReLU neural networks with bounded depth. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=uREg3OHjLL
  48. [49] Moritz Leo Grillo, Christoph Hertrich, and Georg Loho. Depth-bounds for neural networks via the braid arrangement. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=XO9fhSZkBh
  49. [50] Liwen Zhang, Gregory Naitzat, and Lek-Heng Lim. Tropical geometry of deep neural networks. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 5824--5832, Stockholmsmässan, Stockholm Sweden, 10--15 Jul 2018. PMLR. URL http://proceedings.mlr.press/v80/zhang18i.html
  50. [51] Vasileios Charisopoulos and Petros Maragos. A tropical approach to neural networks with piecewise linear activations, 2018. URL https://arxiv.org/abs/1805.08749
  51. [52] Marie-Charlotte Brandenburg, Georg Loho, and Guido Montúfar. The real tropical geometry of neural networks for binary classification. Transactions on Machine Learning Research, 2024. ISSN 2835-8856. URL https://openreview.net/forum?id=I7JWf8XA2w
  52. [53] Joey Huchette, Gonzalo Muñoz, Thiago Serra, and Calvin Tsay. When deep learning meets polyhedral theory: A survey, 2023
  53. [54] Ngoc M. Tran and Jidong Wang. Minimal representations of tropical rational functions. Algebraic Statistics, 15(1):27--59, May 2024. ISSN 2693-2997. doi:10.2140/astat.2024.15.27. URL http://dx.doi.org/10.2140/astat.2024.15.27
  54. [55] Marie-Charlotte Brandenburg, Moritz Leo Grillo, and Christoph Hertrich. Decomposition polyhedra of piecewise linear functions. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=vVCHWVBsLH
  55. [56] D. Maclagan and B. Sturmfels. Introduction to Tropical Geometry. Graduate Studies in Mathematics. 2015
  56. [57] J. Elisenda Grigsby and Kathryn Lindsey. On transversality of bent hyperplane arrangements and the topological expressiveness of ReLU neural networks. SIAM Journal on Applied Algebra and Geometry, 6(2):216--242, 2022. doi:10.1137/20M1368902. URL https://doi.org/10.1137/20M1368902
  57. [58] Marissa Masden. Algorithmic determination of the combinatorial structure of the linear regions of ReLU neural networks. SIAM Journal on Applied Algebra and Geometry, 9(2):374--404, 2025. doi:10.1137/24M1646996. URL https://doi.org/10.1137/24M1646996
  58. [59] Itay Safran. A depth hierarchy for computing the maximum in ReLU networks via extremal graph theory, 2026. URL https://arxiv.org/abs/2601.01417