pith. machine review for the scientific record.

arxiv: 2605.03601 · v1 · submitted 2026-05-05 · 💻 cs.LG · cs.DM · math.CO

Recognition: unknown

Most ReLU Networks Admit Identifiable Parameters

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 04:00 UTC · model grok-4.3

classification 💻 cs.LG · cs.DM · math.CO
keywords ReLU networks · parameter identifiability · functional dimension · polyhedral complexes · realization map · scaling and permutation symmetries · depth hierarchy

The pith

ReLU networks with input and hidden widths at least two admit an open set of identifiable parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that for any deep ReLU architecture in which the input dimension and every hidden layer have width at least two, there is an open set of parameter vectors such that the realized function determines those parameters uniquely up to the usual scaling and permutation of hidden neurons. This pins the functional dimension of the network exactly to the total number of parameters minus the number of hidden neurons, showing there are no further continuous redundancies. A sympathetic reader would care because the result clarifies the precise amount of overparameterization in these models and proves that deeper networks generically realize functions that cannot be matched by shallower ones. The argument proceeds by introducing weighted polyhedral complexes to track the linear regions of the network and detect any hidden dependencies in the realization map.
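
For readers who want the bookkeeping explicit, the following minimal Python sketch (the helper names and example width lists are ours, not the paper's) tallies the parameter count of a fully connected ReLU architecture and the functional dimension the paper predicts for it.

```python
# Minimal sketch: predicted functional dimension of a fully connected ReLU
# architecture, using the paper's formula (#parameters - #hidden neurons).
# The width lists and helper names are illustrative, not from the paper.

def param_count(widths):
    """Total number of weights and biases for widths = [n_0, n_1, ..., n_L]."""
    return sum(widths[i] * widths[i + 1] + widths[i + 1] for i in range(len(widths) - 1))

def hidden_neurons(widths):
    """Number of hidden units (all layers except input and output)."""
    return sum(widths[1:-1])

def predicted_functional_dimension(widths):
    """#parameters minus #hidden neurons: the value the paper proves is attained
    on an open set when the input and every hidden layer have width >= 2."""
    return param_count(widths) - hidden_neurons(widths)

if __name__ == "__main__":
    for widths in [[2, 2, 1], [2, 3, 3, 1], [3, 4, 4, 2]]:
        print(widths, param_count(widths), predicted_functional_dimension(widths))
```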

Core claim

For every ReLU network architecture whose input and hidden layers have width at least two, there exists an open set of identifiable parameters. This implies that the functional dimension of every such architecture is exactly the number of parameters minus the number of hidden neurons. The authors reach the conclusion by analyzing the realization map through a framework of weighted polyhedral complexes that capture the arrangement of linear pieces and any additional redundancies beyond scaling and permutation symmetries. They also show that even minimal functional representations can retain non-trivial parameter redundancies and that, for an open set of parameters, the realized function cannot be represented generically by any shallower network.
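
To make "the usual scaling and permutation" concrete, the sketch below (NumPy, with an illustrative one-hidden-layer, width-3 network) checks numerically that positively rescaling a hidden neuron's incoming weights and bias while inversely rescaling its outgoing weight, or permuting hidden neurons, leaves the realized function unchanged; on the identifiable open set these are the only redundancies the result allows.

```python
# Minimal sketch of the scaling/permutation symmetries of a one-hidden-layer
# ReLU network; sizes and variable names are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)   # hidden layer (width 3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)   # output layer

def f(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x = rng.normal(size=(2,))

# Positive rescaling of neuron 0: multiply its incoming row and bias by lam,
# divide its outgoing column by lam. ReLU is positively homogeneous, so f is unchanged.
lam = 2.7
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[0], b1s[0] = lam * W1s[0], lam * b1s[0]
W2s[:, 0] = W2s[:, 0] / lam
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1s, b1s, W2s, b2))

# Permuting hidden neurons (rows of layer 1, columns of layer 2) also leaves f unchanged.
perm = [2, 0, 1]
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1[perm], b1[perm], W2[:, perm], b2))
print("scaling and permutation symmetries verified at a sample point")
```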

What carries the argument

The weighted polyhedral complex associated with a ReLU network, which records the linear regions together with weights on their facets to isolate redundancies beyond scaling and permutation.
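
The unweighted skeleton of this object is the familiar subdivision of input space into linear regions. A crude way to probe that subdivision is to sample inputs and record which ReLUs are active, as in the sketch below (NumPy, illustrative sizes); the paper's weighted complex additionally attaches weights to facets of the subdivision, which pure sampling does not capture.

```python
# Minimal sketch: estimate the linear regions of a small ReLU network by
# sampling activation patterns. Sizes and sampling box are illustrative.
import numpy as np

rng = np.random.default_rng(1)
widths = [2, 3, 3, 1]
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(widths[:-1], widths[1:])]

def activation_pattern(x):
    """Concatenated on/off pattern of every hidden ReLU at input x."""
    pattern, h = [], x
    for i, (W, b) in enumerate(params):
        z = W @ h + b
        if i < len(params) - 1:          # hidden layers only
            pattern.append(tuple(z > 0))
            h = np.maximum(z, 0.0)
        else:
            h = z
    return tuple(pattern)

samples = rng.uniform(-3, 3, size=(20000, 2))
patterns = {activation_pattern(x) for x in samples}
print(f"distinct activation patterns found: {len(patterns)}")
# Each pattern corresponds to a polyhedral cell on which the network is affine;
# together these cells form the canonical polyhedral complex of the network.
```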

If this is right

  • The functional dimension equals the total number of parameters minus the total number of hidden neurons.
  • Minimal functional representations can still possess non-trivial parameter redundancies.
  • For an open set of parameters the realized function cannot be represented generically by any shallower network.
  • Identifiability holds on a nonempty open subset of parameter space for every qualifying architecture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The result suggests that the non-identifiable parameters form a lower-dimensional subset that can be avoided by generic initialization or optimization paths.
  • The generic depth hierarchy implies that increasing depth strictly enlarges the set of realizable functions in a measure-theoretic sense.
  • One could test the prediction by sampling random parameters in small qualifying networks and verifying that the local dimension of the image matches the stated formula.
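
A minimal version of that test, under our own choices of architecture, sample points, and tolerances: estimate the local functional dimension as the rank of a finite-difference Jacobian of the evaluation map and compare it with the predicted value. Since the theorem guarantees the maximal rank only on an open set, a random parameter draw may or may not land in it.

```python
# Minimal sketch of the suggested test: estimate the local functional dimension
# of a small ReLU network as the rank of the Jacobian of the map
# theta -> (f_theta(x_1), ..., f_theta(x_m)), and compare it with
# #parameters - #hidden neurons. Architecture, sample size, and tolerances
# are illustrative choices, not taken from the paper.
import numpy as np

rng = np.random.default_rng(2)
widths = [2, 2, 2, 1]                       # input and hidden widths >= 2

def unpack(theta):
    layers, k = [], 0
    for n, m in zip(widths[:-1], widths[1:]):
        W = theta[k:k + m * n].reshape(m, n); k += m * n
        b = theta[k:k + m]; k += m
        layers.append((W, b))
    return layers

def forward(theta, X):
    H = X
    for i, (W, b) in enumerate(unpack(theta)):
        Z = H @ W.T + b
        H = Z if i == len(widths) - 2 else np.maximum(Z, 0.0)
    return H.ravel()

n_params = sum(m * n + m for n, m in zip(widths[:-1], widths[1:]))
n_hidden = sum(widths[1:-1])
theta = rng.normal(size=n_params)
X = rng.normal(size=(3 * n_params, widths[0]))   # plenty of generic sample inputs

# Finite-difference Jacobian of the evaluation map with respect to theta.
eps = 1e-6
J = np.stack([(forward(theta + eps * e, X) - forward(theta - eps * e, X)) / (2 * eps)
              for e in np.eye(n_params)], axis=1)
rank = np.linalg.matrix_rank(J, tol=1e-4)
print(f"numerical rank {rank}, predicted {n_params - n_hidden} "
      f"(= {n_params} params - {n_hidden} hidden neurons)")
```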

Load-bearing premise

The input dimension and all hidden layer widths must be at least two, and the identifiability result is stated only for an open set of parameters rather than for every parameter vector.

What would settle it

An explicit small ReLU network architecture with input and all hidden widths at least two whose functional dimension is strictly smaller than the parameter count minus the number of hidden neurons, or a direct calculation showing that, beyond the scaling and permutation orbits, the realization map has additional positive-dimensional fibers at every parameter vector.

Figures

Figures reproduced from arXiv:2605.03601 by Guido Montúfar and Moritz Grillo.

Figure 1: Illustration of weighted polyhedral complexes, the canonical polyhedral complex and the bent hyperplane arrangement. view at source ↗
Figure 2: Illustration of how transparency implies LRA, and how supertransversality can fail. view at source ↗
Figure 3: Illustration of the inductive construction in Theorem 4.10 that satisfies TPIC and LRA. view at source ↗
Figure 4: Illustration of bending and non-bending ridges. view at source ↗
Figure 5: Illustration of Lemma 5.21. view at source ↗
Figure 6: Canonical complexes for Example 5.26. Dashed lines indicate breakpoints that are not visible from […]. view at source ↗
Figure 7: Illustration of the inductive construction in (Grigsby et al., 2023) for the architecture (2, …). view at source ↗
read the original abstract

We study the realization map of deep ReLU networks, focusing on when a function determines its parameters up to scaling and permutation. To analyze hidden redundancies beyond these standard symmetries, we introduce a framework based on weighted polyhedral complexes. Our main result shows that for every architecture whose input and hidden layers have width at least two, there exists an open set of identifiable parameters. This implies that the functional dimension of every such architecture is exactly the number of parameters minus the number of hidden neurons. We further show that minimal functional representations can still have non-trivial parameter redundancies. Finally, we establish a generic depth hierarchy, whereby for an open set of parameters the realized function cannot be represented generically by any shallower network.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces a framework of weighted polyhedral complexes to analyze the realization map of deep ReLU networks. It proves that, for every architecture in which the input dimension and all hidden-layer widths are at least 2, there exists an open set of parameters that are identifiable up to the standard scaling and permutation symmetries. This yields the exact functional dimension (number of parameters minus number of hidden neurons) for such architectures. The paper further shows that even minimal functional representations can retain non-trivial parameter redundancies and establishes a generic depth hierarchy: for an open set of parameters the realized function cannot be represented by any shallower network.

Significance. If the central claims hold, the work supplies a clean geometric resolution to the question of functional dimension for ReLU networks and demonstrates that, generically, no extra local redundancies exist beyond the obvious one-dimensional scaling symmetry per neuron. The weighted-polyhedral-complex machinery is a new technical tool that directly produces both the open-set identifiability statement and the depth-separation corollary; it is likely to be reusable for related questions about piecewise-linear networks. The explicit width-≥2 hypothesis and the open-set qualifier are stated clearly, and the argument avoids circularity by relying on standard properties of polyhedral complexes rather than on fitted quantities.

minor comments (2)
  1. [§3.2] Definition 3.4: the construction of the weighted polyhedral complex is technically correct but would be easier to follow if a low-dimensional (e.g., 1-hidden-layer, width-2) example were worked out explicitly before the general case (a hedged numerical sketch in this spirit appears after this list).
  2. [§5.4] The statement of the depth-hierarchy result (Theorem 5.3) is clear, yet the proof sketch in §5.4 could usefully include a one-sentence reminder of why the open-set condition on the deeper network automatically excludes generic representations by shallower networks.
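
A rough numerical sketch in the spirit of the example requested in minor comment 1, for an illustrative one-hidden-layer, width-2 network on R^2: each neuron contributes one breakpoint line (a facet of the canonical polyhedral complex), and the jump of the output gradient across that line is one natural candidate for the facet's weight. Reading it this way is our illustration, not the paper's Definition 3.4.

```python
# Hedged sketch of a one-hidden-layer, width-2 ReLU network on R^2. We list each
# neuron's breakpoint line (a facet of the canonical polyhedral complex) and the
# jump of the output gradient across it; interpreting that jump as the facet
# "weight" is our illustrative reading, not the paper's definition.
import numpy as np

W1 = np.array([[1.0, 0.5], [-0.3, 1.2]])   # illustrative parameters
b1 = np.array([0.2, -0.4])
W2 = np.array([[0.8, -1.1]])

for i in range(2):
    # Neuron i switches on/off along the line W1[i] . x + b1[i] = 0.
    normal, offset = W1[i], b1[i]
    # On its active side the neuron contributes W2[0, i] * W1[i] to the gradient
    # of the realized function; on the inactive side it contributes nothing.
    gradient_jump = W2[0, i] * normal
    print(f"neuron {i}: facet {{x : {normal} . x + {offset} = 0}}, "
          f"gradient jump across facet = {gradient_jump}")
```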

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and detailed summary of our contributions. No major comments were raised; we will fold the two minor expository suggestions (a worked low-dimensional example for Definition 3.4 in §3.2 and a one-sentence reminder in the §5.4 proof sketch of the depth-hierarchy result) into the revision.

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The paper introduces a weighted polyhedral complex framework to study the realization map of deep ReLU networks and proves that for input and hidden widths at least 2 there exists an open set of parameters identifiable up to scaling and permutation. This directly implies the functional dimension equals the parameter count minus the hidden neuron count. The argument proceeds from geometric properties of piecewise-linear functions and the standard one-dimensional scaling symmetry per neuron, without any self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations. The width hypothesis is an explicit assumption that rules out degenerate cases, and the open-set qualifier is maintained throughout; no step equates the claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on standard properties of ReLU networks as continuous piecewise-linear functions and on the combinatorial structure of polyhedral complexes; the weighted variant is introduced as a new modeling tool without independent external validation.

axioms (2)
  • domain assumption: ReLU networks realize continuous piecewise-linear functions whose linear regions form a polyhedral complex
    Invoked throughout the analysis of the realization map and hidden redundancies.
  • standard math: Standard facts from combinatorial geometry about polyhedral complexes and their weighted refinements
    Used to define the framework and prove openness of the identifiable set.
invented entities (1)
  • weighted polyhedral complex: no independent evidence
    purpose: To encode both the linear regions and the parameter-dependent weights that determine how regions map across layers
    New modeling device introduced to detect redundancies beyond scaling and permutation; no external falsifiable prediction is supplied.

pith-pipeline@v0.9.0 · 5412 in / 1529 out tokens · 63009 ms · 2026-05-07T04:00:26.307635+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

58 extracted references · 25 canonical work pages · 1 internal anchor

  1. [1] Mary Phuong and Christoph H. Lampert. Functional vs. parametric equivalence of ReLU networks. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Bylx-TNKvH
  2. [2] Elisenda Grigsby, Kathryn Lindsey, and David Rolnick. Hidden symmetries of ReLU networks. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 11734--11760. PMLR, 23--29 Jul 2023. URL https://proceedings.mlr.press/v202/grigsby23a.html
  3. [3] David Rolnick and Konrad Kording. Reverse-engineering deep ReLU networks. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 8178--8187. PMLR, 13--18 Jul 2020. URL https://proceedings.mlr.press/v119/rolnick20a.html
  4. [4] Héctor J. Sussmann. Uniqueness of the weights for minimal feedforward nets with a given input-output map. Neural Networks, 5(4):589--593, 1992. ISSN 0893-6080. doi:10.1016/S0893-6080(05)80037-1. URL https://www.sciencedirect.com/science/article/pii/S0893608005800371
  5. [6] Henning Petzka, Martin Trimmel, and Cristian Sminchisescu. Notes on the symmetries of 2-layer ReLU-networks. In NLDL, pages 1--6, 2020. URL https://doi.org/10.7557/18.5150
  6. [7] Vahid Shahverdi, Giovanni Luca Marchetti, Georg Bökman, and Kathlén Kohn. Identifiable equivariant networks are layerwise equivariant, 2026a. URL https://arxiv.org/abs/2601.21645
  7. [8] Egor Bakaev, Florestan Brunck, Christoph Hertrich, Jack Stade, and Amir Yehudayoff. Better neural network expressivity: Subdividing the simplex, 2026. URL https://arxiv.org/abs/2505.14338
  8. [9] Věra Kůrková and Paul C. Kainen. Functionally equivalent feedforward neural networks. Neural Computation, 6(3):543--558, 1994
  9. [10] Charles Fefferman. Reconstructing a neural net from its output. Revista Matematica Iberoamericana, 10:507--555, 1994. URL https://api.semanticscholar.org/CorpusID:121350232
  10. [11] Verner Vlačić and Helmut Bölcskei. Affine symmetries and neural network identifiability. Advances in Mathematics, 376:107485, 2021. ISSN 0001-8708. doi:10.1016/j.aim.2020.107485. URL https://www.sciencedirect.com/science/article/pii/S0001870820305132
  11. [12] K. Fukumizu and S. Amari. Local minima and plateaus in hierarchical structures of multilayer perceptrons. Neural Networks, 13(3):317--327, 2000. ISSN 0893-6080. doi:10.1016/S0893-6080(00)00009-5. URL https://www.sciencedirect.com/science/article/pii/S0893608000000095
  12. [13] Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clément Hongler, Wulfram Gerstner, and Johanni Brea. Geometry of the loss landscape in overparameterized neural networks: Symmetries and invariances. In International Conference on Machine Learning, pages 9722--9732. PMLR, 2021
  13. [14] Marco Nurisso, Pierrick Leroy, Giovanni Petri, and Francesco Vaccarino. Topology and geometry of the learning space of ReLU networks: connectivity and singularities. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=O4Oy7NsSwG
  14. [15] Bo Zhao, Robin Walters, and Rose Yu. Symmetry in neural network parameter spaces, 2025. URL https://arxiv.org/abs/2506.13018
  15. [16] J. Alexander and A. Hirschowitz. Polynomial interpolation in several variables. Journal of Algebraic Geometry, 4(2):201--222, 1995. MR 1311347 (96f:14065)
  16. [17] Vahid Shahverdi, Giovanni Luca Marchetti, and Kathlén Kohn. Learning on a razor's edge: Identifiability and singularity of polynomial neural networks. In The Fourteenth International Conference on Learning Representations, 2026b. URL https://openreview.net/forum?id=L5jYWeycAx
  17. [18] Konstantin Usevich, Ricardo Augusto Borsoi, Clara Dérand, and Marianne Clausel. Identifiability of deep polynomial neural networks. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=MrUsZfQ9pC
  18. [19] Bella Finkel, Jose Israel Rodriguez, Chenxi Wu, and Thomas Yahl. Activation degree thresholds and expressiveness of polynomial neural networks. 2025. URL https://arxiv.org/abs/2408.04569
  19. [20] Hoang V. Tran, Thieu Vo, An Nguyen The, Tho Tran Huu, Minh-Khoi Nguyen-Nhat, Thanh Tran, Duy-Tung Pham, and Tan Minh Nguyen. Equivariant neural functional networks for transformers. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=uBai0ukstY
  20. [21] Guido Montúfar. Restricted Boltzmann machines: Introduction and review. In Information Geometry and Its Applications, pages 75--115, Cham, 2018. Springer International Publishing. ISBN 978-3-319-97798-0
  21. [22] Maria Angelica Cueto, Jason Morton, and Bernd Sturmfels. Geometry of the Restricted Boltzmann Machine. Algebraic Methods in Statistics and Probability II, 31:135--153, 2010. doi:10.1090/conm/516/10134. URL https://arxiv.org/abs/0908.4425
  22. [23] Guido Montúfar and Jason Morton. Dimension of marginals of Kronecker product models. SIAM Journal on Applied Algebra and Geometry, 1(1):126--151, 2017. doi:10.1137/16M1077489. URL https://doi.org/10.1137/16M1077489
  23. [24] Pranavkrishnan Ramakrishnan. A complete symmetry classification of shallow ReLU networks, 2026. URL https://arxiv.org/abs/2604.14037
  24. [25] Joachim Bona-Pellissier, François Bachoc, and François Malgouyres. Parameter identifiability of a deep feedforward ReLU neural network. Machine Learning, 112(11):4431--4493, 2023. doi:10.1007/s10994-023-06355-4. URL https://doi.org/10.1007/s10994-023-06355-4
  25. [26] Pierre Stock and Rémi Gribonval. An embedding of ReLU networks and an analysis of their identifiability. Constructive Approximation, 57(2):853--899, 2023. doi:10.1007/s00365-022-09578-1. URL https://doi.org/10.1007/s00365-022-09578-1
  26. [27] J. Elisenda Grigsby, Kathryn Lindsey, Robert Meyerhoff, and Chenxi Wu. Functional dimension of feedforward ReLU neural networks. Advances in Mathematics, 482:110636, 2025. ISSN 0001-8708. doi:10.1016/j.aim.2025.110636. URL https://www.sciencedirect.com/science/article/pii/S0001870825005341
  27. [28] J. Elisenda Grigsby and Kathryn Lindsey. On functional dimension and persistent pseudodimension, 2024. URL https://arxiv.org/abs/2410.17191
  28. [29] Yulia Alexandr and Guido Montúfar. Constraining the outputs of ReLU neural networks, 2025. URL https://arxiv.org/abs/2508.03867
  29. [30] Quynh Nguyen, Marco Mondelli, and Guido Montúfar. Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 8119--8129. PMLR, 18--24 Jul 2021. URL https://proceedings.mlr.press/v139/ngu...
  30. [31] Simone Bombari, Mohammad Hossein Amani, and Marco Mondelli. Memorization and optimization in deep neural networks with minimum over-parameterization. In Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=x8DNliTBSYY
  31. [32] Andrea Montanari and Yiqiao Zhong. The interpolation phase transition in neural networks: Memorization and generalization under lazy training. The Annals of Statistics, 50(5):2816--2847, 2022. doi:10.1214/22-AOS2211
  32. [33] Kedar Karhadkar, Michael Murray, and Guido Montúfar. Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024a. URL https://openreview.net/forum?id=mHVmsy9len
  33. [34] Kedar Karhadkar, Michael Murray, Hanna Tseran, and Guido Montúfar. Mildly overparameterized ReLU networks have a favorable loss landscape. Transactions on Machine Learning Research, 2024b. ISSN 2835-8856. URL https://openreview.net/forum?id=10WARaIwFn
  34. [35] Ronen Eldan and Ohad Shamir. The power of depth for feedforward neural networks. Volume 49 of Proceedings of Machine Learning Research, pages 907--940, Columbia University, New York, New York, USA, 2016. PMLR. URL http://proceedings.mlr.press/v49/eldan16.html
  35. [36] Matus Telgarsky. Benefits of depth in neural networks. In 29th Annual Conference on Learning Theory, volume 49 of Proceedings of Machine Learning Research, pages 1517--1539, Columbia University, New York, New York, USA, 23--26 Jun 2016. PMLR. URL https://proceedings.mlr.press/v49/telgarsky16.html
  36. [37] Hrushikesh Mhaskar, Qianli Liao, and Tomaso Poggio. When and why are deep networks better than shallow ones? In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI'17, pages 2343--2349. AAAI Press, 2017
  37. [38] Guido Montúfar, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. On the number of linear regions of deep neural networks. In Advances in Neural Information Processing Systems, volume 27. Curran Associates, Inc., 2014. URL https://proceedings.neurips.cc/paper_files/paper/2014/file/fa6f2a469cc4d61a92d96e74617c3d2a-Paper.pdf
  38. [39] Razvan Pascanu, Guido Montúfar, and Yoshua Bengio. On the number of response regions of deep feed forward networks with piece-wise linear activations. In International Conference on Learning Representations, 2014. URL https://openreview.net/forum?id=bSaT4mmQt84Lx
  39. [40] Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, and Jascha Sohl-Dickstein. On the expressive power of deep neural networks. In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 2847--2854. PMLR, August 2017. URL https://proceedings.mlr.press/v70/raghu17a.html
  40. [41] Thiago Serra, Christian Tjandraatmadja, and Srikumar Ramalingam. Bounding and counting linear regions of deep neural networks. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 4558--4566. PMLR, July 2018. URL https://proceedings.mlr.press/v80/serra18b.html
  41. [42] Randall Balestriero, Romain Cosentino, Behnaam Aazhang, and Richard Baraniuk. The geometry of deep networks: Power diagram subdivision. In Advances in Neural Information Processing Systems 32, pages 15832--15841. Curran Associates, Inc., 2019. URL http://papers.nips.cc/paper/9712-the-geometry-of-deep-networks-power-diagram-subdivision.pdf
  42. [43] Ekin Ergen and Moritz Grillo. Topological expressivity of ReLU neural networks. In Proceedings of Thirty Seventh Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 1599--1642. PMLR, 30 Jun--03 Jul 2024. URL https://proceedings.mlr.press/v247/ergen24a.html
  43. [44] Guido Montúfar, Yue Ren, and Leon Zhang. Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums. SIAM Journal on Applied Algebra and Geometry, 6(4):618--649, 2022. URL https://doi.org/10.1137/21M1413699
  44. [45] Andrei Balakin, Shelby Cox, Georg Loho, and Bernd Sturmfels. Maxout polytopes, 2025. URL https://arxiv.org/abs/2509.21286
  45. [46] Christoph Hertrich, Amitabh Basu, Marco Di Summa, and Martin Skutella. Towards lower bounds on the depth of ReLU neural networks. SIAM Journal on Discrete Mathematics, 37(2):997--1029, 2023. doi:10.1137/22M1489332. URL https://doi.org/10.1137/22M1489332
  46. [47] Christian Alexander Haase, Christoph Hertrich, and Georg Loho. Lower bounds on the depth of integral ReLU neural networks via lattice polytopes. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=2mvALOAWaxY
  47. [48] Gennadiy Averkov, Christopher Hojny, and Maximilian Merkert. On the expressiveness of rational ReLU neural networks with bounded depth. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=uREg3OHjLL
  48. [49] Moritz Leo Grillo, Christoph Hertrich, and Georg Loho. Depth-bounds for neural networks via the braid arrangement. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=XO9fhSZkBh
  49. [50] Liwen Zhang, Gregory Naitzat, and Lek-Heng Lim. Tropical geometry of deep neural networks. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 5824--5832, Stockholmsmässan, Stockholm Sweden, 10--15 Jul 2018. PMLR. URL http://proceedings.mlr.press/v80/zhang18i.html
  50. [51] Vasileios Charisopoulos and Petros Maragos. A tropical approach to neural networks with piecewise linear activations, 2018. URL https://arxiv.org/abs/1805.08749
  51. [52] Marie-Charlotte Brandenburg, Georg Loho, and Guido Montúfar. The real tropical geometry of neural networks for binary classification. Transactions on Machine Learning Research, 2024. ISSN 2835-8856. URL https://openreview.net/forum?id=I7JWf8XA2w
  52. [53] Joey Huchette, Gonzalo Muñoz, Thiago Serra, and Calvin Tsay. When deep learning meets polyhedral theory: A survey, 2023
  53. [54] Ngoc M. Tran and Jidong Wang. Minimal representations of tropical rational functions. Algebraic Statistics, 15(1):27--59, May 2024. ISSN 2693-2997. doi:10.2140/astat.2024.15.27. URL http://dx.doi.org/10.2140/astat.2024.15.27
  54. [55] Marie-Charlotte Brandenburg, Moritz Leo Grillo, and Christoph Hertrich. Decomposition polyhedra of piecewise linear functions. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=vVCHWVBsLH
  55. [56] D. Maclagan and B. Sturmfels. Introduction to Tropical Geometry. Graduate Studies in Mathematics. 2015
  56. [57] J. Elisenda Grigsby and Kathryn Lindsey. On transversality of bent hyperplane arrangements and the topological expressiveness of ReLU neural networks. SIAM Journal on Applied Algebra and Geometry, 6(2):216--242, 2022. doi:10.1137/20M1368902. URL https://doi.org/10.1137/20M1368902
  57. [58] Marissa Masden. Algorithmic determination of the combinatorial structure of the linear regions of ReLU neural networks. SIAM Journal on Applied Algebra and Geometry, 9(2):374--404, 2025. doi:10.1137/24M1646996. URL https://doi.org/10.1137/24M1646996
  58. [59] Itay Safran. A depth hierarchy for computing the maximum in ReLU networks via extremal graph theory, 2026. URL https://arxiv.org/abs/2601.01417