pith. sign in

arxiv: 2606.07728 · v1 · pith:C3GQ42LJnew · submitted 2026-06-05 · 💻 cs.LG

Characterizing the Discrete Geometry of ReLU Networks

Pith reviewed 2026-06-27 22:26 UTC · model grok-4.3

classification 💻 cs.LG
keywords ReLU networkslinear regionsconnectivity graphpiecewise linear functionsdiscrete geometryneural network geometry
0
0 comments X

The pith

ReLU networks induce region connectivity graphs with average degree at most twice the input dimension.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

ReLU networks partition input space into polyhedral linear regions that connect along shared faces. The paper models these connections as an undirected graph and proves bounds on its properties that hold for every fully connected ReLU network. Average degree is at most twice the input dimension no matter the width or depth, while diameter remains bounded by a constant independent of input dimension. These limits apply even though the total number of regions can grow exponentially with dimension. The claims are checked on both synthetic data and networks trained on real tasks.

Core claim

For any fully-connected ReLU network the connectivity graph of its linear regions has average degree upper-bounded by twice the input dimension regardless of width and depth, and its diameter has an upper bound independent of input dimension.

What carries the argument

The connectivity graph whose nodes are the linear regions and whose edges join regions that share a face.

If this is right

  • The region graph stays sparse even as network size increases.
  • Any two regions are separated by a path whose length is bounded independently of input dimension.
  • Exponential growth in the number of regions does not produce correspondingly dense local connectivity.
  • The geometry of the arrangement is constrained by input dimension alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same graph view may reveal similar dimension-dependent bounds for other piecewise-linear activations.
  • Optimization or sampling procedures that move between regions may inherit short-path guarantees from the diameter bound.
  • The low average degree suggests that most regions touch only a small number of neighbors even in high dimension.

Load-bearing premise

Linear regions form a complex that partitions the input space and regions connect only through shared faces.

What would settle it

A fully-connected ReLU network whose region connectivity graph has average degree strictly larger than twice the input dimension, or whose diameter grows with input dimension.

Figures

Figures reproduced from arXiv: 2606.07728 by Blake B. Gaines, Jinbo Bi.

Figure 1
Figure 1. Figure 1: (a) An example ReLU network with a 2-dimensional input. (b) The corresponding polyhe [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: A polyhedral complex [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: (a) The complex in Fig. 2 with BH 4 removed. (b) BH 4 is added and cells are categorized [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: (Left) Distributions for the number of faces of polyhedra in ReLU networks trained on [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Connectivity graph diameter vs theoretical upper bound. At each distinct value of the [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Histograms of polyhedron neighbor counts (i.e., the number of polyhedra that have a [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Histograms of polyhedron neighbor counts, separated into those that contain data points [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: ReLU complex illustrating Theorem 3.5 with [PITH_FULL_IMAGE:figures/full_fig_p019_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Alternate categorizations for Fig. 3 with different choices of [PITH_FULL_IMAGE:figures/full_fig_p021_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Connectivity graph modification when adding another final-layer neuron to a neural [PITH_FULL_IMAGE:figures/full_fig_p023_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: The same plot as Fig. 5 but with the lower bound from Theorem 3.8 on the [PITH_FULL_IMAGE:figures/full_fig_p024_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Distributions for the number of faces of polyhedra in the decompositions of trained ReLU [PITH_FULL_IMAGE:figures/full_fig_p025_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Histograms of bounded polyhedron neighbor counts for all d-cells in the MNIST net￾work’s complex, colored according to volume (left) and inradius (right). 5 10 15 1 10 100 1000 10k 5 10 15 5 10 15 5 10 15 1 10 100 1000 10k 1 10 100 1000 10k 1 10 100 1000 10k 1 5 25 125 625 3125 Number of Points 12 Epochs 13 Epochs 14 Epochs 15 Epochs 8 Epochs 9 Epochs 10 Epochs 11 Epochs 4 Epochs 5 Epochs 6 Epochs 7 Epoch… view at source ↗
Figure 14
Figure 14. Figure 14: Face count distributions for a network trained on the MNIST dataset. These are con [PITH_FULL_IMAGE:figures/full_fig_p027_14.png] view at source ↗
read the original abstract

It is well established that ReLU networks define continuous piecewise-linear functions, and that their linear regions are polyhedra in the input space. These regions form a complex that fully partitions the input space. The way these regions fit together is fundamental to the behavior of the network, as nonlinearities occur only at the boundaries where these regions connect. However, relatively little is known about the geometry of these complexes beyond bounds on the total number of regions, and calculating the complex exactly is intractable for most networks. In this work, we prove new theoretical results about these complexes that hold for all fully-connected ReLU networks, specifically about their connectivity graphs in which nodes correspond to regions and edges exist between each pair of regions connected by a face. We find that the average degree of this graph is upper bounded by twice the input dimension regardless of the width and depth of the network, and that the diameter of this graph has an upper bound that does not depend on input dimension, despite the number of regions increasing exponentially with input dimension. We corroborate our findings through experiments with networks trained on both synthetic and real-world data, which provide additional insight into the geometry of ReLU networks. Code to reproduce our results can be found at https://github.com/bl-ake/ICLR-2026.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proves that for any fully-connected ReLU network the dual graph on its linear regions (nodes = regions, edges = shared faces) has average degree at most 2d where d is input dimension, independent of width and depth; it also proves that the diameter of this graph admits an upper bound independent of d, even though the number of regions grows exponentially in d. Both results are stated to hold for all such networks and are corroborated by experiments on synthetic and real data.

Significance. If the derivations hold, the results give parameter-free, architecture-independent bounds on the connectivity structure of the linear-region complex. The average-degree bound follows from elementary double-counting and the diameter bound from the fact that paths can cross each distinct hyperplane at most once; both are falsifiable predictions that can be checked on any ReLU network. The experimental section supplies concrete illustrations on trained models. These are genuine strengths of the manuscript.

minor comments (3)
  1. [Introduction / Related Work] The abstract states that the results are 'new theoretical results'; the introduction or related-work section should explicitly contrast the stated bounds with the classical double-counting argument for hyperplane arrangements (e.g., cite standard references on arrangements) so that the precise incremental contribution is clear.
  2. [Experiments] The experimental section should report the precise criterion used to enumerate regions and any data-exclusion rules; without these details it is difficult to reproduce the reported average-degree and diameter statistics.
  3. [Preliminaries] Notation for the dual graph (nodes, edges, faces) is introduced in the abstract but should be formalized once in a dedicated preliminary section with a small illustrative figure.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, the assessment of significance, and the recommendation of minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper's central claims derive bounds on the dual graph of ReLU linear regions directly from standard polyhedral combinatorics (double-counting of codimension-1 faces, each shared by exactly two regions, and the fact that the arrangement is generated by finitely many hyperplanes). These steps invoke only well-established facts stated in the abstract as background and do not reduce any prediction or theorem to a fitted parameter, self-definition, or load-bearing self-citation. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that ReLU networks induce a polyhedral complex partitioning input space; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption ReLU networks define continuous piecewise-linear functions whose linear regions are polyhedra that form a complex fully partitioning the input space
    Explicitly stated as well-established in the abstract

pith-pipeline@v0.9.1-grok · 5750 in / 1193 out tokens · 24786 ms · 2026-06-27T22:26:01.997119+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

14 extracted references · 8 canonical work pages · 1 internal anchor

  1. [1]

    Randall Balestriero, Romain Cosentino, and Sarath Shekkizhar

    URLhttps://api.semanticscholar.org/CorpusID:160010022. Randall Balestriero, Romain Cosentino, and Sarath Shekkizhar. Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation. InForty-first International Conference on Machine Learning, 2024. URLhttps://openreview.net/forum?id= glfcwSsks8. Arturs Berzins. Polyhedral Complex...

  2. [2]

    URLhttps://ojs.aaai.org/index.php/AAA I/article/view/5729

    doi: 10.1609/aaai.v34i04.5729. URLhttps://ojs.aaai.org/index.php/AAA I/article/view/5729. GSCC: 0000170 102 citations (Semantic Scholar/DOI) [2024-05- 30] 60 citations (Crossref) [2024-05-30]. R. C. Buck. Partition of Space.The American Mathematical Monthly, 50(9):541–544, November

  3. [3]

    doi: 10.1080/00029890.1943.11991447

    ISSN 0002-9890, 1930-0972. doi: 10.1080/00029890.1943.11991447. URLhttps: //www.tandfonline.com/doi/full/10.1080/00029890.1943.11991447. Rudy R Bunel, Ilker Turkaslan, Philip Torr, Pushmeet Kohli, and Pawan K Mudigonda. A Unified View of Piecewise Linear Neural Network Verification. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and...

  4. [4]

    2012 , url =

    doi: 10.1109/MSP.2012.2211477. URLhttps://ieeexplore.ieee.org/docu ment/6296535. Sahil Rajesh Dhayalkar. The Geometry of ReLU Networks through the ReLU Transition Graph, May 2025. URLhttp://arxiv.org/abs/2505.11692. arXiv:2505.11692. Reinhard Diestel.Graph theory. Springer Berlin Heidelberg, New York, NY , 6 edition, 2017. ISBN 9783662536216. Feng-Lei Fan...

  5. [5]

    doi: 10.1016/0166-218X(91)90067-7

    ISSN 0166218X. doi: 10.1016/0166-218X(91)90067-7. URLhttps://linkinghub .elsevier.com/retrieve/pii/0166218X91900677. Alexis Goujon, Arian Etemadi, and Michael Unser. On the number of regions of piecewise linear neural networks.Journal of Computational and Applied Mathematics, 441:115667, May 2024. ISSN 0377-0427. doi: 10.1016/j.cam.2023.115667. URLhttps:/...

  6. [6]

    doi: 10.1287/opre.2019.1973

    ISSN 0030-364X, 1526-5463. doi: 10.1287/opre.2019.1973. URLhttps: //pubsonline.informs.org/doi/10.1287/opre.2019.1973. Joey Huchette, Gonzalo Mu˜noz, Thiago Serra, and Calvin Tsay. When Deep Learning Meets Poly- hedral Theory: A Survey, April 2023. URLhttps://optimization-online.org/?p =22840. GSCC: 0000025. Huma Jamil, Yajing Liu, Christina Cole, Nathani...

  7. [7]

    ISBN 978-989-758-584-5

    SciTePress, 2022. ISBN 978-989-758-584-5. doi: 10.5220/0011274000003277. GSCC: 0000001 Backup Publisher: INSTICC ISSN: 2184-9277. Paul Lezeau, Thomas Walker, Yueqi Cao, Shiv Bhatia, and Anthea Monod. Tropical Expressiv- ity of Neural Networks, October 2024. URLhttp://arxiv.org/abs/2405.20174. arXiv:2405.20174. Yajing Liu, Turgay Caglar, Christopher Peters...

  8. [8]

    On the number of response regions of deep feed forward networks with piece-wise linear activations

    ISSN 2470-6566. doi: 10.1137/24M1646996. URLhttps://epubs.siam.org/d oi/10.1137/24M1646996. Guido F Montufar, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. On the Number of Lin- ear Regions of Deep Neural Networks. InAdvances in Neural Information Processing Systems, volume 27. Curran Associates, Inc., 2014. URLhttps://papers.nips.cc/paper/542 2-on-th...

  9. [9]

    GSCC: 0000110

    URLhttps://proceedings.mlr.press/v119/rolnick20a.html. GSCC: 0000110. Thiago Serra, Christian Tjandraatmadja, and Srikumar Ramalingam. Bounding and Counting Lin- ear Regions of Deep Neural Networks. InProceedings of the 35th International Conference on Machine Learning, pp. 4558–4566. PMLR, July 2018. URLhttps://proceedings.mlr. press/v80/serra18b.html. H...

  10. [10]

    doi: 10.1007/978-3-030-30942-8 39

    ISBN 9783030309411 9783030309428. doi: 10.1007/978-3-030-30942-8 39. URL http://link.springer.com/10.1007/978-3-030-30942-8_39. Martin Trimmel, Henning Petzka, and Cristian Sminchisescu. TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks. InInternational Conference on Learning Representa- tions, 2021. URLhttps://openreview.net/forum?...

  11. [11]

    14 Published as a conference paper at ICLR 2026 Shaojie Xu, Joel Vaughan, Jie Chen, Aijun Zhang, and Agus Sudjianto

    URLhttps://openreview.net/forum?id=jhd62iKzRuj. 14 Published as a conference paper at ICLR 2026 Shaojie Xu, Joel Vaughan, Jie Chen, Aijun Zhang, and Agus Sudjianto. Traversing the Local Poly- topes of ReLU Neural Networks. InThe AAAI-22 Workshop on Adversarial Machine Learning and Beyond, 2022. URLhttps://openreview.net/forum?id=EQjwT2-Vaba. Xiaodong Yang...

  12. [12]

    URLhttps://proceedings.neurips.cc/paper _files/paper/2023/file/1d83ad88759cef8192451543e5d59bf6-Paper-C onference.pdf

    Curran Associates, Inc., 2023. URLhttps://proceedings.neurips.cc/paper _files/paper/2023/file/1d83ad88759cef8192451543e5d59bf6-Paper-C onference.pdf. GSCC: 0000000. 15 Published as a conference paper at ICLR 2026 APPENDIX A DETAILEDRELATEDWORK Several works have established lower and upper bounds on themaximumpossible number of poly- hedra (the regions in...

  13. [13]

    Note that this can be the empty set

    Closure under intersection, that is,c 1, c2 ∈ Cimpliesc 1 ∩c 2 ∈ C. Note that this can be the empty set

  14. [14]

    Closure under taking faces. That is, ifc∈ Candc ′ is a face ofc, thenc ′ ∈ C We restrict our discussion to ReLU networks with the following two properties: Genericity(Grigsby & Lindsey (2022), Definition 2.4): A hyperplane arrangement inR d is called genericif all sets ofkhyperplanes intersect in an affine space of dimensiond−kfor1≤k≤d. In a ReLU network,...