pith. machine review for the scientific record. sign in

arxiv: 1710.11278 · v2 · submitted 2017-10-31 · 📊 stat.ML · cs.CC· cs.LG· math.CO· math.ST· stat.TH

Recognition: unknown

Approximating Continuous Functions by ReLU Nets of Minimal Width

Authors on Pith no claims yet
classification 📊 stat.ML cs.CCcs.LGmath.COmath.STstat.TH
keywords netsreluwidthcontinuousdepthfunctionhiddenminimal
0
0 comments X
read the original abstract

This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with ReLU activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of $d_{in}$ variables arbitrarily well? It turns out that this minimal width is exactly equal to $d_{in}+1.$ That is, if all the hidden layer widths are bounded by $d_{in}$, then even in the infinite depth limit, ReLU nets can only express a very limited class of functions, and, on the other hand, any continuous function on the $d_{in}$-dimensional unit cube can be approximated to arbitrary precision by ReLU nets in which all hidden layers have width exactly $d_{in}+1.$ Our construction in fact shows that any continuous function $f:[0,1]^{d_{in}}\to\mathbb R^{d_{out}}$ can be approximated by a net of width $d_{in}+d_{out}$. We obtain quantitative depth estimates for such an approximation in terms of the modulus of continuity of $f$.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Approximation of Maximally Monotone Operators : A Graph Convergence Perspective

    cs.LG 2026-05 unverdicted novelty 7.0

    Any maximally monotone operator can be approximated in local graph convergence by continuous encoder-decoder networks, with structure-preserving versions that retain maximal monotonicity via resolvent parameterizations.

  2. Structural Correspondence and Universal Approximation in Diagonal plus Low-Rank Neural Networks

    cs.LG 2026-05 unverdicted novelty 6.0

    Diagonal plus Low-Rank (DLoR) neural networks achieve universal approximation for general activations by additive or multiplicative decompositions of full-rank transformations.

  3. Man, Machine, and Mathematics

    math.OC 2026-04 unverdicted novelty 5.0

    A high-level outline is given for a unified theory that reduces learning to a small set of ideas from dynamical systems, geometry, and physics via definitions of solvable problems and parametrized methods.

  4. Relocation of compact sets in $\mathbb{R}^n$ by diffeomorphisms and linear separability of datasets in $\mathbb{R}^n$

    cs.LG 2026-04 unverdicted novelty 5.0

    Any finite number of compact sets in R^n can be relocated to arbitrary targets by diffeomorphisms of R^n and embedded into R^{n+1} to become linearly separable, allowing width-n DNNs with Leaky-ReLU, ELU or SELU activ...