Universal Approximation with Deep Narrow Networks

Patrick Kidger; Terry Lyons

arxiv: 1905.08539 · v2 · pith:D6DKGRK7new · submitted 2019-05-21 · 💻 cs.LG · math.CA · stat.ML

keywords: networks · activation · width · arbitrary · consider · depth · function · functions

The classical Universal Approximation Theorem holds for neural networks of arbitrary width and bounded depth. Here we consider the natural 'dual' scenario for networks of bounded width and arbitrary depth. Precisely, let $n$ be the number of input neurons, $m$ be the number of output neurons, and let $\rho$ be any nonaffine continuous function with a continuous nonzero derivative at some point. Then we show that the class of neural networks of arbitrary depth, width $n + m + 2$, and activation function $\rho$ is dense in $C(K; \mathbb{R}^m)$ for $K \subseteq \mathbb{R}^n$ with $K$ compact. This covers every activation function possible to use in practice, and also includes polynomial activation functions, which is unlike the classical version of the theorem and provides a qualitative difference between deep narrow networks and shallow wide networks. We then consider several extensions of this result. In particular, we consider nowhere differentiable activation functions, density in noncompact domains with respect to the $L^p$-norm, and how the width may be reduced to just $n + m + 1$ for 'most' activation functions.
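To make the architecture in the abstract concrete, here is a minimal sketch of a network whose every hidden layer has width $n + m + 2$ and whose depth is a free parameter. It only illustrates the shape of the function class. It is not the construction used in the paper's proof, and the random weights approximate nothing in particular. The names `make_deep_narrow_net`, `forward`, and `hidden_widths` are hypothetical, and `tanh` is assumed as one example of a nonaffine continuous activation with a continuous nonzero derivative at some point.

```python
import math
import random


def make_deep_narrow_net(n, m, depth, seed=0):
    """Random weights for a net with `depth` hidden layers of width n + m + 2.

    Illustrative only: the initialisation scheme is an assumption,
    not something specified by the abstract.
    """
    rng = random.Random(seed)
    width = n + m + 2  # hidden width from the abstract's main result
    sizes = [n] + [width] * depth + [m]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def forward(layers, x, rho=math.tanh):
    """Apply each affine layer, with activation rho on hidden layers only."""
    h = list(x)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < last:  # the output layer stays affine
            h = [rho(v) for v in h]
    return h


def hidden_widths(layers):
    """Widths of the hidden layers (every layer except the output layer)."""
    return [len(weights) for weights, _ in layers[:-1]]


if __name__ == "__main__":
    net = make_deep_narrow_net(n=3, m=2, depth=10)
    print(hidden_widths(net))              # ten hidden layers, each of width 3 + 2 + 2
    print(forward(net, [0.1, -0.2, 0.3]))  # a vector in R^2
```

Increasing `depth` while holding the width fixed moves through the class of networks that the abstract states is dense in $C(K; \mathbb{R}^m)$. The classical theorem instead fixes the depth and lets the width grow.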

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Model-based Dynamic 3D MRI Reconstructions using Neural Fields and Tensor Product Expansions
eess.IV · 2026-05 · unverdicted · novelty 7.0
A model-based dynamic MRI reconstruction method using tensor products of neural fields for continuous representations outperforms prior methods at acceleration factors up to 16 while preserving motion and structure.

Measurement-based quantum machine learning
quant-ph · 2024-05 · unverdicted · novelty 7.0
The authors introduce MuTA as a universal quantum neural network for MBQC and numerically demonstrate its ability to learn gates, classify quantum states, and process data under noise, including photonic hardware constraints.

Man, Machine, and Mathematics
math.OC · 2026-04 · unverdicted · novelty 5.0
A high-level outline is given for a unified theory that reduces learning to a small set of ideas from dynamical systems, geometry, and physics via definitions of solvable problems and parametrized methods.