Representation Benefits of Deep Feedforward Networks

Matus Telgarsky

classification cs.LG cs.NE

keywords networks, nodes, deep, error, feedforward, network, achieves, benefits

This note provides a family of classification problems, indexed by a positive integer $k$, where all shallow networks with fewer than exponentially (in $k$) many nodes exhibit error at least $1/6$, whereas a deep network with 2 nodes in each of $2k$ layers achieves zero error, as does a recurrent network with 3 distinct nodes iterated $k$ times. The proof is elementary, and the networks are standard feedforward networks with ReLU (Rectified Linear Unit) nonlinearities.
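
The depth advantage described in the abstract can be illustrated with a small sketch. The abstract does not spell out the construction, so the following is an assumption: a "mirror" (tent) map built from ReLUs, composed $k$ times, evaluated on $2^k$ evenly spaced points with alternating labels. The names `mirror`, `deep_net`, `alternating_points`, and `classification_error` are hypothetical and chosen only for illustration. Each mirror step uses two ReLU layers, so $k$ steps give $2k$ layers; reusing the same step in a loop also mirrors the recurrent variant mentioned in the abstract.

```python
def relu(z):
    """Rectified Linear Unit."""
    return max(0.0, z)


def mirror(x):
    """Tent map on [0, 1] written with ReLUs (assumed construction).

    First layer: two ReLU nodes, relu(x) and relu(x - 0.5).
    Second layer: one ReLU applied to their linear combination.
    Result: 2x on [0, 1/2] and 2(1 - x) on [1/2, 1].
    """
    return relu(2.0 * relu(x) - 4.0 * relu(x - 0.5))


def deep_net(x, k):
    """Compose the mirror map k times (2k ReLU layers in total)."""
    for _ in range(k):
        x = mirror(x)
    return x


def alternating_points(k):
    """Assumed data set: points i / 2^k for i = 0..2^k - 1, label = parity of i."""
    n = 2 ** k
    return [(i / n, i % 2) for i in range(n)]


def classification_error(k):
    """Fraction of points misclassified when thresholding deep_net at 1/2."""
    points = alternating_points(k)
    wrong = sum(
        1 for x, y in points
        if (1 if deep_net(x, k) >= 0.5 else 0) != y
    )
    return wrong / len(points)
```

Under these assumptions, `classification_error(k)` evaluates to `0.0`, because every point $i/2^k$ is a dyadic rational and each mirror step maps it exactly to a coarser dyadic grid while preserving the parity of $i$. The sketch only demonstrates the deep network's side of the claim; it does not show the lower bound for shallow networks.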

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ReLU Networks for Exact Generation of Similar Graphs
cs.LG 2026-04 unverdicted novelty 7.0

Constant-depth ReLU networks of size O(n²d) exist that deterministically generate graphs within edit distance d from any given n-vertex input graph.

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
cs.LG 2024-01 unverdicted novelty 6.0

SPIN lets weak LLMs become strong by self-generating training data from previous model versions and training to prefer human-annotated responses over its own outputs, outperforming DPO even with extra GPT-4 data on be...