
arxiv: 1807.03973 · v2 · pith:7QIVXZ3Znew · submitted 2018-07-11 · 🧮 math.NA · cs.NA

ReLU Deep Neural Networks and Linear Finite Elements

classification 🧮 math.NA cs.NA
keywords relu, cpwl, linear, functions, representation, finite, function, hidden

In this paper, we investigate the relationship between deep neural networks (DNNs) with the rectified linear unit (ReLU) as the activation function and continuous piecewise linear (CPWL) functions, especially CPWL functions from the simplicial linear finite element method (FEM). We first consider the special case of FEM. By exploring the DNN representation of its nodal basis functions, we present a ReLU DNN representation of CPWL functions in FEM. We theoretically establish that at least $2$ hidden layers are needed in a ReLU DNN to represent any linear finite element function in $\Omega \subseteq \mathbb{R}^d$ when $d\ge2$. Consequently, for $d=2,3$, which are often encountered in scientific and engineering computing, two hidden layers are necessary and sufficient for any CPWL function to be represented by a ReLU DNN. We then give a detailed account of how a general CPWL function in $\mathbb{R}^d$ can be represented by a ReLU DNN with at most $\lceil\log_2(d+1)\rceil$ hidden layers, and we also estimate the number of neurons needed in such a representation. Furthermore, using the relationship between DNN and FEM, we theoretically argue that a special class of DNN models with low bit-width is still expected to have adequate representation power in applications. Finally, as a proof of concept, we present numerical results for using ReLU DNNs to solve a two-point boundary problem, demonstrating the potential of applying DNNs to the numerical solution of partial differential equations.
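To make the link between nodal basis functions and ReLU networks concrete, here is a minimal sketch of the one-dimensional case only. It is not taken from the paper: the function names `relu`, `hat`, and `relu_max` are hypothetical, and the construction is the standard identity that writes a 1D hat function on nodes $a < b < c$ as a weighted sum of three ReLU units (one hidden layer). The second helper shows the identity $\max(u, v) = \mathrm{ReLU}(u - v) + \mathrm{ReLU}(v) - \mathrm{ReLU}(-v)$, which is the kind of building block that pairwise max/min reductions rely on.

```python
def relu(t: float) -> float:
    """Rectified linear unit: max(t, 0)."""
    return max(t, 0.0)


def hat(x: float, a: float, b: float, c: float) -> float:
    """1D linear finite element nodal basis (hat) function.

    Equals 1 at the node b, 0 outside (a, c), and is linear on
    [a, b] and [b, c]. Written as a one-hidden-layer ReLU network
    with three neurons (an illustrative construction, assuming a < b < c).
    """
    left_slope = 1.0 / (b - a)    # slope on [a, b]
    right_slope = 1.0 / (c - b)   # magnitude of the slope on [b, c]
    return (
        left_slope * relu(x - a)
        - (left_slope + right_slope) * relu(x - b)
        + right_slope * relu(x - c)
    )


def relu_max(u: float, v: float) -> float:
    """max(u, v) using only ReLU units in a single hidden layer.

    Uses max(u, v) = v + ReLU(u - v) together with
    v = ReLU(v) - ReLU(-v).
    """
    return relu(u - v) + relu(v) - relu(-v)


if __name__ == "__main__":
    # Sample the hat function on the nodes 0, 0.5, 1.
    for x in (-1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0):
        print(x, hat(x, 0.0, 0.5, 1.0))
    print(relu_max(3.0, -2.0), relu_max(-1.0, 4.0))
```

In one dimension a single hidden layer suffices, which is consistent with the abstract's lower bound of two hidden layers applying only when $d\ge2$.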



Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. KAN: Kolmogorov-Arnold Networks

    cs.LG 2024-04 conditional novelty 8.0

    KANs with learnable univariate spline activations on edges achieve better accuracy than MLPs with fewer parameters, faster scaling, and direct visualization for scientific discovery.

  2. Theory of the Frequency Principle for General Deep Neural Networks

    cs.LG 2019-06 unverdicted novelty 6.0

    The paper establishes rigorous theorems proving the Frequency Principle holds for general deep neural networks at initial, intermediate, and final training stages.

  3. Wildfire spread forecasting with Deep Learning

    cs.LG 2025-05 conditional novelty 4.0

    A deep learning framework forecasts final wildfire burned area extent from ignition-time data, with an ablation showing that a four-day pre- to five-day post-ignition temporal window improves F1 and IoU by nearly 5% o...

  4. Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics

    cs.LG 2022-12 unverdicted novelty 2.0

    A comprehensive review of deep learning techniques for computational mechanics, including LSTM for constitutive modeling, PINNs for PDE solving, optimizers, and kernel methods.