arxiv: 2410.00357 · v2 · pith:VIBU36KPnew · submitted 2024-10-01 · 💻 cs.LG · stat.ML

Neural Scaling Laws of Deep ReLU and Deep Operator Network: A Theoretical Study

keywords deep · laws · neural · scaling · operator · functions · networks · theoretical

Neural scaling laws play a pivotal role in the performance of deep neural networks and have been observed in a wide range of tasks. However, a complete theoretical framework for understanding these scaling laws remains underdeveloped. In this paper, we explore the neural scaling laws for deep operator networks, which involve learning mappings between function spaces, with a focus on the Chen and Chen style architecture. These approaches, which include the popular Deep Operator Network (DeepONet), approximate the output functions using a linear combination of learnable basis functions and coefficients that depend on the input functions. We establish a theoretical framework to quantify the neural scaling laws by analyzing the networks' approximation and generalization errors. We articulate the relationship between the approximation and generalization errors of deep operator networks and key factors such as network model size and training data size. Moreover, we address cases where input functions exhibit low-dimensional structures, allowing us to derive tighter error bounds. These results also hold for deep ReLU networks and other similar structures. Our results offer a partial explanation of the neural scaling laws in operator learning and provide a theoretical foundation for their applications.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. MVNN: A Measure-Valued Neural Network for Learning McKean-Vlasov Dynamics from Particle Data

    math.NA 2026-04 unverdicted novelty 7.0

    MVNN learns measure-dependent drift terms in McKean-Vlasov equations from particle data using an embedding network, with proofs of well-posedness, propagation of chaos, and universal approximation under low-dimensiona...