Understanding training and generalization in deep learning by Fourier analysis

Zhiqin John Xu

arXiv: 1808.04295 · v4 · pith:VT47ORQQnew · submitted 2018-08-13 · cs.LG · cs.AI · math.OC · math.ST · stat.ML · stat.TH

classification cs.LG · cs.AI · math.OC · math.ST · stat.ML · stat.TH

keywords training · generalization · ability · analysis · deep · DNNs · Fourier · function

Background: It remains an open problem to explain theoretically why Deep Neural Networks (DNNs), which have many more parameters than training samples and are trained by (stochastic) gradient-based methods, often achieve remarkably low generalization error. Contribution: We study DNN training by Fourier analysis. Our theoretical framework explains two observations: i) DNNs trained with (stochastic) gradient-based methods often give the low-frequency components of the target function higher priority during training; ii) small initialization leads to good generalization ability while preserving the DNN's ability to fit any function. These results are further confirmed by experiments in which DNNs fit natural images, one-dimensional functions, and the MNIST dataset.
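The first claim in the abstract can be probed numerically. The following is a minimal NumPy sketch, not the paper's experimental setup: the two-frequency target, the network width, the learning rate, and the helper names `relative_freq_error` and `train_and_track` are all illustrative assumptions. It fits a 1-D target with a small tanh network under small initialization and records the relative DFT error at the low and the high frequency.

```python
import numpy as np

def relative_freq_error(target, output, freqs):
    """Relative DFT error |F[output] - F[target]| / |F[target]| at the given indices."""
    ft = np.fft.rfft(target)
    fo = np.fft.rfft(output)
    return np.abs(fo[freqs] - ft[freqs]) / np.abs(ft[freqs])

def train_and_track(steps=2000, width=32, lr=0.01, init_scale=0.1, seed=0):
    """Full-batch gradient descent on a one-hidden-layer tanh network.

    All hyperparameters are assumed values chosen for illustration only.
    Returns an array of shape (num_checkpoints, 2): the relative error at
    DFT index 1 (low frequency) and index 5 (high frequency).
    """
    rng = np.random.default_rng(seed)
    n = 128
    # Periodic grid, so sin(x) and sin(5x) land exactly on DFT indices 1 and 5.
    x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
    y = np.sin(x) + 0.5 * np.sin(5 * x)

    # Small initialization: weights drawn with standard deviation init_scale.
    W1 = init_scale * rng.standard_normal((1, width))
    b1 = init_scale * rng.standard_normal(width)
    W2 = init_scale * rng.standard_normal((width, 1))
    b2 = np.zeros(1)

    history = []
    for step in range(steps):
        h = np.tanh(x @ W1 + b1)        # hidden activations, shape (n, width)
        out = h @ W2 + b2               # network output, shape (n, 1)
        if step % 100 == 0:
            history.append(relative_freq_error(y.ravel(), out.ravel(), [1, 5]))

        # Backpropagation for the loss 0.5 * mean((out - y)**2).
        g_out = (out - y) / n
        gW2 = h.T @ g_out
        gb2 = g_out.sum(axis=0)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ g_h
        gb1 = g_h.sum(axis=0)

        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return np.array(history)
```

Calling `train_and_track()` returns one row per checkpoint and one column per frequency. If low frequencies are indeed fitted first, the first column should shrink earlier than the second, though this sketch makes no guarantee for every seed or hyperparameter choice.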

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Spectral Energy Centroid: a Metric for Improving Performance and Analyzing Spectral Bias in Implicit Neural Representations
cs.LG · 2026-05 · unverdicted · novelty 7.0
Spectral Energy Centroid is a new metric that quantifies signal frequency and INR spectral bias, supporting better hyperparameter selection and cross-architecture analysis.

Deep sequence models tend to memorize geometrically; it is unclear why
cs.LG · 2025-10 · unverdicted · novelty 6.0
Deep sequence models develop geometric memory in embeddings that encodes novel global relationships, transforming l-fold composition tasks into 1-step navigation via a natural spectral bias connected to Node2Vec.

Theory of the Frequency Principle for General Deep Neural Networks
cs.LG · 2019-06 · unverdicted · novelty 6.0
The paper establishes rigorous theorems proving the Frequency Principle holds for general deep neural networks at initial, intermediate, and final training stages.

Frequency-adaptive tensor neural networks for high-dimensional multi-scale problems
cs.LG · 2025-08 · unverdicted · novelty 5.0
Frequency-adaptive tensor neural networks are proposed to overcome the frequency principle in TNNs for high-dimensional multi-scale problems by incorporating random Fourier features and 1D DFT on component functions.