A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.
arXiv preprint arXiv:2510.04780 , year =
3 Pith papers cite this work. Polarity classification is still indexing.
fields
stat.ML 3years
2026 3representative citing papers
Neural networks prioritize amplitude over phase in Fourier space during training on translation-invariant data; power-law spectra accelerate phase learning despite not aiding classification.
A sparse-activation model predicts double-descent loss with distinct under- and over-parameterized scaling exponents set by sparsity, plus a compute-optimal frontier favoring dataset growth.
citing papers explorer
-
Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model
A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.
-
A Fourier perspective on the learning dynamics of neural networks: from sample complexities to mechanistic insights
Neural networks prioritize amplitude over phase in Fourier space during training on translation-invariant data; power-law spectra accelerate phase learning despite not aiding classification.
-
Asymmetric Scaling Laws from Sparse Features
A sparse-activation model predicts double-descent loss with distinct under- and over-parameterized scaling exponents set by sparsity, plus a compute-optimal frontier favoring dataset growth.