arXiv preprint arXiv:2510.04780

Kernel Ridge Regression under Power-Law Data: Spectrum · 2025 · arXiv 2510.04780

3 Pith papers cite this work.

representative citing papers

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model

stat.ML · 2026-05-14 · accept · novelty 7.0

A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.

A Fourier perspective on the learning dynamics of neural networks: from sample complexities to mechanistic insights

stat.ML · 2026-05-16 · conditional · novelty 6.0

Neural networks prioritize amplitude over phase in Fourier space during training on translation-invariant data; power-law spectra accelerate phase learning despite not aiding classification.

Asymmetric Scaling Laws from Sparse Features

stat.ML · 2026-05-22 · unverdicted · novelty 5.0

A sparse-activation model predicts double-descent loss with distinct under- and over-parameterized scaling exponents set by sparsity, plus a compute-optimal frontier favoring dataset growth.

citing papers explorer

Showing 3 of 3 citing papers.

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model stat.ML · 2026-05-14 · accept · none · ref 187
A Fourier perspective on the learning dynamics of neural networks: from sample complexities to mechanistic insights stat.ML · 2026-05-16 · conditional · none · ref 46
Asymmetric Scaling Laws from Sparse Features stat.ML · 2026-05-22 · unverdicted · none · ref 73
