Advances in Neural Information Processing Systems

Wide neural networks of any depth evolve as linear models under gradient descent

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Fixed-order PCA: Theory for Overestimated Factor Models

math.ST · 2026-05-18 · unverdicted · novelty 7.0

Establishes asymptotic consistency of factor estimates and √T-normality in factor-augmented regressions for fixed R ≥ r using anisotropic local laws from random matrix theory.

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model

stat.ML · 2026-05-14 · accept · novelty 7.0

A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.

Large Dimensional Kernel Ridge Regression: Extending to Product Kernels

stat.ML · 2026-05-14 · unverdicted · novelty 7.0

Extends high-dimensional KRR to product kernels, proving convergence rates that recover minimax optimality for source condition s ≤ 1, saturation for s > 1, and multiple-descent phenomena with respect to sample size n.

Wahkon: A Statistically Principled Deep RKHS Superposition Network

stat.ME · 2026-05-13 · unverdicted · novelty 6.0

Wahkon unifies Kolmogorov superposition with RKHS regularization to produce a deep network whose penalized estimator is exactly the MAP under a hierarchical GP prior and achieves minimax-optimal rates.

AdamO: A Collapse-Suppressed Optimizer for Offline RL

cs.LG · 2026-05-03 · unverdicted · novelty 6.0

AdamO modifies Adam with an orthogonality correction to ensure the spectral radius of the TD update operator stays below one, providing a theoretical stability guarantee for offline RL.

There Will Be a Scientific Theory of Deep Learning

stat.ML · 2026-04-23 · unverdicted · novelty 2.0

A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universal behaviors.

citing papers explorer

Showing 6 of 6 citing papers.

Fixed-order PCA: Theory for Overestimated Factor Models
math.ST · 2026-05-18 · unverdicted · none · ref 126

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model
stat.ML · 2026-05-14 · accept · none · ref 153

Large Dimensional Kernel Ridge Regression: Extending to Product Kernels
stat.ML · 2026-05-14 · unverdicted · none · ref 91

Wahkon: A Statistically Principled Deep RKHS Superposition Network
stat.ME · 2026-05-13 · unverdicted · none · ref 39

AdamO: A Collapse-Suppressed Optimizer for Offline RL
cs.LG · 2026-05-03 · unverdicted · none · ref 63

There Will Be a Scientific Theory of Deep Learning
stat.ML · 2026-04-23 · unverdicted · none · ref 173
