arxiv: 2404.06106 · v3 · pith:5DCI66TGnew · submitted 2024-04-09 · 💻 cs.LG

Unifying Low Dimensional Spectra in Deep Learning

classification 💻 cs.LG
keywords deep learning, dimensional, analytic, canonical, eigenspectra, eigenvalues, eigenvectors

Low dimensional structures appear ubiquitously in the eigenspectra of deep learning matrices in classification networks trained in the overparameterized regime. While theoretical advances have aimed to explain this phenomenology, they typically succeed only in capturing subsets of the full behavior or rely on assumptions that cannot hold in practice. In this work, we provide an analytic explanation for the bulk plus outlier structure of several canonical deep learning matrices, including the Hessian, gradients, and weights. We achieve this using unconstrained feature models (UFMs), a now-common tool for studying the emergence of deep neural collapse (DNC). We show that DNC is the source of these low dimensional eigenspectra: in each case, the eigenvalues and eigenvectors can be constructed from feature means, the characterizing objects of DNC. This provides a unifying analytic explanation for a wide range of spectral phenomena in deep learning and goes beyond empirical characterizations, which typically focus on eigenvalues, by providing a detailed analysis of eigenvectors. We prove that our results hold for both linear and ReLU networks and provide numerical validation both in the modeling context and in standard deep-network architectures on canonical datasets.
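To make the "bulk plus outlier" picture concrete, the following is a minimal NumPy sketch. It is not the paper's construction. It assumes idealized neural collapse: class feature means that form a simplex equiangular tight frame, plus small isotropic within-class noise. The function names (`simplex_etf`, `feature_second_moment_spectrum`) and all parameter values are hypothetical and chosen only for illustration. The sketch computes the eigenvalues of the feature second-moment matrix, which under these assumptions should show K-1 large eigenvalues determined by the class means and a small noise-driven bulk.

```python
import numpy as np


def simplex_etf(num_classes, dim, rng):
    """Return a (num_classes, dim) array of unit-norm class means forming a simplex ETF.

    Assumes dim >= num_classes so that the reduced QR factor has orthonormal columns.
    """
    k = num_classes
    # q has shape (dim, k) with orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    # Centering projector removes the global mean across classes.
    centering = np.eye(k) - np.ones((k, k)) / k
    # Scaling gives unit norms and pairwise inner products of -1/(k-1).
    return np.sqrt(k / (k - 1)) * centering @ q.T


def feature_second_moment_spectrum(num_classes=5, dim=50, per_class=200,
                                   noise=0.05, seed=0):
    """Eigenvalues (descending) of the second-moment matrix of collapsed features."""
    rng = np.random.default_rng(seed)
    means = simplex_etf(num_classes, dim, rng)
    # Each sample is its class mean plus small isotropic noise (assumed model).
    features = np.repeat(means, per_class, axis=0)
    features = features + noise * rng.standard_normal(features.shape)
    second_moment = features.T @ features / features.shape[0]
    # eigvalsh returns ascending order; reverse so outliers come first.
    return np.linalg.eigvalsh(second_moment)[::-1]


eigvals = feature_second_moment_spectrum()
print("top eigenvalues:", eigvals[:6])
```

In the noise-free limit of this toy model, the K-1 outlier eigenvalues each equal 1/(K-1) and their eigenvectors span the class means, which mirrors, in a much simpler setting, the abstract's claim that the spectra can be built from feature means.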



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Implicit Bias of Depth: From Neural Collapse to Softmax Codes

    cs.LG 2026-05 unverdicted novelty 7.0

    Depth induces an implicit low-rank bias in deep unconstrained feature models trained with unregularized multiclass cross-entropy, promoting softmax codes over neural collapse via more efficient norm propagation.