arxiv: 2310.05351 · v3 · pith:FD4RKBQInew · submitted 2023-10-09 · 💻 cs.LG · cs.AI · cs.CV · cs.IT · math.IT

Generalized Neural Collapse for a Large Number of Classes

classification 💻 cs.LG cs.AI cs.CV cs.IT math.IT
keywords neural collapse classes feature generalized number deep dimension

Neural collapse provides an elegant mathematical characterization of learned last-layer representations (a.k.a. features) and classifier weights in deep classification models. Such results not only provide insights but also motivate new techniques for improving practical deep models. However, most existing empirical and theoretical studies of neural collapse focus on the case where the number of classes is small relative to the dimension of the feature space. This paper extends neural collapse to cases where the number of classes is much larger than the dimension of the feature space, which occur broadly in language models, retrieval systems, and face recognition applications. We show that the features and classifier exhibit a generalized neural collapse phenomenon, in which the minimum one-vs-rest margin is maximized. We provide an empirical study to verify the occurrence of generalized neural collapse in practical deep neural networks. Moreover, we provide a theoretical study showing that generalized neural collapse provably occurs under the unconstrained feature model with a spherical constraint, under certain technical conditions on the feature dimension and the number of classes.
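
To make the quantity in the abstract concrete, here is a minimal NumPy sketch of a minimum one-vs-rest margin. It rests on assumptions that are not taken from the paper: the margin of a sample is taken to be its true-class logit minus the largest other-class logit, features and classifier rows are normalized to the unit sphere, and the helper names `normalize_rows` and `min_one_vs_rest_margin` are hypothetical. The paper's exact definition may differ.

```python
import numpy as np


def normalize_rows(x):
    """Project each row onto the unit sphere (assumed spherical constraint)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def min_one_vs_rest_margin(W, H, labels):
    """Smallest margin over all samples.

    W: (K, d) classifier weights, one row per class.
    H: (N, d) features, one row per sample.
    labels: (N,) integer class index of each sample.

    Assumed definition: margin_i = <w_{y_i}, h_i> - max_{j != y_i} <w_j, h_i>.
    """
    logits = H @ W.T                      # (N, K) inner products
    rows = np.arange(len(labels))
    true_logit = logits[rows, labels]     # logit of the correct class
    rest = logits.copy()
    rest[rows, labels] = -np.inf          # mask out the correct class
    best_rest = rest.max(axis=1)          # strongest competing class
    return float((true_logit - best_rest).min())


# Toy setting with more classes than feature dimensions (K > d).
rng = np.random.default_rng(0)
K, d = 50, 8
W = normalize_rows(rng.standard_normal((K, d)))
H = W.copy()                              # one feature per class, aligned with its weight
labels = np.arange(K)
margin = min_one_vs_rest_margin(W, H, labels)
print(f"minimum one-vs-rest margin for random weights: {margin:.4f}")
```

Random weights only give a baseline value here; under the abstract's claim, trained features and classifiers would be arranged so that this minimum margin is as large as possible for the given `K` and `d`.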

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Implicit Bias of Depth: From Neural Collapse to Softmax Codes

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    Depth induces an implicit low-rank bias in deep unconstrained feature models trained with unregularized multiclass cross-entropy, promoting softmax codes over neural collapse via more efficient norm propagation.