arxiv: 1511.04707 · v5 · pith:DKKKPAEJnew · submitted 2015-11-15 · 💻 cs.LG

Deep Linear Discriminant Analysis

classification 💻 cs.LG
keywords: deep, network, analysis, cifar-10, class, classic, deeplda, different

We introduce Deep Linear Discriminant Analysis (DeepLDA), which learns linearly separable latent representations in an end-to-end fashion. Classic LDA extracts features that preserve class separability and is used for dimensionality reduction in many classification problems. The central idea of this paper is to put LDA on top of a deep neural network, which can be seen as a non-linear extension of classic LDA. Instead of maximizing the likelihood of target labels for individual samples, we propose an objective function that pushes the network to produce feature distributions that (a) have low variance within the same class and (b) have high variance between different classes. Our objective is derived from the general LDA eigenvalue problem and still allows training with stochastic gradient descent and back-propagation. For evaluation, we test our approach on three benchmark datasets (MNIST, CIFAR-10 and STL-10). DeepLDA produces competitive results on MNIST and CIFAR-10 and outperforms a network with the same architecture trained with categorical cross-entropy in a supervised setting of STL-10.
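
The abstract refers to the general LDA eigenvalue problem, in which the eigenvalues of S_w^{-1} S_b measure how far the class means spread (between-class scatter S_b) relative to the spread inside each class (within-class scatter S_w). The following is a minimal sketch of that classical computation on fixed 2-D features, not the paper's training objective or network. The function names, the 1/N normalization, and the small `ridge` term added to S_w for numerical stability are assumptions made for illustration.

```python
import math


def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def lda_eigenvalues_2d(classes, ridge=1e-3):
    """Eigenvalues of S_w^{-1} S_b for 2-D features, largest first.

    classes: list of lists of (x, y) points, one inner list per class.
    ridge:   assumed regularizer added to the diagonal of S_w.
    """
    all_points = [p for pts in classes for p in pts]
    n = len(all_points)
    global_mean = mean2(all_points)

    s_w = [[0.0, 0.0], [0.0, 0.0]]  # within-class scatter
    s_b = [[0.0, 0.0], [0.0, 0.0]]  # between-class scatter
    for pts in classes:
        mu = mean2(pts)
        shift = (mu[0] - global_mean[0], mu[1] - global_mean[1])
        for i in range(2):
            for j in range(2):
                # Spread of the samples around their own class mean.
                s_w[i][j] += sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in pts) / n
                # Spread of the class mean around the global mean.
                s_b[i][j] += len(pts) * shift[i] * shift[j] / n
    s_w[0][0] += ridge
    s_w[1][1] += ridge

    # Closed-form inverse of the 2x2 matrix S_w.
    det_w = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    inv_w = [[s_w[1][1] / det_w, -s_w[0][1] / det_w],
             [-s_w[1][0] / det_w, s_w[0][0] / det_w]]

    # M = S_w^{-1} S_b; its eigenvalues follow from trace and determinant.
    m = [[sum(inv_w[i][k] * s_b[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)


# Two toy feature sets: compact, well-separated classes vs. overlapping ones.
tight = [[(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)],
         [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]]
loose = [[(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)],
         [(1.0, 1.0), (3.0, 1.0), (1.0, 3.0), (3.0, 3.0)]]

tight_eigs = lda_eigenvalues_2d(tight)
loose_eigs = lda_eigenvalues_2d(loose)
print("tight:", tight_eigs)
print("loose:", loose_eigs)
```

With two classes, S_b has rank one, so only the first eigenvalue is non-zero. It is larger for the compact, well-separated features, which is the sense in which low within-class variance and high between-class variance correspond to better linear separability.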

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Implicit Bias in Deep Linear Discriminant Analysis

    cs.LG · 2026-03 · unverdicted · novelty 7.0

    Gradient flow on deep diagonal linear LDA networks with balanced initialization converts additive updates to multiplicative updates, automatically conserving the (2/L) quasi-norm.

  2. Linear Discriminant Analysis with Gradient Optimization

    stat.CO · 2025-06 · unverdicted · novelty 5.0

    LDA-GO uses scalable gradient optimization on low-rank precision matrices with data-driven loss selection for high-dimensional LDA, claiming Bayes optimality and finite-sample error bounds.