Optimal whitening and decorrelation

Agnan Kessy; Alex Lewin; Korbinian Strimmer

arxiv: 1512.00809 · v4 · pith:N5KW2QZCnew · submitted 2015-12-02 · 📊 stat.ME · stat.ML

classification 📊 stat.ME stat.ML

keywords whitening · variables · analysis · original · sphered · component · matrix · maximally

Whitening, or sphering, is a common preprocessing step in statistical analysis to transform random variables to orthogonality. However, due to rotational freedom there are infinitely many possible whitening procedures. Consequently, there is a diverse range of sphering methods in use, for example based on principal component analysis (PCA), Cholesky matrix decomposition and zero-phase component analysis (ZCA), among others. Here we provide an overview of the underlying theory and discuss five natural whitening procedures. Subsequently, we demonstrate that investigating the cross-covariance and the cross-correlation matrix between sphered and original variables makes it possible to break the rotational invariance and to identify optimal whitening transformations. As a result we recommend two particular approaches: ZCA-cor whitening to produce sphered variables that are maximally similar to the original variables, and PCA-cor whitening to obtain sphered variables that maximally compress the original variables.
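
To make the five procedures named in the abstract concrete, here is a minimal NumPy sketch. The abstract does not state the matrix forms, so the definitions used below are the commonly cited ones and should be read as an assumption; the function name `whitening_matrix` and the example covariance matrix are hypothetical. Every variant returns a matrix W with W Σ Wᵀ = I, and the variants differ only by a rotation, which is the rotational freedom the abstract refers to.

```python
import numpy as np


def whitening_matrix(sigma, method="ZCA-cor"):
    """Return a matrix W such that W @ sigma @ W.T is the identity.

    sigma is a symmetric positive definite covariance matrix.
    The formulas are the commonly used definitions (an assumption here,
    since the abstract does not spell them out).
    """
    sigma = np.asarray(sigma, dtype=float)

    if method in ("ZCA", "PCA"):
        # Eigendecomposition of the covariance: sigma = U diag(lam) U^T.
        lam, u = np.linalg.eigh(sigma)
        order = np.argsort(lam)[::-1]          # largest eigenvalue first
        lam, u = lam[order], u[:, order]
        if method == "ZCA":
            return u @ np.diag(lam ** -0.5) @ u.T   # sigma^{-1/2}
        return np.diag(lam ** -0.5) @ u.T           # rotate, then rescale

    if method == "Cholesky":
        # Factor the precision matrix: sigma^{-1} = L L^T, then W = L^T.
        chol = np.linalg.cholesky(np.linalg.inv(sigma))
        return chol.T

    if method in ("ZCA-cor", "PCA-cor"):
        # Standardise first, then work with the correlation matrix P.
        v_inv_sqrt = np.diag(np.diag(sigma) ** -0.5)
        p = v_inv_sqrt @ sigma @ v_inv_sqrt
        theta, g = np.linalg.eigh(p)
        order = np.argsort(theta)[::-1]
        theta, g = theta[order], g[:, order]
        if method == "ZCA-cor":
            return g @ np.diag(theta ** -0.5) @ g.T @ v_inv_sqrt
        return np.diag(theta ** -0.5) @ g.T @ v_inv_sqrt

    raise ValueError(f"unknown whitening method: {method!r}")


# Hypothetical 3x3 covariance matrix, used only for illustration.
example_sigma = np.array([
    [4.0, 1.2, 0.5],
    [1.2, 1.0, 0.3],
    [0.5, 0.3, 2.0],
])

for name in ("ZCA", "PCA", "Cholesky", "ZCA-cor", "PCA-cor"):
    w = whitening_matrix(example_sigma, name)
    # Each W whitens the same covariance, although the matrices differ.
    print(name, np.allclose(w @ example_sigma @ w.T, np.eye(3)))
```

Eigenvector signs are left as returned by `np.linalg.eigh`, so the PCA and PCA-cor outputs are only determined up to a sign flip per component; a sign convention would be needed to make them unique.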

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Gaussian boson sampling: Benchmarking quantum advantage
quant-ph 2026-04 unverdicted novelty 6.0

A new classical algorithm for Gaussian boson sampling produces outputs closer to exact results than quantum experiments up to 1152 modes and scales efficiently, indicating hardware errors enable classical simulation.

Deep Multi-View Learning via Task-Optimal CCA
cs.LG 2019-07 unverdicted novelty 6.0

End-to-end deep optimization of CCA plus task loss produces discriminative shared representations that outperform prior multi-view methods on classification and semi-supervised tasks.