Heavy-Tailed Principal Component Analysis

Christopher Khater; Ibrahim Abou-Faycal; Jihad Fahs; Mario Sayde

arxiv: 2603.11308 · v2 · submitted 2026-03-11 · 💻 cs.LG

Heavy-Tailed Principal Component Analysis

Mario Sayde , Christopher Khater , Jihad Fahs , Ibrahim Abou-Faycal This is my paper

Pith reviewed 2026-05-15 12:50 UTC · model grok-4.3

classification 💻 cs.LG

keywords heavy-tailed PCAlogarithmic losssuperstatistical modelrobust dimensionality reductionprincipal componentsinfinite variancecovariance estimation

0 comments

The pith

Principal components of heavy-tailed observations match those of the underlying Gaussian when using logarithmic loss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies PCA on high-dimensional data generated as a random positive scale factor times a Gaussian vector, a model that produces heavy tails including multivariate t and alpha-stable distributions. It formulates the problem with a logarithmic loss that remains finite without requiring finite second moments. The central theoretical result is that the directions recovered by minimizing this loss on the heavy-tailed samples are identical to the eigenvectors of the covariance matrix of the latent Gaussian vectors. The authors then construct estimators for that hidden covariance from the observed heavy-tailed samples and validate them through background-denoising experiments, where the new estimators outperform classical PCA under impulsive noise and remain competitive under Gaussian noise.

Core claim

Under the logarithmic loss, the principal components of heavy-tailed observations generated according to the superstatistical model X = A^{1/2} G coincide with the principal components obtained by standard PCA on the covariance matrix of G.

What carries the argument

The logarithmic loss applied within the superstatistical model X = A^{1/2} G, which makes the minimizers of the loss on the observed heavy-tailed vectors identical to the eigenvectors of the covariance of the latent Gaussian G.

If this is right

Robust estimators for the latent Gaussian covariance can be built directly from heavy-tailed observations.
These estimators recover the true principal directions reliably in the presence of heavy tails and impulsive noise.
The recovered directions remain competitive with classical PCA when the data is in fact Gaussian.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If real data approximately follows the same scale-Gaussian dependence structure, the equivalence supplies a moment-free route to consistent PCA.
Alternative losses that preserve the same equivalence property could yield other robust variants.
Experiments on data whose tails arise from different dependence mechanisms would show where the coincidence fails.

Load-bearing premise

The observations are generated exactly as a positive random scalar times a Gaussian vector.

What would settle it

Generate samples from the exact model X = A^{1/2} G, run logarithmic-loss PCA on the X samples, and compare the recovered directions to the eigenvectors of the sample covariance of the latent G vectors; systematic mismatch would disprove the claimed coincidence.

Figures

Figures reproduced from arXiv: 2603.11308 by Christopher Khater, Ibrahim Abou-Faycal, Jihad Fahs, Mario Sayde.

**Figure 2.** Figure 2: Comparing first method equation (13), Tyler and the PCA’s covariance estimator under heavy-tailed and Gaussian data. [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Visualization of Principal Component 1 (PC1) using equation (15) and that given by the standard PCA in comparison [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

read the original abstract

Principal Component Analysis (PCA) is a cornerstone of dimensionality reduction, yet its classical formulation relies critically on second-order moments and is therefore fragile in the presence of heavy-tailed data and impulsive noise. While numerous robust PCA variants have been proposed, most either assume finite variance, rely on sparsity-driven decompositions, or address robustness through surrogate loss functions without a unified treatment of infinite-variance models. In this paper, we study PCA for high-dimensional data generated according to a superstatistical dependent model of the form $\mathbf{X} = A^{1/2}\mathbf{G}$, where $A$ is a positive random scalar and $\mathbf{G}$ is a Gaussian vector. This framework captures a wide class of heavy-tailed distributions, including multivariate $t$ and sub-Gaussian $\alpha$-stable laws. We formulate PCA under a logarithmic loss, which remains well defined even when moments do not exist. Our main theoretical result shows that, under this loss, the principal components of the heavy-tailed observations coincide with those obtained by applying standard PCA to the covariance matrix of the underlying Gaussian generator. Building on this insight, we propose robust estimators for this covariance matrix directly from heavy-tailed data and compare them with the empirical covariance and Tyler's scatter estimator. Extensive experiments, including background denoising tasks, demonstrate that the proposed approach reliably recovers principal directions and significantly outperforms classical PCA in the presence of heavy-tailed and impulsive noise, while remaining competitive under Gaussian noise.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows log-loss PCA recovers the same directions as Gaussian PCA on the generator covariance under the exact superstatistical model, but the claim is narrow and the experiments are underspecified.

read the letter

The central result is that for data generated as X = A^{1/2} G with shared scalar A and Gaussian G, the log-loss minimizer for PCA on the heavy-tailed X coincides with ordinary PCA on Cov(G). This covers multivariate-t and sub-Gaussian alpha-stable cases and gives a moment-free route to the underlying directions. They then estimate that covariance directly from the observed heavy-tailed samples and compare against empirical covariance and Tyler's estimator on background denoising tasks, where the new estimators hold up better under impulsive noise while staying competitive under Gaussian noise. That reduction to the generator covariance is the freshest part; it is not just another surrogate loss but an explicit equivalence for this generative structure. The experiments suggest practical gains in the targeted settings. The main limitations are that the abstract supplies no derivation steps or proof outline, the comparisons are summarized without sample sizes, error bars, or exclusion rules, and the equivalence requires the precise shared-A structure. Other heavy-tail mechanisms, such as independent per-component multipliers, will generally produce different minimizers. The work is internally consistent on its own terms and engages the relevant robust-PCA literature without obvious circularity. It is aimed at researchers handling high-dimensional infinite-variance data in communications, finance, or imaging who want a principled alternative to ad-hoc clipping. The idea is sharp enough to merit referee time even if the experimental reporting and generality claims need tightening.

Referee Report

3 major / 2 minor

Summary. The paper studies PCA for heavy-tailed data generated from the superstatistical model X = A^{1/2} G, where A is a positive random scalar and G is Gaussian. It introduces a logarithmic loss that remains well-defined without finite moments and claims that, under this loss, the principal components recovered from the heavy-tailed observations X coincide exactly with those obtained by applying standard PCA to the covariance matrix of the underlying Gaussian G. The paper proposes estimators for this covariance directly from heavy-tailed data, compares them to the empirical covariance and Tyler's scatter estimator, and reports experiments on background denoising tasks showing improved recovery of principal directions under heavy-tailed and impulsive noise.

Significance. If the central equivalence is rigorously established, the work supplies a principled reduction of robust PCA to ordinary Gaussian PCA for an important class of infinite-variance distributions (multivariate-t and sub-Gaussian alpha-stable laws). This could influence the design of moment-free dimensionality-reduction methods in signal processing and machine learning. The experimental comparisons on denoising tasks provide initial evidence of practical utility, though they require additional statistical detail to be fully convincing.

major comments (3)

[Abstract / theoretical result] Abstract and theoretical development: the central claim that the logarithmic-loss stationarity condition reduces to the eigenvector equation for Cov(G) is asserted without any derivation steps, key intermediate equations, or proof outline. Because this equivalence is the load-bearing theoretical result, the absence of even a sketch prevents verification of whether the common scalar A is exploited exactly as described.
[Experiments] Experimental section: comparisons to Tyler's estimator and the empirical covariance are presented as summary statements without reported sample sizes, number of Monte-Carlo trials, error bars, or explicit exclusion rules for outliers. This information is necessary to assess whether the reported outperformance under heavy-tailed noise is statistically reliable.
[Generative model] Model assumptions: the equivalence is derived under the precise generative structure X = A^{1/2} G with a single shared scalar A for all observations. The manuscript should explicitly discuss whether the result continues to hold (or fails) when heavy tails arise from component-wise independent multipliers or non-elliptical dependence structures, as these are common alternative mechanisms.

minor comments (2)

[Abstract] The abstract states that the model 'includes' multivariate-t and sub-Gaussian alpha-stable laws; a brief sentence clarifying that these inclusions still rely on the shared scalar A would improve precision.
[Method] Notation for the logarithmic loss should be introduced with an explicit equation number in the main text rather than only in the abstract.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and the positive assessment of the work's potential impact. We address each major comment point by point below. All requested clarifications and additions will be incorporated into the revised manuscript.

read point-by-point responses

Referee: [Abstract / theoretical result] Abstract and theoretical development: the central claim that the logarithmic-loss stationarity condition reduces to the eigenvector equation for Cov(G) is asserted without any derivation steps, key intermediate equations, or proof outline. Because this equivalence is the load-bearing theoretical result, the absence of even a sketch prevents verification of whether the common scalar A is exploited exactly as described.

Authors: We agree that the initial submission omitted a derivation sketch for the central equivalence. In the revised manuscript we will insert a complete proof outline in Section 3. The argument proceeds as follows: the logarithmic loss is the negative log-likelihood under the superstatistical model; its stationarity condition with respect to the subspace projector yields an expectation of A-weighted outer products of the observations; because the same scalar A multiplies every coordinate of each vector, the weighting factors factor out of the expectation and cancel, leaving precisely the eigenvector equation for Cov(G). This step-by-step derivation will make explicit how the shared A is used. revision: yes
Referee: [Experiments] Experimental section: comparisons to Tyler's estimator and the empirical covariance are presented as summary statements without reported sample sizes, number of Monte-Carlo trials, error bars, or explicit exclusion rules for outliers. This information is necessary to assess whether the reported outperformance under heavy-tailed noise is statistically reliable.

Authors: We acknowledge that the experimental section lacked the necessary statistical details. In the revision we will report: 100 independent Monte-Carlo trials for each setting, sample sizes (n=500, d=100 for the synthetic experiments and n=2000 for the denoising task), standard-error bars computed across trials, and the explicit rule that no observations were excluded beyond the generative model itself. These additions will allow readers to evaluate the reliability of the reported improvements. revision: yes
Referee: [Generative model] Model assumptions: the equivalence is derived under the precise generative structure X = A^{1/2} G with a single shared scalar A for all observations. The manuscript should explicitly discuss whether the result continues to hold (or fails) when heavy tails arise from component-wise independent multipliers or non-elliptical dependence structures, as these are common alternative mechanisms.

Authors: We will add a dedicated paragraph in the discussion section (new Section 5.2) addressing model assumptions. The equivalence relies critically on the single shared scalar A; when heavy tails are generated by component-wise independent multipliers the stationarity condition no longer reduces to Cov(G) and the principal directions recovered under the log loss generally differ. We will include a short analytic counter-example for the independent-multiplier case and note that the superstatistical model is therefore a specific but practically relevant subclass (multivariate-t, sub-Gaussian stable) rather than a universal heavy-tail model. revision: yes

Circularity Check

0 steps flagged

No circularity: theoretical equivalence derived from generative model without reduction to fit or self-citation

full rationale

The central claim is a non-trivial theoretical result: under the logarithmic loss, the principal components recovered from observations X = A^{1/2} G coincide with those of standard PCA on Cov(G). This follows directly from the stationarity condition of the loss exploiting the shared scalar A to reduce to the eigenvector equation of the Gaussian covariance; the derivation is self-contained within the stated superstatistical model and does not invoke fitted parameters renamed as predictions, self-citations as load-bearing premises, or any of the enumerated circular patterns. Experiments compare estimators but do not alter the theoretical step. The result is falsifiable outside the paper via data generated from the model versus alternatives.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the generative model X = A^{1/2} G and the choice of logarithmic loss; no explicit free parameters are fitted in the abstract description, though the proposed estimators may involve implicit tuning.

axioms (1)

domain assumption Observations follow the superstatistical model X = A^{1/2} G with A positive random scalar and G Gaussian vector
This is the explicit generative assumption used to derive the equivalence between log-loss PCA and Gaussian covariance PCA.

pith-pipeline@v0.9.0 · 5558 in / 1311 out tokens · 50712 ms · 2026-05-15T12:50:36.383824+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Our main theoretical result shows that, under this loss, the principal components of the heavy-tailed observations coincide with those obtained by applying standard PCA to the covariance matrix of the underlying Gaussian generator.
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean LogicNat recovery unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We formulate PCA under a logarithmic loss, which remains well defined even when moments do not exist.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 1 internal anchor

[1]

On lines and planes of closest fit to systems of points in space,

K. Pearson, “On lines and planes of closest fit to systems of points in space,”Philosophical Magazine, vol. 2, no. 11, pp. 559–572, 1901

work page 1901
[2]

Analysis of a complex of statistical variables into principal components,

H. Hotelling, “Analysis of a complex of statistical variables into principal components,”Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933

work page 1933
[3]

Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies,

C. Croux and G. Haesbroeck, “Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies,”Biometrika, vol. 87, no. 3, pp. 603–618, 09 2000. [Online]. Available: https://doi.org/10.1093/biomet/87.3.603

work page doi:10.1093/biomet/87.3.603 2000
[4]

Robust principal component analysis?

E. J. Cand `es, X. Li, Y . Ma, and J. Wright, “Robust principal component analysis?”Journal of the ACM (JACM), vol. 58, no. 3, pp. 1–37, 2011

work page 2011
[5]

Stable principal component pursuit,

Z. Zhou, X. Li, J. Wright, E. Candes, and Y . Ma, “Stable principal component pursuit,” in2010 IEEE international symposium on information theory. IEEE, 2010, pp. 1518–1522

work page 2010
[6]

Robpca: a new approach to robust principal component analysis,

M. Hubert, P. J. Rousseeuw, and K. Vanden Branden, “Robpca: a new approach to robust principal component analysis,”Technometrics, vol. 47, no. 1, pp. 64–79, 2005

work page 2005
[7]

Robust principal component analysis via outlier pursuit,

H. Xu, C. Caramanis, and S. Mannor, “Robust principal component analysis via outlier pursuit,”IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 3047–3064, 2012

work page 2012
[8]

Tyler’s m-estimator, random matrix theory, and generalized elliptical distributions with applications to finance,

G. Frahm and U. Jaekel, “Tyler’s m-estimator, random matrix theory, and generalized elliptical distributions with applications to finance,” Discussion Papers in Statistics and Econometrics, Tech. Rep., 2007

work page 2007
[9]

Distribution of eigenvalues for some sets of random matrices,

V . A. Mar ˇcenko and L. A. Pastur, “Distribution of eigenvalues for some sets of random matrices,”Mathematics of the USSR-Sbornik, vol. 1, no. 4, p. 457, 1967

work page 1967
[10]

Robust principal component analysis based on maximum correntropy criterion,

R. He, W.-S. Wang, and B.-G. Hu, “Robust principal component analysis based on maximum correntropy criterion,”IEEE Transactions on Image Processing, vol. 20, no. 6, pp. 1485–1494, 2011

work page 2011
[11]

R 1-pca: rotational invariant l 1-norm principal component analysis for robust subspace factorization,

C. Ding, D. Zhou, X. He, and H. Zha, “R 1-pca: rotational invariant l 1-norm principal component analysis for robust subspace factorization,” in Proceedings of the 23rd international conference on Machine learning, 2006, pp. 281–288

work page 2006
[12]

Principal component analysis based on l1-norm maximization,

N. Kwak, “Principal component analysis based on l1-norm maximization,”IEEE transactions on pattern analysis and machine intelligence, vol. 30, no. 9, pp. 1672–1680, 2008

work page 2008
[13]

Estimation of the covariance structure of heavy-tailed distributions,

X. Wei and S. Minsker, “Estimation of the covariance structure of heavy-tailed distributions,”Advances in neural information processing systems, vol. 30, 2017

work page 2017
[14]

Online robust principal component analysis with change point detection,

W. Xiao, X. Huang, F. He, J. Silva, S. Emrani, and A. Chaudhuri, “Online robust principal component analysis with change point detection,”IEEE Transactions on Multimedia, vol. 22, no. 1, pp. 59–68, 2019

work page 2019
[15]

Coherence pursuit: Fast, simple, and robust principal component analysis,

M. Rahmani and G. K. Atia, “Coherence pursuit: Fast, simple, and robust principal component analysis,”IEEE Transactions on Signal Processing, vol. 65, no. 23, pp. 6260–6275, 2017

work page 2017
[16]

Tensor robust principal component analysis: Exact recovery of corrupted low-rank tensors via convex optimization,

C. Lu, J. Feng, Y . Chen, W. Liu, Z. Lin, and S. Yan, “Tensor robust principal component analysis: Exact recovery of corrupted low-rank tensors via convex optimization,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016

work page 2016
[17]

Tensor robust principal component analysis with a new tensor nuclear norm,

——, “Tensor robust principal component analysis with a new tensor nuclear norm,”IEEE transactions on pattern analysis and machine intelligence, vol. 42, no. 4, pp. 925–938, 2019

work page 2019
[18]

Entropic principal component analysis using Cauchy–Schwarz divergence,

E. K. Nakao and A. L. M. Levada, “Entropic principal component analysis using Cauchy–Schwarz divergence,”Knowledge and Information Systems, vol. 65, no. 3, pp. 945–971, 2023. [Online]. Available: https://doi.org/10.1007/s10115-023-01783-8

work page doi:10.1007/s10115-023-01783-8 2023
[19]

Cauchy robust principal component analysis with applications to high-dimensional data sets,

A. Fayomi, Y . Pantazis, M. Tsagris, and A. T. A. Wood, “Cauchy robust principal component analysis with applications to high-dimensional data sets,” Statistics and Computing, vol. 34, no. 1, p. 26, 2024. [Online]. Available: https://doi.org/10.1007/s11222-023-10328-x

work page doi:10.1007/s11222-023-10328-x 2024
[20]

Robust pca for high-dimensional data based on characteristic transformation,

L. He, Y . Yang, and B. Zhang, “Robust pca for high-dimensional data based on characteristic transformation,”Australian & New Zealand Journal of Statistics, vol. 65, no. 2, pp. 127–151, 2023. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/anzs.12385

work page doi:10.1111/anzs.12385 2023
[21]

Spherical principal component analysis,

K. Liu, Q. Li, H. Wang, and G. Tang, “Spherical principal component analysis,” inProceedings of the 2019 SIAM international conference on data mining. SIAM, 2019, pp. 387–395

work page 2019
[22]

Statistical properties of kernel principal component analysis,

G. Blanchard, O. Bousquet, and L. Zwald, “Statistical properties of kernel principal component analysis,”Machine Learning, vol. 66, pp. 259–294, 2007

work page 2007
[23]

On the applications of robust pca in image and video processing,

T. Bouwmans, S. Javed, H. Zhang, Z. Lin, and R. Otazo, “On the applications of robust pca in image and video processing,”Proceedings of the IEEE, vol. 106, no. 8, pp. 1427–1457, 2018

work page 2018
[24]

Dynamics ofimplied volatility surfaces,

R. Cont and J. Da Fonseca, “Dynamics ofimplied volatility surfaces,”Quantitative finance, vol. 2, no. 1, p. 45, 2002

work page 2002
[25]

Principal component analysis for data containing outliers and missing elements,

S. Serneels and T. Verdonck, “Principal component analysis for data containing outliers and missing elements,”Computational Statistics & Data Analysis, vol. 52, no. 3, pp. 1712–1727, 2008

work page 2008
[26]

Robust pca unrolling network for super-resolution vessel extraction in x-ray coronary angiography,

B. Qin, H. Mao, Y . Liu, J. Zhao, Y . Lv, Y . Zhu, S. Ding, and X. Chen, “Robust pca unrolling network for super-resolution vessel extraction in x-ray coronary angiography,”IEEE Transactions on Medical Imaging, vol. 41, no. 11, pp. 3087–3098, 2022

work page 2022
[27]

Random features for large-scale kernel machines,

A. Rahimi and B. Recht, “Random features for large-scale kernel machines,” inAdvances in Neural Information Processing Systems, J. Platt, D. Koller, Y . Singer, and S. Roweis, Eds., vol. 20. Curran Associates, Inc., 2007

work page 2007
[28]

The generalization error of random features regression: Precise asymptotics and the double descent curve,

S. Mei and A. Montanari, “The generalization error of random features regression: Precise asymptotics and the double descent curve,”Communications on Pure and Applied Mathematics, vol. 75, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:199668852

work page 2019
[29]

Generalisation error in learning with random features and the hidden manifold model*,

F. Gerace, B. Loureiro, F. Krzakala, M. M ´ezard, and L. Zdeborov ´a, “Generalisation error in learning with random features and the hidden manifold model*,”Journal of Statistical Mechanics: Theory and Experiment, vol. 2021, no. 12, p. 124013, dec 2021. [Online]. Available: https://dx.doi.org/10.1088/1742-5468/ac3ae6

work page doi:10.1088/1742-5468/ac3ae6 2021
[30]

Classification of heavy-tailed features in high dimensions: a superstatistical approach,

U. Adomaityte, G. Sicuro, and P. Vivo, “Classification of heavy-tailed features in high dimensions: a superstatistical approach,” inNeural Information Processing Systems, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:257985400

work page 2023
[31]

Superstatistics: Theory and applications,

C. Beck, “Superstatistics: Theory and applications,”Continuum Mechanics and Thermodynamics, vol. 16, 03 2003

work page 2003
[32]

Scale mixtures of gaussians and the statistics of natural images,

M. J. Wainwright and E. Simoncelli, “Scale mixtures of gaussians and the statistics of natural images,” inAdvances in Neural Information Processing Systems, S. Solla, T. Leen, and K. M ¨uller, Eds., vol. 12. MIT Press, 1999

work page 1999
[33]

High-dimensional robust regression under heavy-tailed data: Asymptotics and universality,

U. Adomaityte, L. Defilippis, B. Loureiro, and G. Sicuro, “High-dimensional robust regression under heavy-tailed data: Asymptotics and universality,” Journal of Statistical Mechanics: Theory and Experiment, no. 11, p. 114002, 2024. [Online]. Available: https://doi.org/10.1088/1742-5468/ad65e6

work page doi:10.1088/1742-5468/ad65e6 2024
[34]

and Taqqu, M

Samoradnitsky, G. and Taqqu, M. S.,Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. New York: Chapman and Hall, June 1994

work page 1994
[35]

On the multivariate t distribution,

M. Roth, “On the multivariate t distribution,” Link ¨oping University Electronic Press, Link ¨oping, Tech. Rep., 04 2013

work page 2013
[36]

Cauchy Principal Component Analysis

P. Xie and E. P. Xing, “Cauchy principal component analysis,”arXiv preprint arXiv:1412.6506, 2014. [Online]. Available: https://arxiv.org/abs/1412.6506

work page internal anchor Pith review Pith/arXiv arXiv 2014
[37]

Aub’s heavy-tails package,

AUB-HTP Project, “Aub’s heavy-tails package,” https://github.com/AUB-HTP, 2026, accessed: 2026-04-03

work page 2026
[38]

Heavy-tailed linear regression and k-means,

M. Sayde, J. Fahs, and I. Abou-Faycal, “Heavy-tailed linear regression and k-means,”Information, vol. 16, no. 3, p. 184, 2025

work page 2025
[39]

Gnedenko, B. V . and Kolmogorov, A. N. ,Limit Distributions for Sums of Independent Random Variables. Reading Massachusetts: Addison-Wesley Publishing Company, 1968

work page 1968
[40]

D. G. Luenberger,Optimization by Vector Space Methods, 1st ed., ser. Wiley Professional Paperback Series. New York: Wiley, Sep. 1997. [Online]. Available: https://books.google.com.lb/books?id=lZU0CAH4RccC

work page 1997
[41]

Lecture notes on matrix analysis,

M. W. Meckes, “Lecture notes on matrix analysis,” 2019, see Theorem 3.13 (Fan’s maximal principle). [Online]. Available: https://case.edu/artsci/math/mwmeckes/matrix-analysis.pdf

work page 2019
[42]

On the distribution of the quotient of two chance variables,

J. Curtiss, “On the distribution of the quotient of two chance variables,”The Annals of Mathematical Statistics, vol. 12, no. 4, pp. 409–421, 1941

work page 1941
[43]

Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,

J. Fahs and I. Abou-Faycal, “Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,” IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1825–1844, 2017

work page 2017
[44]

Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,

——, “Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,”IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1825–1844, 2018

work page 2018

[1] [1]

On lines and planes of closest fit to systems of points in space,

K. Pearson, “On lines and planes of closest fit to systems of points in space,”Philosophical Magazine, vol. 2, no. 11, pp. 559–572, 1901

work page 1901

[2] [2]

Analysis of a complex of statistical variables into principal components,

H. Hotelling, “Analysis of a complex of statistical variables into principal components,”Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933

work page 1933

[3] [3]

Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies,

C. Croux and G. Haesbroeck, “Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies,”Biometrika, vol. 87, no. 3, pp. 603–618, 09 2000. [Online]. Available: https://doi.org/10.1093/biomet/87.3.603

work page doi:10.1093/biomet/87.3.603 2000

[4] [4]

Robust principal component analysis?

E. J. Cand `es, X. Li, Y . Ma, and J. Wright, “Robust principal component analysis?”Journal of the ACM (JACM), vol. 58, no. 3, pp. 1–37, 2011

work page 2011

[5] [5]

Stable principal component pursuit,

Z. Zhou, X. Li, J. Wright, E. Candes, and Y . Ma, “Stable principal component pursuit,” in2010 IEEE international symposium on information theory. IEEE, 2010, pp. 1518–1522

work page 2010

[6] [6]

Robpca: a new approach to robust principal component analysis,

M. Hubert, P. J. Rousseeuw, and K. Vanden Branden, “Robpca: a new approach to robust principal component analysis,”Technometrics, vol. 47, no. 1, pp. 64–79, 2005

work page 2005

[7] [7]

Robust principal component analysis via outlier pursuit,

H. Xu, C. Caramanis, and S. Mannor, “Robust principal component analysis via outlier pursuit,”IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 3047–3064, 2012

work page 2012

[8] [8]

Tyler’s m-estimator, random matrix theory, and generalized elliptical distributions with applications to finance,

G. Frahm and U. Jaekel, “Tyler’s m-estimator, random matrix theory, and generalized elliptical distributions with applications to finance,” Discussion Papers in Statistics and Econometrics, Tech. Rep., 2007

work page 2007

[9] [9]

Distribution of eigenvalues for some sets of random matrices,

V . A. Mar ˇcenko and L. A. Pastur, “Distribution of eigenvalues for some sets of random matrices,”Mathematics of the USSR-Sbornik, vol. 1, no. 4, p. 457, 1967

work page 1967

[10] [10]

Robust principal component analysis based on maximum correntropy criterion,

R. He, W.-S. Wang, and B.-G. Hu, “Robust principal component analysis based on maximum correntropy criterion,”IEEE Transactions on Image Processing, vol. 20, no. 6, pp. 1485–1494, 2011

work page 2011

[11] [11]

R 1-pca: rotational invariant l 1-norm principal component analysis for robust subspace factorization,

C. Ding, D. Zhou, X. He, and H. Zha, “R 1-pca: rotational invariant l 1-norm principal component analysis for robust subspace factorization,” in Proceedings of the 23rd international conference on Machine learning, 2006, pp. 281–288

work page 2006

[12] [12]

Principal component analysis based on l1-norm maximization,

N. Kwak, “Principal component analysis based on l1-norm maximization,”IEEE transactions on pattern analysis and machine intelligence, vol. 30, no. 9, pp. 1672–1680, 2008

work page 2008

[13] [13]

Estimation of the covariance structure of heavy-tailed distributions,

X. Wei and S. Minsker, “Estimation of the covariance structure of heavy-tailed distributions,”Advances in neural information processing systems, vol. 30, 2017

work page 2017

[14] [14]

Online robust principal component analysis with change point detection,

W. Xiao, X. Huang, F. He, J. Silva, S. Emrani, and A. Chaudhuri, “Online robust principal component analysis with change point detection,”IEEE Transactions on Multimedia, vol. 22, no. 1, pp. 59–68, 2019

work page 2019

[15] [15]

Coherence pursuit: Fast, simple, and robust principal component analysis,

M. Rahmani and G. K. Atia, “Coherence pursuit: Fast, simple, and robust principal component analysis,”IEEE Transactions on Signal Processing, vol. 65, no. 23, pp. 6260–6275, 2017

work page 2017

[16] [16]

Tensor robust principal component analysis: Exact recovery of corrupted low-rank tensors via convex optimization,

C. Lu, J. Feng, Y . Chen, W. Liu, Z. Lin, and S. Yan, “Tensor robust principal component analysis: Exact recovery of corrupted low-rank tensors via convex optimization,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016

work page 2016

[17] [17]

Tensor robust principal component analysis with a new tensor nuclear norm,

——, “Tensor robust principal component analysis with a new tensor nuclear norm,”IEEE transactions on pattern analysis and machine intelligence, vol. 42, no. 4, pp. 925–938, 2019

work page 2019

[18] [18]

Entropic principal component analysis using Cauchy–Schwarz divergence,

E. K. Nakao and A. L. M. Levada, “Entropic principal component analysis using Cauchy–Schwarz divergence,”Knowledge and Information Systems, vol. 65, no. 3, pp. 945–971, 2023. [Online]. Available: https://doi.org/10.1007/s10115-023-01783-8

work page doi:10.1007/s10115-023-01783-8 2023

[19] [19]

Cauchy robust principal component analysis with applications to high-dimensional data sets,

A. Fayomi, Y . Pantazis, M. Tsagris, and A. T. A. Wood, “Cauchy robust principal component analysis with applications to high-dimensional data sets,” Statistics and Computing, vol. 34, no. 1, p. 26, 2024. [Online]. Available: https://doi.org/10.1007/s11222-023-10328-x

work page doi:10.1007/s11222-023-10328-x 2024

[20] [20]

Robust pca for high-dimensional data based on characteristic transformation,

L. He, Y . Yang, and B. Zhang, “Robust pca for high-dimensional data based on characteristic transformation,”Australian & New Zealand Journal of Statistics, vol. 65, no. 2, pp. 127–151, 2023. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/anzs.12385

work page doi:10.1111/anzs.12385 2023

[21] [21]

Spherical principal component analysis,

K. Liu, Q. Li, H. Wang, and G. Tang, “Spherical principal component analysis,” inProceedings of the 2019 SIAM international conference on data mining. SIAM, 2019, pp. 387–395

work page 2019

[22] [22]

Statistical properties of kernel principal component analysis,

G. Blanchard, O. Bousquet, and L. Zwald, “Statistical properties of kernel principal component analysis,”Machine Learning, vol. 66, pp. 259–294, 2007

work page 2007

[23] [23]

On the applications of robust pca in image and video processing,

T. Bouwmans, S. Javed, H. Zhang, Z. Lin, and R. Otazo, “On the applications of robust pca in image and video processing,”Proceedings of the IEEE, vol. 106, no. 8, pp. 1427–1457, 2018

work page 2018

[24] [24]

Dynamics ofimplied volatility surfaces,

R. Cont and J. Da Fonseca, “Dynamics ofimplied volatility surfaces,”Quantitative finance, vol. 2, no. 1, p. 45, 2002

work page 2002

[25] [25]

Principal component analysis for data containing outliers and missing elements,

S. Serneels and T. Verdonck, “Principal component analysis for data containing outliers and missing elements,”Computational Statistics & Data Analysis, vol. 52, no. 3, pp. 1712–1727, 2008

work page 2008

[26] [26]

Robust pca unrolling network for super-resolution vessel extraction in x-ray coronary angiography,

B. Qin, H. Mao, Y . Liu, J. Zhao, Y . Lv, Y . Zhu, S. Ding, and X. Chen, “Robust pca unrolling network for super-resolution vessel extraction in x-ray coronary angiography,”IEEE Transactions on Medical Imaging, vol. 41, no. 11, pp. 3087–3098, 2022

work page 2022

[27] [27]

Random features for large-scale kernel machines,

A. Rahimi and B. Recht, “Random features for large-scale kernel machines,” inAdvances in Neural Information Processing Systems, J. Platt, D. Koller, Y . Singer, and S. Roweis, Eds., vol. 20. Curran Associates, Inc., 2007

work page 2007

[28] [28]

The generalization error of random features regression: Precise asymptotics and the double descent curve,

S. Mei and A. Montanari, “The generalization error of random features regression: Precise asymptotics and the double descent curve,”Communications on Pure and Applied Mathematics, vol. 75, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:199668852

work page 2019

[29] [29]

Generalisation error in learning with random features and the hidden manifold model*,

F. Gerace, B. Loureiro, F. Krzakala, M. M ´ezard, and L. Zdeborov ´a, “Generalisation error in learning with random features and the hidden manifold model*,”Journal of Statistical Mechanics: Theory and Experiment, vol. 2021, no. 12, p. 124013, dec 2021. [Online]. Available: https://dx.doi.org/10.1088/1742-5468/ac3ae6

work page doi:10.1088/1742-5468/ac3ae6 2021

[30] [30]

Classification of heavy-tailed features in high dimensions: a superstatistical approach,

U. Adomaityte, G. Sicuro, and P. Vivo, “Classification of heavy-tailed features in high dimensions: a superstatistical approach,” inNeural Information Processing Systems, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:257985400

work page 2023

[31] [31]

Superstatistics: Theory and applications,

C. Beck, “Superstatistics: Theory and applications,”Continuum Mechanics and Thermodynamics, vol. 16, 03 2003

work page 2003

[32] [32]

Scale mixtures of gaussians and the statistics of natural images,

M. J. Wainwright and E. Simoncelli, “Scale mixtures of gaussians and the statistics of natural images,” inAdvances in Neural Information Processing Systems, S. Solla, T. Leen, and K. M ¨uller, Eds., vol. 12. MIT Press, 1999

work page 1999

[33] [33]

High-dimensional robust regression under heavy-tailed data: Asymptotics and universality,

U. Adomaityte, L. Defilippis, B. Loureiro, and G. Sicuro, “High-dimensional robust regression under heavy-tailed data: Asymptotics and universality,” Journal of Statistical Mechanics: Theory and Experiment, no. 11, p. 114002, 2024. [Online]. Available: https://doi.org/10.1088/1742-5468/ad65e6

work page doi:10.1088/1742-5468/ad65e6 2024

[34] [34]

and Taqqu, M

Samoradnitsky, G. and Taqqu, M. S.,Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. New York: Chapman and Hall, June 1994

work page 1994

[35] [35]

On the multivariate t distribution,

M. Roth, “On the multivariate t distribution,” Link ¨oping University Electronic Press, Link ¨oping, Tech. Rep., 04 2013

work page 2013

[36] [36]

Cauchy Principal Component Analysis

P. Xie and E. P. Xing, “Cauchy principal component analysis,”arXiv preprint arXiv:1412.6506, 2014. [Online]. Available: https://arxiv.org/abs/1412.6506

work page internal anchor Pith review Pith/arXiv arXiv 2014

[37] [37]

Aub’s heavy-tails package,

AUB-HTP Project, “Aub’s heavy-tails package,” https://github.com/AUB-HTP, 2026, accessed: 2026-04-03

work page 2026

[38] [38]

Heavy-tailed linear regression and k-means,

M. Sayde, J. Fahs, and I. Abou-Faycal, “Heavy-tailed linear regression and k-means,”Information, vol. 16, no. 3, p. 184, 2025

work page 2025

[39] [39]

Gnedenko, B. V . and Kolmogorov, A. N. ,Limit Distributions for Sums of Independent Random Variables. Reading Massachusetts: Addison-Wesley Publishing Company, 1968

work page 1968

[40] [40]

D. G. Luenberger,Optimization by Vector Space Methods, 1st ed., ser. Wiley Professional Paperback Series. New York: Wiley, Sep. 1997. [Online]. Available: https://books.google.com.lb/books?id=lZU0CAH4RccC

work page 1997

[41] [41]

Lecture notes on matrix analysis,

M. W. Meckes, “Lecture notes on matrix analysis,” 2019, see Theorem 3.13 (Fan’s maximal principle). [Online]. Available: https://case.edu/artsci/math/mwmeckes/matrix-analysis.pdf

work page 2019

[42] [42]

On the distribution of the quotient of two chance variables,

J. Curtiss, “On the distribution of the quotient of two chance variables,”The Annals of Mathematical Statistics, vol. 12, no. 4, pp. 409–421, 1941

work page 1941

[43] [43]

Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,

J. Fahs and I. Abou-Faycal, “Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,” IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1825–1844, 2017

work page 2017

[44] [44]

Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,

——, “Information measures, inequalities and performance bounds for parameter estimation in impulsive noise environments,”IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1825–1844, 2018

work page 2018