Principal component analysis for big data

Fan, J · 2018 · stat.ME · arXiv 1801.01602

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Big data is transforming our world, revolutionizing operations and analytics everywhere, from financial engineering to biomedical sciences. The complexity of big data often makes dimension reduction techniques necessary before conducting statistical inference. Principal component analysis, commonly referred to as PCA, has become an essential tool for multivariate data analysis and unsupervised dimension reduction, the goal of which is to find a lower-dimensional subspace that captures most of the variation in the dataset. This article provides an overview of methodological and theoretical developments of PCA over the last decade, with a focus on its applications to big data analytics. We first review the mathematical formulation of PCA and its theoretical development from the viewpoint of perturbation analysis. We then briefly discuss the relationship between PCA and factor analysis, as well as its applications to large covariance estimation and multiple testing. PCA also finds important applications in many modern machine learning problems, and in this paper we focus on community detection, ranking, mixture models, and manifold learning.
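As an illustration of the goal stated in the abstract (finding a low-dimensional subspace that captures most of the variation), the following is a minimal sketch of PCA for a single component. It is not taken from the paper: the function name `top_principal_component`, its parameters, and the toy data are all hypothetical, and power iteration on the sample covariance matrix is used here only because it is a simple way to get the leading eigenvector with the standard library.

```python
import math
import random

def top_principal_component(rows, iters=200, seed=0):
    """Return (direction, variance) for the leading principal component.

    Hypothetical helper for illustration; not an API from the paper.
    """
    n, d = len(rows), len(rows[0])
    # Center each column so the covariance is taken about the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d), with the usual n - 1 denominator.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply cov and renormalize to unit length.
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient v^T cov v is the variance captured along v.
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance

# Toy data (an assumption for this sketch): points lying close to the line y = 2x.
noise = [0.1, -0.1, 0.0, 0.1, -0.1]
data = [[x, 2 * x + e] for x, e in zip([-2, -1, 0, 1, 2], noise)]
direction, variance = top_principal_component(data)
```

On this toy data the returned direction is close to the line y = 2x (up to sign), and the variance along it accounts for nearly all of the total variance, which is the sense in which one component "captures most of the variation". For large datasets one would normally use a library SVD routine rather than an explicit covariance matrix.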

representative citing papers

Building a GPU-Accelerated Multivariate Statistics Platform

stat.CO · 2026-04-26 · unverdicted · novelty 3.0 · 2 refs

Case study of a single-pass GPU implementation for sufficient statistics in large-scale multivariate analysis on multi-GPU hardware.

citing papers explorer

Showing 1 of 1 citing paper.

Building a GPU-Accelerated Multivariate Statistics Platform

stat.CO · 2026-04-26 · unverdicted · none · ref 5 · 2 links · internal anchor

Case study of a single-pass GPU implementation for sufficient statistics in large-scale multivariate analysis on multi-GPU hardware.
