High-dimensional Linear Discriminant Analysis: Optimality, Adaptive Algorithm, and Missing Data

Linjun Zhang; T. Tony Cai

Not yet reviewed by Pith; the record is open.

arxiv 1804.03018 v1 pith:626HGBL2 submitted 2018-04-09 stat.ME

classification stat.ME

keywords: analysis, data, adaptive, classification, discriminant, high-dimensional, linear, missing

This paper aims to develop an optimality theory for linear discriminant analysis in the high-dimensional setting. A data-driven and tuning-free classification rule, which is based on an adaptive constrained $\ell_1$ minimization approach, is proposed and analyzed. Minimax lower bounds are obtained and this classification rule is shown to be simultaneously rate optimal over a collection of parameter spaces. In addition, we consider classification with incomplete data under the missing completely at random (MCR) model. An adaptive classifier with theoretical guarantees is introduced and the optimal rate of convergence for high-dimensional linear discriminant analysis under the MCR model is established. The technical analysis for the case of missing data is much more challenging than that for complete data. We establish a large deviation result for the generalized sample covariance matrix, which serves as a key technical tool and can be of independent interest. An application to lung cancer and leukemia studies is also discussed.
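To make the ingredients named in the abstract concrete, here is a minimal Python sketch of a plug-in linear discriminant rule fitted on data with missing entries. It is an illustration under stated assumptions, not the authors' estimator. The covariance is computed entrywise from the samples in which both coordinates are observed, which is one natural reading of a "generalized sample covariance matrix". The discriminant direction is obtained from a ridge-regularized linear solve, which is swapped in here in place of the paper's adaptive constrained $\ell_1$ minimization. The function names, the `ridge` parameter, the equal class priors, and the sample-size pooling weights are all hypothetical choices made for this example.

```python
import numpy as np

def masked_mean(X, S):
    """Per-coordinate mean over observed entries (S is a boolean mask, True = observed)."""
    M = S.astype(float)
    X0 = np.where(S, X, 0.0)                 # zero out missing entries (handles NaN)
    return X0.sum(axis=0) / np.maximum(M.sum(axis=0), 1.0)

def masked_cov(X, S):
    """Entry (j, k) averages centered products over rows where both j and k are observed."""
    M = S.astype(float)
    mu = masked_mean(X, S)
    Z = np.where(S, X - mu, 0.0)             # centered data, zero where missing
    pair_counts = M.T @ M                    # rows observing both coordinates
    return (Z.T @ Z) / np.maximum(pair_counts, 1.0)

def fit_plugin_lda(X1, S1, X2, S2, ridge=1e-2):
    """Assumption: ridge solve stands in for the paper's constrained l1 step."""
    mu1, mu2 = masked_mean(X1, S1), masked_mean(X2, S2)
    n1, n2 = len(X1), len(X2)
    # Crude pooling by class sample size; an illustrative choice only.
    sigma = (n1 * masked_cov(X1, S1) + n2 * masked_cov(X2, S2)) / (n1 + n2)
    p = sigma.shape[0]
    beta = np.linalg.solve(sigma + ridge * np.eye(p), mu1 - mu2)
    return mu1, mu2, beta

def predict(x, mu1, mu2, beta):
    """Assign class 1 when the linear score is nonnegative (equal priors assumed)."""
    return 1 if (x - (mu1 + mu2) / 2) @ beta >= 0 else 2
```

With fully observed data, `masked_cov` reduces to the usual sample covariance with divisor n. The pairwise estimate is not guaranteed to be positive semidefinite when entries are missing, which is one reason the sketch adds a ridge term before solving.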

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

High-dimensional principal component analysis with heterogeneous missingness
stat.ME 2019-06 unverdicted novelty 6.0

primePCA iteratively imputes missing entries via projection onto current principal component estimates and updates the estimate with the leading right singular space, achieving geometric error convergence in the noise...