Robust subgaussian estimation of a mean vector in nearly linear time
read the original abstract
We construct an algorithm, running in time $\tilde{\mathcal O}(N d + uK d)$, which is robust to outliers and heavy-tailed data and which achieves the subgaussian rate from [Lugosi, Mendelson] \begin{equation}\label{eq:intro_subgaus_rate} \sqrt{\frac{{\rm Tr}(\Sigma)}{N}}+\sqrt{\frac{||\Sigma||_{op}K}{N}} \end{equation}with probability at least $1-\exp(-c_0K)-\exp(-c_1 u)$ where $\Sigma$ is the covariance matrix of the informative data, $K\in\{1, \ldots, K\}$ is some parameter (number of block means) and $u>0$ is another parameter of the algorithm. This rate is achieved when $K\geq c_1 |\mathcal O|$ where $|\mathcal O|$ is the number of outliers in the database and under the only assumption that the informative data have a second moment. The algorithm is fully data-dependent and does not use in its construction the proportion of outliers nor the rate above. Its construction combines recently developed tools for Median-of-Means estimators and covering-Semi-definite Programming [Chen, Diakonikolas, Ge] and [Peng, Tangwongsan, Zhang].
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
A Unified Approach to Robust Mean Estimation
A connection between Huber's contamination and heavy-tailed models yields unified robust mean estimators that are both computationally efficient and statistically optimal under certain conditions.
-
Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection
QUE-scoring via quantum entropy regularization achieves optimal robust mean estimation in Õ(nd) time and outperforms prior outlier detection methods in experiments.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.