pith. sign in

arxiv: 2502.14772 · v1 · pith:YSRMQUNEnew · submitted 2025-02-20 · 💻 cs.DS · cs.LG· math.ST· stat.ML· stat.TH

Efficient Multivariate Robust Mean Estimation Under Mean-Shift Contamination

classification 💻 cs.DS cs.LGmath.STstat.MLstat.TH
keywords contaminationmeanmean-shiftestimationrobustalphaalgorithmdrawn
0
0 comments X
read the original abstract

We study the algorithmic problem of robust mean estimation of an identity covariance Gaussian in the presence of mean-shift contamination. In this contamination model, we are given a set of points in $\mathbb{R}^d$ generated i.i.d. via the following process. For a parameter $\alpha<1/2$, the $i$-th sample $x_i$ is obtained as follows: with probability $1-\alpha$, $x_i$ is drawn from $\mathcal{N}(\mu, I)$, where $\mu \in \mathbb{R}^d$ is the target mean; and with probability $\alpha$, $x_i$ is drawn from $\mathcal{N}(z_i, I)$, where $z_i$ is unknown and potentially arbitrary. Prior work characterized the information-theoretic limits of this task. Specifically, it was shown that, in contrast to Huber contamination, in the presence of mean-shift contamination consistent estimation is possible. On the other hand, all known robust estimators in the mean-shift model have running times exponential in the dimension. Here we give the first computationally efficient algorithm for high-dimensional robust mean estimation with mean-shift contamination that can tolerate a constant fraction of outliers. In particular, our algorithm has near-optimal sample complexity, runs in sample-polynomial time, and approximates the target mean to any desired accuracy. Conceptually, our result contributes to a growing body of work that studies inference with respect to natural noise models lying in between fully adversarial and random settings.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Adaptive Confidence Intervals in Efron's Gaussian Two-Groups Model

    math.ST 2026-04 unverdicted novelty 8.0

    In Efron's two-groups model, adaptive confidence intervals for the null location have minimax length of order σ(n^{-1/4} + ε^{1/2}/sqrt(log(enε²))) when ε unknown and σ known, degrading to Ω(σ n^{-1/8}) when σ unknown...