Asymptotic minimaxity of False Discovery Rate thresholding for sparse exponential data

David Donoho; Jiashun Jin

We apply FDR thresholding to a non-Gaussian vector whose coordinates $X_i$, $i=1,\dots,n$, are independent exponential with individual means $\mu_i$. The vector $\mu=(\mu_i)$ is thought to be sparse, with most coordinates 1 but a small fraction significantly larger than 1; roughly, most coordinates are simply 'noise,' but a small fraction contain 'signal.' We measure risk by per-coordinate mean-squared error in recovering $\log(\mu_i)$, and study minimax estimation over parameter spaces defined by constraints on the per-coordinate p-norm of $\log(\mu_i)$: $\frac{1}{n}\sum_{i=1}^n\log^p(\mu_i)\leq \eta^p$. We show for large $n$ and small $\eta$ that FDR thresholding can be nearly minimax. The FDR control parameter $0<q<1$ plays an important role: when $q\leq 1/2$, the FDR estimator is nearly minimax, while choosing a fixed $q>1/2$ prevents near minimaxity. These conclusions mirror those found in the Gaussian case in Abramovich et al. [Ann. Statist. 34 (2006) 584–653]. The techniques developed here seem applicable to a wide range of other distributional assumptions, other loss measures and non-i.i.d. dependency structures.
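As a rough illustration of the kind of procedure the abstract describes, the sketch below applies a Benjamini-Hochberg step-up rule to exponential data and then thresholds. It rests on assumptions that the abstract does not spell out: the null p-value of $X_i$ is taken to be $e^{-X_i}$ (the tail probability of a mean-1 exponential), and the estimate of $\log(\mu_i)$ is taken to be $\log(X_i)$ for coordinates above the threshold and 0 otherwise. The function names `fdr_threshold` and `fdr_estimate_log_mu` are hypothetical, and the estimator analysed in the paper may differ in detail.

```python
import math


def fdr_threshold(x, q):
    """Benjamini-Hochberg step-up threshold for exponential data.

    Assumption: under the null, X_i ~ Exp(mean 1), so the p-value of X_i
    is exp(-X_i). The BH condition p_(k) <= q*k/n is then equivalent to
    X_(k) >= log(n / (q*k)), where X_(k) is the k-th largest observation.
    """
    n = len(x)
    xs = sorted(x, reverse=True)
    k_hat = 0
    for k in range(1, n + 1):
        if xs[k - 1] >= math.log(n / (q * k)):
            k_hat = k  # keep the largest k that satisfies the condition
    if k_hat == 0:
        return math.inf  # nothing is declared a discovery
    return math.log(n / (q * k_hat))


def fdr_estimate_log_mu(x, q):
    """Hard-threshold estimate of log(mu_i) (an assumed form of the estimator).

    Coordinates at or above the FDR threshold keep log(X_i);
    all others are set to 0, i.e. mu_i is estimated as 1.
    """
    t = fdr_threshold(x, q)
    return [math.log(xi) if xi >= t else 0.0 for xi in x]


# Small hand-made example: one clear 'signal' among three 'noise' values.
data = [0.1, 0.2, 10.0, 0.5]
estimates = fdr_estimate_log_mu(data, q=0.5)
print(estimates)
```

With $q=1/2$ and $n=4$, the first comparison level is $\log(4/0.5)=\log 8\approx 2.08$, so only the observation 10.0 survives and the other three coordinates are estimated as pure noise.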
