Empirical Null Estimation using Discrete Mixture Distributions and its Application to Protein Domain Data

DoHwan Park; Iris Ivy Gauran; Johan Lim; John Spouge; John Zylstra; Junyong Park; Maricel Kann; Thomas Peterson

arXiv: 1608.07204 · v1 · pith:VP4J7FGOnew · submitted 2016-08-25 · stat.ME

Iris Ivy Gauran, Junyong Park, Johan Lim, DoHwan Park, John Zylstra, Thomas Peterson, Maricel Kann, John Spouge

keywords: value, domain, mutation, null, protein, consider, counts, cut-off

In recent mutation studies, analyses based on protein domain positions are gaining popularity over gene-centric approaches, since the latter are limited in how well they account for the functional context that the position of a mutation provides. This presents a large-scale simultaneous inference problem, with hundreds of hypothesis tests to consider at the same time. This paper aims to select significant mutation counts while controlling the Type I error at a given level via False Discovery Rate (FDR) procedures. One main assumption is that there exists a cut-off value such that counts smaller than this value are generated from the null distribution. We present several data-dependent methods to determine the cut-off value. We also consider a two-stage procedure based on a screening process, in which mutation counts exceeding a certain value are considered significant. Simulated and protein domain data sets are used to illustrate this procedure in estimating the empirical null using a mixture of discrete distributions.
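
The general idea in the abstract (estimate a null from counts below a cut-off, then flag large counts under FDR control) can be illustrated with a minimal sketch. Several choices here are assumptions and are not taken from the paper: a single Poisson null stands in for the paper's mixture of discrete distributions, the cut-off is supplied by hand rather than chosen by a data-dependent method, and the Benjamini-Hochberg step-up rule is used as the FDR procedure. The function names (`fit_null_rate`, `upper_tail`, `select_significant`) are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability P(X = k), computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def truncated_mean(lam, cutoff):
    """Mean of a Poisson(lam) variable conditioned on X < cutoff."""
    # exp(-lam) cancels in the ratio, so unnormalised weights suffice.
    weights = [math.exp(k * math.log(lam) - math.lgamma(k + 1))
               for k in range(cutoff)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def fit_null_rate(counts, cutoff):
    """Truncated-Poisson MLE of the null rate, using only counts < cutoff.

    Assumption (not from the paper): the null is a single Poisson.
    """
    below = [x for x in counts if x < cutoff]
    if not below:
        raise ValueError("no counts fall below the cut-off")
    target = sum(below) / len(below)
    lo, hi = 1e-9, 100.0  # assumed search bracket for the rate
    for _ in range(200):  # bisection: truncated_mean increases with lam
        mid = (lo + hi) / 2
        if truncated_mean(mid, cutoff) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def upper_tail(x, lam, terms=500):
    """P(X >= x) under Poisson(lam), summed directly over the tail."""
    return min(1.0, sum(poisson_pmf(k, lam) for k in range(x, x + terms)))


def select_significant(counts, cutoff, q=0.05):
    """Indices of counts flagged by Benjamini-Hochberg at level q."""
    lam = fit_null_rate(counts, cutoff)
    pvals = [upper_tail(x, lam) for x in counts]
    order = sorted(range(len(counts)), key=lambda i: pvals[i])
    m = len(counts)
    n_reject = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value passes the step-up bound.
        if pvals[i] <= rank * q / m:
            n_reject = rank
    return sorted(order[:n_reject])


# Illustrative data only: 100 small counts plus three large ones.
counts = [0] * 37 + [1] * 37 + [2] * 18 + [3] * 6 + [4] * 2 + [12, 15, 20]
print(fit_null_rate(counts, cutoff=5))
print(select_significant(counts, cutoff=5, q=0.05))
```

Fitting on the truncated sample rather than on all counts keeps the large, possibly non-null counts from inflating the estimated null rate, which is the motivation for a cut-off assumption of this kind. Because the p-values are discrete, the Benjamini-Hochberg rule in this sketch tends to be conservative.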
