
arXiv: 1405.1250 · v1 · submitted 2014-05-06 · stat.CO · cs.NA · math.NA

New tight approximations for Fisher's exact test

keywords: approximations, data, fisher, test, accurate, approximate, approximation, bounds

Fisher's exact test is often a preferred method for estimating the significance of statistical dependence. However, in large data sets the test is usually too computationally laborious to apply, especially in an exhaustive search (data mining). The traditional solution is to approximate the significance with the $\chi^2$-measure, but the accuracy is often unacceptable. As a solution, we introduce a family of upper bounds, which are fast to calculate and approximate Fisher's $p$-value accurately. In addition, unlike the $\chi^2$-based approximation, the new approximations are not sensitive to the data size, the distribution, or the smallest expected counts. According to both theoretical and experimental analysis, the new approximations produce accurate results for all sufficiently strong dependencies. The basic form of the approximation can fail with weak dependencies, but the general form of the upper bounds can be adjusted to be arbitrarily accurate.
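To make the comparison in the abstract concrete, the following is a minimal Python sketch of the two baseline quantities it mentions: the exact one-sided Fisher $p$-value for a 2×2 table and the $\chi^2$-based approximation. It does not implement the paper's upper bounds, whose form is not given in the abstract. The function names `fisher_one_sided` and `chi2_p_value`, the table layout `[[a, b], [c, d]]`, and the assumption that all margins are nonzero are illustrative choices, not taken from the paper.

```python
from math import comb, erfc, sqrt


def fisher_one_sided(a, b, c, d):
    """Exact one-sided p-value (positive dependence) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1 = a + b          # first row total
    col1 = a + c          # first column total
    upper = min(row1, col1)
    # Sum the hypergeometric probabilities of all tables with the same
    # margins whose top-left cell is at least as large as the observed a.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def chi2_p_value(a, b, c, d):
    """Chi-squared approximation (1 degree of freedom, two-sided).

    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 d.f.
    return erfc(sqrt(chi2 / 2))


if __name__ == "__main__":
    table = (3, 0, 0, 3)
    print("Fisher (exact, one-sided):", fisher_one_sided(*table))
    print("chi-squared approximation:", chi2_p_value(*table))
```

The exact sum needs up to `min(row1, col1) - a + 1` binomial terms, which is the cost that grows with data size. Note that the $\chi^2$ value here is two-sided while the Fisher tail is one-sided, so the two numbers are comparable only up to that convention.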
