
arxiv: 1407.1176 · v1 · pith:2UVPKZGKnew · submitted 2014-07-04 · 📊 stat.ML · cs.LG

Identifying Higher-order Combinations of Binary Features

keywords: approach, binary, datasets, hypotheses, statistically, terada, address, benchmark

Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings, due to the combinatorial explosion in the number of hypotheses. Terada et al. recently showed how to elegantly address this multiple testing problem by excluding non-testable hypotheses. Still, it remains unclear how their approach scales to large datasets. Here we propose strategies to speed up the approach by Terada et al. and evaluate them thoroughly on 11 real-world benchmark datasets. We observe that one approach, incremental search with early stopping, is orders of magnitude faster than the current state-of-the-art approach.
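
To make the idea of "excluding non-testable hypotheses" concrete, here is a minimal Python sketch of a Tarone-style testability correction. It is an illustration only, not the implementation evaluated in the paper, and it does not include incremental search with early stopping. It assumes a one-sided Fisher's exact test; the function names `min_p_value` and `tarone_factor` and the example supports are hypothetical. A feature combination that occurs in `x` of `n` samples, `n1` of which are positive, has a smallest attainable p-value, and a combination whose smallest attainable p-value exceeds the corrected threshold can never be significant, so it need not be counted in the correction.

```python
from math import comb


def min_p_value(x, n1, n):
    """Smallest attainable one-sided Fisher p-value for a combination
    present in x of n samples, when n1 of the n samples are positive.

    The most extreme 2x2 table puts as many of the x occurrences as
    possible in the positive class; its hypergeometric probability is
    the minimum p-value.
    """
    a_max = min(x, n1)
    return comb(n1, a_max) * comb(n - n1, x - a_max) / comb(n, x)


def tarone_factor(supports, n1, n, alpha=0.05):
    """Return the smallest k such that at most k hypotheses are
    testable at level alpha / k, plus the corrected threshold."""
    p_mins = [min_p_value(x, n1, n) for x in supports]
    k = 1
    # The loop ends at the latest when k reaches len(p_mins).
    while sum(p <= alpha / k for p in p_mins) > k:
        k += 1
    return k, alpha / k


# Hypothetical example: 5 candidate combinations in 20 samples, 10 positive.
supports = [1, 2, 5, 8, 10]
k, threshold = tarone_factor(supports, n1=10, n=20)
print(k, threshold)
```

In this toy example the correction factor is 3 rather than the Bonferroni factor of 5, because the two rarest combinations cannot reach significance and are left out of the count.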
