Identifying Higher-order Combinations of Binary Features

Felipe Llinares; Karsten M. Borgwardt; Mahito Sugiyama

arXiv: 1407.1176 · v1 · pith:2UVPKZGKnew · submitted 2014-07-04 · stat.ML · cs.LG



keywords: approach, binary, datasets, hypotheses, statistically, Terada, address, benchmark


Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings, due to the combinatorial explosion in the number of hypotheses. Terada et al. recently showed how to elegantly address this multiple testing problem by excluding non-testable hypotheses. Still, it remains unclear how their approach scales to large datasets. Here we propose strategies to speed up the approach by Terada et al. and evaluate them thoroughly on 11 real-world benchmark datasets. We observe that one approach, incremental search with early stopping, is orders of magnitude faster than the current state-of-the-art approach.
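
The idea of excluding non-testable hypotheses can be illustrated with a small sketch. It is not the implementation from the paper or from Terada et al., and it does not reproduce the incremental search with early stopping. It assumes a one-sided Fisher exact test and a Tarone-style correction, and the function names `min_attainable_pvalue` and `tarone_threshold` are hypothetical. A feature combination that occurs in only a few samples has a lower bound on its p-value. If that bound already exceeds the corrected significance level, the combination can never be significant and need not be counted in the multiple testing correction.

```python
from math import comb


def min_attainable_pvalue(support, n_pos, n_total):
    """Smallest one-sided Fisher exact p-value a combination can reach.

    support: number of samples in which the combination occurs
    n_pos:   number of samples in the positive class
    n_total: total number of samples
    Assumption: one-sided test, so the minimum is the hypergeometric
    probability of the most extreme table (as many occurrences as
    possible fall in the positive class).
    """
    a_max = min(support, n_pos)
    return (comb(n_pos, a_max) * comb(n_total - n_pos, support - a_max)
            / comb(n_total, support))


def tarone_threshold(supports, n_pos, n_total, alpha=0.05):
    """Tarone-style corrected threshold over candidate combinations.

    Finds the smallest k such that at most k hypotheses are testable
    at level alpha / k, i.e. have a minimum attainable p-value that
    does not exceed alpha / k. Returns (alpha / k, k).
    """
    p_mins = [min_attainable_pvalue(s, n_pos, n_total) for s in supports]
    k = 1
    while sum(p <= alpha / k for p in p_mins) > k:
        k += 1
    return alpha / k, k


if __name__ == "__main__":
    # Toy input (made up for illustration): four candidate combinations
    # with supports 1, 3, 5, 5 in 10 samples, 5 of them positive.
    threshold, k = tarone_threshold([1, 3, 5, 5], n_pos=5, n_total=10)
    print(threshold, k)
```

In the toy input only the two combinations with support 5 are testable, so the correction factor is 2 rather than the Bonferroni factor of 4. Enumerating all combinations up front, as this sketch does, is exactly what becomes infeasible at scale, which is the cost the paper's search strategies aim to reduce.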
