Fast Learning from Sparse Data

David Heckerman; David Maxwell Chickering

arxiv: 1301.6685 · v2 · pith:R4S4TH3Wnew · submitted 2013-01-23 · 💻 cs.LG · stat.ML



keywords: data, algorithm, algorithms, clustering, decision, learning, models, naive-bayes


We describe two techniques that significantly improve the running time of several standard machine-learning algorithms when data is sparse. The first technique is an algorithm that efficiently extracts one-way and two-way counts (either real or expected) from discrete data. Extracting such counts is a fundamental step in learning algorithms for constructing a variety of models, including decision trees, decision graphs, Bayesian networks, and naive-Bayes clustering models. The second technique is an algorithm that efficiently performs the E-step of the EM algorithm (i.e., inference) when applied to a naive-Bayes clustering model. Using real-world data sets, we demonstrate a dramatic decrease in running time for algorithms that incorporate these techniques.
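The abstract does not spell out how the counts are extracted, so the following is only an illustrative sketch of one common way to exploit sparsity, not the authors' algorithm. It assumes each record stores only the variables that differ from a default state (0 here), tallies counts for the non-default states alone, and recovers any count involving the default state by subtraction. The function names `sparse_counts`, `count_one`, and `count_two` are hypothetical.

```python
from collections import Counter
from itertools import combinations

DEFAULT = 0  # assumed default state; variables absent from a row take this value


def sparse_counts(rows):
    """Tally one-way and two-way counts over non-default states only.

    rows: list of dicts mapping variable name -> non-default state.
    Work per row is quadratic in its number of non-default entries,
    not in the total number of variables.
    """
    one = Counter()  # (var, state) -> count
    two = Counter()  # ((var_a, state_a), (var_b, state_b)) -> count, var_a < var_b
    for row in rows:
        items = sorted(row.items())
        for item in items:
            one[item] += 1
        for a, b in combinations(items, 2):
            two[(a, b)] += 1
    return len(rows), one, two


def count_one(n, one, var, state):
    """One-way count N(var = state); the default state is derived by subtraction."""
    if state != DEFAULT:
        return one[(var, state)]
    return n - sum(c for (v, _), c in one.items() if v == var)


def count_two(n, one, two, va, sa, vb, sb):
    """Two-way count N(va = sa, vb = sb), deriving default-state cells by subtraction."""
    if va > vb:  # keep pairs in the canonical (sorted) order used when tallying
        va, sa, vb, sb = vb, sb, va, sa
    pair = [(k, c) for k, c in two.items() if k[0][0] == va and k[1][0] == vb]
    if sa != DEFAULT and sb != DEFAULT:
        return two[((va, sa), (vb, sb))]
    if sa != DEFAULT:  # vb is in its default state
        return one[(va, sa)] - sum(c for k, c in pair if k[0][1] == sa)
    if sb != DEFAULT:  # va is in its default state
        return one[(vb, sb)] - sum(c for k, c in pair if k[1][1] == sb)
    # both default: inclusion-exclusion over the non-default tallies
    a_nd = n - count_one(n, one, va, DEFAULT)
    b_nd = n - count_one(n, one, vb, DEFAULT)
    return n - a_nd - b_nd + sum(c for _, c in pair)


rows = [{"a": 1, "b": 2}, {"a": 1}, {}, {"b": 2}]
n, one, two = sparse_counts(rows)
```

In this sketch the default-state cells are never touched while scanning the data, which is where the savings on sparse inputs would come from under the stated assumption.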
