arxiv: 1301.6685 · v2 · pith:R4S4TH3Wnew · submitted 2013-01-23 · 💻 cs.LG · stat.ML

Fast Learning from Sparse Data

classification 💻 cs.LG stat.ML
keywords: data, algorithm, algorithms, clustering, decision, learning, models, naive-bayes

We describe two techniques that significantly improve the running time of several standard machine-learning algorithms when data is sparse. The first technique is an algorithm that efficiently extracts one-way and two-way counts, either real or expected, from discrete data. Extracting such counts is a fundamental step in learning algorithms for constructing a variety of models, including decision trees, decision graphs, Bayesian networks, and naive-Bayes clustering models. The second technique is an algorithm that efficiently performs the E-step of the EM algorithm (i.e., inference) when applied to a naive-Bayes clustering model. Using real-world data sets, we demonstrate a dramatic decrease in running time for algorithms that incorporate these techniques.
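
The abstract does not spell out the counting algorithm, so the following is only a minimal sketch of one common way to exploit sparsity when collecting feature-by-class counts, not necessarily the paper's method. It assumes each record stores only its non-default feature values (here as a dict) and that every feature has a single default state, `0`. The function names `sparse_class_counts` and `lookup_count` are hypothetical. Counts for the default state are never tallied directly; they are recovered by subtracting the observed non-default counts from the class total.

```python
from collections import Counter, defaultdict


def sparse_class_counts(records, labels):
    """Tally (feature, value, label) counts by touching only non-default entries.

    records: list of dicts mapping feature name -> non-default value
             (features absent from a dict are assumed to be in the default state).
    labels:  list of class labels, one per record.
    """
    class_totals = Counter(labels)          # number of records per class
    nondefault = defaultdict(Counter)       # feature -> Counter[(value, label)]
    for rec, label in zip(records, labels):
        for feat, val in rec.items():       # cost is proportional to non-default entries
            nondefault[feat][(val, label)] += 1
    return class_totals, nondefault


def lookup_count(class_totals, nondefault, feat, val, label, default=0):
    """Return the number of records with feat == val and the given label."""
    per_feature = nondefault.get(feat, Counter())
    if val != default:
        return per_feature[(val, label)]
    # Default-state count = class total minus all non-default observations.
    seen = sum(n for (_, y), n in per_feature.items() if y == label)
    return class_totals[label] - seen


if __name__ == "__main__":
    records = [{"a": 1}, {}, {"a": 1, "b": 2}, {"b": 2}]
    labels = ["x", "x", "y", "y"]
    totals, nd = sparse_class_counts(records, labels)
    print(lookup_count(totals, nd, "a", 0, "x"))
```

Under these assumptions the tallying pass does work proportional to the number of non-default entries rather than to records times features, which is where the savings on sparse data would come from.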
