arXiv: 1411.7013 · v3 · pith:7FA5GU3Cnew · submitted 2014-11-25 · stat.CO · stat.ME

k-POD: A Method for k-Means Clustering of Missing Data

keywords data · missing · clustering · means · when · applications · complete · method

The $k$-means algorithm is often used in clustering applications, but it requires a complete data matrix. Missing data, however, is common in many applications. Mainstream approaches to clustering missing data reduce the problem to a complete-data formulation through either deletion or imputation, but these solutions may incur significant costs. Our $k$-POD method is a simple extension of $k$-means clustering for missing data that works even when the missingness mechanism is unknown, when external information is unavailable, and when a significant fraction of the data is missing.
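The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how $k$-means can be extended to a partially observed data matrix without deleting rows or imputing from external information. It assumes missing entries are encoded as NaN, alternates ordinary $k$-means steps with refilling each missing entry from the centroid of its currently assigned cluster, and uses a column-mean initial fill. The function name `kmeans_missing_sketch`, its parameters, and the initialization scheme are all hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np


def kmeans_missing_sketch(X, k, n_iter=50, seed=0):
    """Cluster the rows of X (NaN marks a missing entry).

    Hypothetical helper for illustration: alternates k-means steps with
    refilling missing entries from the assigned cluster's centroid.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)

    # Initial fill with observed column means (an assumption of this sketch;
    # every column is assumed to have at least one observed value).
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(observed, X, col_means)

    # Start from k distinct rows of the filled matrix.
    centroids = X_filled[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)

    for _ in range(n_iter):
        # Assignment step: nearest centroid in squared Euclidean distance.
        dists = ((X_filled[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)

        # Update step: each non-empty cluster's centroid becomes its mean.
        for j in range(k):
            members = new_labels == j
            if members.any():
                centroids[j] = X_filled[members].mean(axis=0)

        # Fill-in step: observed entries are never changed; missing entries
        # take the value of the assigned centroid.
        X_filled = np.where(observed, X, centroids[new_labels])

        # Stop once the assignments no longer change.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, centroids, X_filled
```

A call such as `labels, centroids, X_filled = kmeans_missing_sketch(X, k=2)` returns the cluster assignments, the centroids, and a completed copy of the matrix in which observed values are untouched. Like ordinary $k$-means, this kind of alternating scheme can stop at a local optimum, so in practice several random starts (different `seed` values) would be worth comparing.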
