
arxiv: cs/0509033 · v1 · submitted 2005-09-13 · 💻 cs.AI

K-Histograms: An Efficient Clustering Algorithm for Categorical Dataset

keywords algorithm · clustering · categorical · data · k-histogram · efficient · histograms · results

Clustering categorical data is an integral part of data mining and has attracted much attention recently. In this paper, we present k-histogram, a new efficient algorithm for clustering categorical data. The k-histogram algorithm extends the k-means algorithm to the categorical domain by replacing the means of clusters with histograms, and it dynamically updates the histograms during the clustering process. Experimental results on real datasets show that the k-histogram algorithm can produce better clustering results than the k-modes algorithm, the algorithm most closely related to our work.
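
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the dissimilarity measure, the initialization, or the stopping rule, so all of these are assumptions here. The sketch assumes each cluster keeps one value-count histogram per attribute, that the distance from a record to a cluster is the sum over attributes of one minus the relative frequency of the record's value, that the first `k` records seed the clusters (so `k` must not exceed the number of records), and that histograms are updated immediately whenever a record changes cluster. The names `dissimilarity` and `k_histogram` are hypothetical.

```python
from collections import Counter


def dissimilarity(record, hists, size):
    """Assumed measure: sum over attributes of (1 - relative frequency of the value)."""
    if size == 0:
        # An empty cluster is treated as maximally dissimilar.
        return float(len(record))
    return sum(1.0 - h[v] / size for h, v in zip(hists, record))


def k_histogram(records, k, max_iter=100):
    """Cluster tuples of categorical values; returns (labels, per-cluster histograms)."""
    n_attrs = len(records[0])
    # hists[c][a] counts the values of attribute a among the members of cluster c.
    hists = [[Counter() for _ in range(n_attrs)] for _ in range(k)]
    sizes = [0] * k
    labels = [None] * len(records)

    def add(c, rec):
        for h, v in zip(hists[c], rec):
            h[v] += 1
        sizes[c] += 1

    def remove(c, rec):
        for h, v in zip(hists[c], rec):
            h[v] -= 1
        sizes[c] -= 1

    # Assumed initialization: the first k records each seed one cluster.
    for c in range(k):
        labels[c] = c
        add(c, records[c])

    for _ in range(max_iter):
        moved = False
        for i, rec in enumerate(records):
            best = min(range(k), key=lambda c: dissimilarity(rec, hists[c], sizes[c]))
            if best != labels[i]:
                # Dynamic update: histograms change as soon as a record moves.
                if labels[i] is not None:
                    remove(labels[i], rec)
                add(best, rec)
                labels[i] = best
                moved = True
        if not moved:
            break  # No record changed cluster in a full pass.
    return labels, hists
```

Calling `k_histogram(records, k)` on a list of equal-length tuples returns one cluster label per record together with the final histograms. Compared with k-modes, which summarizes each cluster by a single most frequent value per attribute, a histogram retains the full value distribution, which is the property the abstract credits for the improved results.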
