Clustering Mixed Datasets Using Homogeneity Analysis with Applications to Big Data

Rajiv Sambasivan; Sourish Das

arxiv: 1608.04961 · v3 · pith:TUY4BZFGnew · submitted 2016-08-17 · 📊 stat.ML


classification 📊 stat.ML

keywords analysis · datasets · data · homogeneity · approach · attributes · categorical · clustering


Datasets with a mixture of numerical and categorical attributes are routinely encountered in many application domains. In this work we examine an approach to clustering such datasets using homogeneity analysis. Homogeneity analysis determines a Euclidean representation of the data, which can then be analyzed with the large body of tools and techniques available for data with a Euclidean representation. Experiments conducted as part of this study suggest that this approach can be useful in the analysis and exploration of big datasets with a mixture of numerical and categorical attributes.
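As a rough illustration of the idea in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the indicator-matrix formulation of homogeneity analysis (equivalent to multiple correspondence analysis), and it assumes that numerical attributes are first binned by quantiles; the abstract does not say how numerical attributes are handled. The function names (`bin_numeric`, `indicator_matrix`, `object_scores`) and the toy data are hypothetical. The sign split at the end is only a stand-in for a Euclidean clustering method such as k-means.

```python
import numpy as np

def bin_numeric(x, n_bins=2):
    """Assumption: discretize a numerical attribute into quantile bins."""
    x = np.asarray(x, dtype=float)
    cuts = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(x, cuts)

def indicator_matrix(columns):
    """One-hot encode each categorical column and stack the blocks side by side."""
    blocks = []
    for col in columns:
        col = np.asarray(col)
        categories = np.unique(col)
        blocks.append((col[:, None] == categories[None, :]).astype(float))
    return np.hstack(blocks), len(blocks)

def object_scores(columns, n_dims=2):
    """Euclidean object scores from the SVD of the scaled, centered indicator matrix."""
    G, m = indicator_matrix(columns)
    n = G.shape[0]
    # Scale each category column by 1/sqrt(frequency) and by 1/sqrt(number of variables).
    Z = G / np.sqrt(G.sum(axis=0)) / np.sqrt(m)
    # Centering removes the trivial constant solution.
    Z -= Z.mean(axis=0)
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    # Normalize so the scores satisfy X'X = n * I.
    return np.sqrt(n) * U[:, :n_dims]

# Hypothetical toy data: two categorical attributes and one numerical attribute.
color = ["red", "red", "red", "red", "blue", "blue", "blue", "blue"]
shape = ["x", "y", "x", "y", "x", "y", "x", "y"]
weight = [1.0, 1.2, 0.9, 1.1, 5.0, 5.5, 4.8, 5.2]

scores = object_scores([color, shape, bin_numeric(weight)], n_dims=2)

# Stand-in for a clustering step: split on the sign of the first dimension.
labels = (scores[:, 0] > 0).astype(int)
print(labels)
```

In this toy example `color` and the binned `weight` induce the same partition of the objects, so the first dimension of the scores separates the first four objects from the last four. Which group receives label 0 depends on the arbitrary sign of the singular vector.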
