
arxiv: 1710.04886 · v1 · pith:CM6O6SOZnew · submitted 2017-10-13 · ⚛️ physics.data-an · cs.DS

High Dimensional Cluster Analysis Using Path Lengths

classification ⚛️ physics.data-an cs.DS
keywords clustering, data, high, multiple, partitions, path, technique, techniques

A hierarchical scheme for clustering data is presented which applies to spaces with a high number of dimensions ($N_D > 3$). The data set is first reduced to a smaller set of partitions (multi-dimensional bins). Multiple clustering techniques are used, including spectral clustering; new techniques are also introduced, based on the path length between partitions that are connected to one another. A Line-Of-Sight algorithm is also developed for clustering. A test bank of 12 data sets with varying properties is used to expose the strengths and weaknesses of each technique. Finally, a robust clustering technique is discussed, based on reaching a consensus among the multiple approaches, which overcomes the weaknesses found in the individual techniques.
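
The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of reducing data to multi-dimensional bins and then grouping bins that are connected to one another. The function names `bin_points` and `cluster_bins`, the fixed `bin_width`, the choice to treat diagonal neighbours as connected, and the use of breadth-first hop counts as a stand-in for "path length" are all assumptions made for illustration, not details taken from the paper.

```python
from collections import deque
from itertools import product

import numpy as np


def bin_points(points, bin_width):
    """Map each point to an integer grid cell (a multi-dimensional bin).

    Hypothetical helper: a uniform grid is assumed here.
    """
    cells = np.floor(np.asarray(points, dtype=float) / bin_width).astype(int)
    return [tuple(int(v) for v in cell) for cell in cells]


def cluster_bins(occupied):
    """Label occupied bins by connected component.

    Returns (labels, hops): labels maps each bin to a cluster id, and hops
    gives the breadth-first path length (in bin steps) from the bin that
    seeded its cluster. Bins are treated as connected when every coordinate
    differs by at most one (an assumption; diagonals count as neighbours).
    """
    occupied = set(occupied)
    ndim = len(next(iter(occupied)))
    # All neighbour offsets except the zero offset (the bin itself).
    offsets = [o for o in product((-1, 0, 1), repeat=ndim) if any(o)]
    labels, hops = {}, {}
    next_label = 0
    for seed in sorted(occupied):
        if seed in labels:
            continue
        labels[seed], hops[seed] = next_label, 0
        queue = deque([seed])
        while queue:
            cell = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in occupied and nb not in labels:
                    labels[nb] = next_label
                    hops[nb] = hops[cell] + 1
                    queue.append(nb)
        next_label += 1
    return labels, hops


if __name__ == "__main__":
    # Four points in a 4-dimensional space: three nearby, one far away.
    pts = [[0.1] * 4, [1.2] * 4, [2.3, 1.1, 1.0, 1.5], [9.5] * 4]
    labels, hops = cluster_bins(bin_points(pts, bin_width=1.0))
    print(labels)
    print(hops)
```

In this sketch the number of neighbour offsets grows as $3^{N_D} - 1$, so a practical implementation for many dimensions would likely need a cheaper notion of connectivity; the breadth-first search is used here only because it yields hop counts between connected bins as a by-product.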
