arXiv: 0710.3058 · v1 · submitted 2007-10-16 · physics.soc-ph

Taxonomy and clustering in collaborative systems: the case of the on-line encyclopedia Wikipedia

classification: physics.soc-ph
keywords: clustering, case, community, division, encyclopedia, given, nature, on-line

In this paper we investigate the relation between imposed classifications and actual clustering in a particular scale-free network, the on-line encyclopedia Wikipedia. We find that the distributions of community sizes are statistically similar whether they are obtained top-down, from the division into categories present in the archive, or bottom-up, from a community detection algorithm based on the spectral properties of the graph. Despite this statistically similar behaviour, the two methods provide rather different divisions of the articles, signaling that the presence of power laws is a general feature of these systems and cannot be used as a benchmark to evaluate the suitability of a clustering method.
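
The central point of the abstract, that two partitions can share the same community-size distribution and still divide the articles differently, can be illustrated with a toy example. The sketch below is not the paper's method: the article names, the two label assignments, and the helper functions `size_distribution` and `rand_index` are hypothetical, and the Rand index is used here only as one simple, standard way to compare two partitions.

```python
from collections import Counter
from itertools import combinations


def size_distribution(labels):
    """Return the sorted list of community sizes for a node -> label mapping."""
    return sorted(Counter(labels.values()).values())


def rand_index(a, b):
    """Fraction of node pairs that both partitions treat the same way
    (both place the pair together, or both place it apart)."""
    agree = 0
    total = 0
    for u, v in combinations(sorted(a), 2):
        same_in_a = a[u] == a[v]
        same_in_b = b[u] == b[v]
        agree += same_in_a == same_in_b
        total += 1
    return agree / total


# Hypothetical toy data: six articles, two ways of grouping them.
# "categories" stands in for a top-down, editor-assigned classification.
categories = {"a1": "X", "a2": "X", "a3": "X", "a4": "X", "a5": "Y", "a6": "Y"}
# "detected" stands in for the output of a bottom-up community detection run.
detected = {"a1": 0, "a2": 0, "a3": 1, "a4": 1, "a5": 1, "a6": 1}

sizes_top_down = size_distribution(categories)
sizes_bottom_up = size_distribution(detected)
agreement = rand_index(categories, detected)

print("top-down sizes: ", sizes_top_down)
print("bottom-up sizes:", sizes_bottom_up)
print("pairwise agreement (Rand index):", round(agreement, 3))
```

In this toy case both groupings consist of one community of four articles and one of two, so their size distributions coincide, yet the two partitions agree on fewer than half of the article pairs. Matching size statistics alone therefore say little about whether two divisions actually coincide.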