
arxiv: 1806.06945 · v2 · pith:OJCI6TX4new · submitted 2018-06-18 · stat.ML · cs.LG · math.ST · stat.TH

Overlapping Clustering Models, and One (class) SVM to Bind Them All

classification stat.ML cs.LG math.ST stat.TH
keywords belong models multiple overlapping clustering datasets exemplars person

People belong to multiple communities, words belong to multiple topics, and books cover multiple genres; overlapping clusters are commonplace. Many existing overlapping clustering methods model each person (or word, or book) as a non-negative weighted combination of "exemplars" who belong solely to one community, with some small noise. Geometrically, each person is a point on a cone whose corners are these exemplars. This basic form encompasses the widely used Mixed Membership Stochastic Blockmodel of networks (Airoldi et al., 2008) and its degree-corrected variants (Jin et al., 2017), as well as topic models such as LDA (Blei et al., 2003). We show that a simple one-class SVM yields provably consistent parameter inference for all such models, and scales to large datasets. Experimental results on several simulated and real datasets show our algorithm (called SVM-cone) is both accurate and scalable.
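The cone geometry described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's SVM-cone implementation and does not run an SVM solver. The exemplar matrix `corners`, the Dirichlet weights, and the scaling factors are all hypothetical choices made for illustration. The sketch builds noiseless points as non-negative combinations of exemplars, normalizes each point to unit length, and checks that a hyperplane through the normalized exemplars leaves every other point strictly on its far side. That is the kind of boundary a one-class SVM looks for, which suggests why its support vectors can point to the exemplars.

```python
import numpy as np

rng = np.random.default_rng(0)

K, n = 3, 200  # number of communities and of points (hypothetical sizes)

# Hypothetical exemplars: each row is a "pure" member of one community.
corners = np.array([[1.0, 0.2, 0.1],
                    [0.1, 1.0, 0.3],
                    [0.2, 0.1, 1.0]])

# Non-negative weights: a mixed membership (Dirichlet) times a positive
# per-point scale, so points lie in a cone rather than on a simplex.
weights = rng.dirichlet(np.ones(K), size=n) * rng.uniform(0.5, 2.0, size=(n, 1))
weights[:K] = np.eye(K)  # make the first K points the exemplars themselves

# Noiseless data: every point is a non-negative combination of exemplars.
points = weights @ corners


def normalize_rows(x):
    """Scale each row to unit Euclidean norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


unit_points = normalize_rows(points)
unit_corners = normalize_rows(corners)

# Hyperplane {z : w @ z = 1} passing through all normalized exemplars.
# This closed form uses the known corners and is for illustration only.
w = np.linalg.solve(unit_corners, np.ones(K))

# By the triangle inequality, every normalized point satisfies w @ z >= 1,
# with equality only for pure points.
margins = unit_points @ w
candidates = np.flatnonzero(np.isclose(margins, 1.0, rtol=0.0, atol=1e-9))

# With the corners in hand, the weights follow from a linear solve.
recovered = np.linalg.solve(corners.T, points.T).T

print("points on the hyperplane:", candidates)
print("smallest margin:", margins.min())
```

In this noiseless setting the points lying on the hyperplane are exactly the exemplars placed in the first `K` rows. With noisy data the hyperplane would have to be estimated from the points themselves, which is where a one-class SVM comes in.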
