
arXiv: 1810.11971 · v1 · pith:WTAO7VVGnew · submitted 2018-10-29 · stat.ML · cs.LG

Semi-crowdsourced Clustering with Deep Generative Models

keywords: model, clustering, data, deep, generative, inference, noisy, pairwise

We consider the semi-supervised clustering problem in which crowdsourcing provides noisy pairwise comparisons on a small subset of the data, i.e., whether a pair of samples belongs to the same cluster. We propose a new approach that combines a deep generative model (DGM), which characterizes low-level features of the data, with a statistical relational model for the noisy pairwise annotations on that subset. The two parts share latent variables. To let the model automatically trade off its complexity against its fit to the data, we also develop a fully Bayesian variant. Inference is handled by fast (natural-gradient) stochastic variational inference algorithms that combine variational message passing for the relational part with amortized learning of the DGM under a unified framework. Empirical results on synthetic and real-world datasets show that our model outperforms previous crowdsourced clustering methods.
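To make the problem setting concrete, the following is a minimal sketch of how noisy pairwise annotations on a small subset of the data might be simulated. It is not the paper's relational model: the symmetric per-worker accuracy, the function names `simulate_pairwise_annotations` and `majority_vote`, and the majority-vote baseline are all assumptions introduced here for illustration only.

```python
import random


def simulate_pairwise_annotations(labels, n_pairs, worker_accuracies, seed=0):
    """Simulate crowd answers to "are samples i and j in the same cluster?".

    Assumption (hypothetical noise model): each worker reports the true
    answer with a fixed probability and flips it otherwise.
    Returns a list of (i, j, worker_id, answer) tuples with answer in {0, 1}.
    """
    rng = random.Random(seed)
    n = len(labels)
    annotations = []
    for _ in range(n_pairs):
        # Draw a pair of distinct sample indices to be annotated.
        i, j = rng.sample(range(n), 2)
        truth = int(labels[i] == labels[j])
        for worker_id, accuracy in enumerate(worker_accuracies):
            # Keep the true answer with probability `accuracy`, else flip it.
            answer = truth if rng.random() < accuracy else 1 - truth
            annotations.append((i, j, worker_id, answer))
    return annotations


def majority_vote(annotations):
    """Aggregate answers per unordered pair by simple majority (ties give 0)."""
    votes = {}
    for i, j, _, answer in annotations:
        key = (min(i, j), max(i, j))
        votes.setdefault(key, []).append(answer)
    return {pair: int(2 * sum(v) > len(v)) for pair, v in votes.items()}
```

For example, `simulate_pairwise_annotations([0, 0, 1, 1], n_pairs=5, worker_accuracies=[0.9, 0.6, 0.8])` yields 15 annotation tuples covering only a few of the possible pairs. A majority vote treats all workers as equally reliable, which is the kind of simplification that a model of the annotation noise is meant to avoid.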
