
arxiv: 1201.1657 · v1 · pith:AX3YLHJ6new · submitted 2012-01-08 · 📊 stat.ML · cs.AI

A Split-Merge MCMC Algorithm for the Hierarchical Dirichlet Process

keywords mcmc data split-merge algorithm dirichlet inference model process

The hierarchical Dirichlet process (HDP) has become an important Bayesian nonparametric model for grouped data, such as document collections. The HDP is used to construct a flexible mixed-membership model where the number of components is determined by the data. As with most Bayesian nonparametric models, exact posterior inference is intractable; practitioners use Markov chain Monte Carlo (MCMC) or variational inference. Inspired by the split-merge MCMC algorithm for the Dirichlet process (DP) mixture model, we describe a novel split-merge MCMC sampling algorithm for posterior inference in the HDP. We study its properties on both synthetic data and text corpora. We find that split-merge MCMC for the HDP can provide significant improvements over traditional Gibbs sampling, and we give some understanding of the data properties that give rise to larger improvements.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization

    cs.CL 2019-06 unverdicted novelty 6.0

    Introduces a fuzzy hypergraph using topical sentence representations and submodular optimization to select sentences maximizing query relevance, centrality, and topic coverage for extractive summarization.