A Split-Merge MCMC Algorithm for the Hierarchical Dirichlet Process
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
The hierarchical Dirichlet process (HDP) has become an important Bayesian nonparametric model for grouped data, such as document collections. The HDP is used to construct a flexible mixed-membership model in which the number of components is determined by the data. As with most Bayesian nonparametric models, exact posterior inference is intractable, so practitioners use Markov chain Monte Carlo (MCMC) or variational inference. Inspired by the split-merge MCMC algorithm for the Dirichlet process (DP) mixture model, we describe a novel split-merge MCMC sampling algorithm for posterior inference in the HDP. We study its properties on both synthetic data and text corpora. We find that split-merge MCMC for the HDP can provide significant improvements over traditional Gibbs sampling, and we give some understanding of the data properties that give rise to larger improvements.
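For orientation, the mixed-membership structure described in the abstract can be sketched with a truncated (weak-limit) approximation to the HDP prior. This is an illustrative sketch only and is not the paper's split-merge sampler. The function name `sample_hdp_assignments`, the truncation level `K`, and the concentration values `gamma` and `alpha` are assumptions chosen for the example.

```python
import numpy as np

def sample_hdp_assignments(n_groups=3, n_per_group=50, K=20,
                           gamma=1.0, alpha=5.0, seed=0):
    """Draw component assignments from a truncated HDP prior (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # Global weights: stick-breaking with concentration gamma, truncated at K.
    v = rng.beta(1.0, gamma, size=K)
    v[-1] = 1.0  # close the last stick so the weights sum to one
    beta = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

    # Group weights: each group reweights the shared components.
    # The small constant keeps every Dirichlet parameter strictly positive.
    pi = rng.dirichlet(alpha * beta + 1e-12, size=n_groups)

    # Each observation in group j picks one of the shared components.
    z = np.stack([rng.choice(K, size=n_per_group, p=pi[j])
                  for j in range(n_groups)])
    return beta, pi, z

beta, pi, z = sample_hdp_assignments()
# Components actually used, out of the K available under the truncation.
n_used = len(np.unique(z))
```

Because all groups draw from the same global weights `beta`, components are shared across groups, while each group's proportions `pi[j]` differ. The number of components actually used (`n_used`) is typically well below `K`, which loosely mirrors the idea that the data, rather than a fixed setting, determine the number of components.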
fields
cs.CL 1
years
2019 1
verdicts
UNVERDICTED 1
representative citing papers
Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization
Introduces a fuzzy hypergraph using topical sentence representations and submodular optimization to select sentences maximizing query relevance, centrality, and topic coverage for extractive summarization.