arXiv: 1711.07104 · v2 · pith:OYE6ACAY · submitted 2017-11-19 · stat.ML

A Double Parametric Bootstrap Test for Topic Models

classification: stat.ML
keywords: topic, test, bootstrap, corpora, data, double, likelihood, model
Non-negative matrix factorization (NMF) is a technique for finding latent representations of data. The method has been applied to corpora to construct topic models. However, NMF has likelihood assumptions which are often violated by real document corpora. We present a double parametric bootstrap test for evaluating the fit of an NMF-based topic model based on the duality of the KL divergence and Poisson maximum likelihood estimation. The test correctly identifies whether a topic model based on an NMF approach yields reliable results in simulated and real data.
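
The duality mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the paper's code: for a count matrix V and a reconstruction WH, the generalized KL divergence and the Poisson negative log-likelihood differ only by a term that depends on V alone, so minimizing one minimizes the other. The matrices `V`, `W_a`, `H_a`, `W_b`, `H_b` and all function names are hypothetical, chosen only for illustration.

```python
import math

def matmul(W, H):
    """Plain-Python matrix product, sufficient for a small illustration."""
    return [[sum(w * h for w, h in zip(row, col)) for col in zip(*H)] for row in W]

def generalized_kl(V, L):
    """Generalized KL divergence D(V || L) = sum(v*log(v/l) - v + l), with 0*log(0) = 0."""
    total = 0.0
    for v_row, l_row in zip(V, L):
        for v, l in zip(v_row, l_row):
            if v > 0:
                total += v * math.log(v / l)
            total += l - v
    return total

def poisson_nll(V, L):
    """Negative log-likelihood of counts V under independent Poisson(L) entries."""
    total = 0.0
    for v_row, l_row in zip(V, L):
        for v, l in zip(v_row, l_row):
            total += l - v * math.log(l) + math.lgamma(v + 1)
    return total

# Hypothetical 3-document x 4-term count matrix (assumption, not data from the paper).
V = [[3, 0, 1, 2], [0, 4, 1, 0], [2, 1, 0, 5]]

# Two arbitrary rank-2 factorizations with strictly positive entries.
W_a = [[1.0, 0.5], [0.2, 1.5], [2.0, 0.3]]
H_a = [[1.0, 0.2, 0.3, 2.0], [0.1, 2.5, 0.6, 0.2]]
W_b = [[0.8, 0.9], [1.1, 0.4], [0.5, 1.7]]
H_b = [[0.5, 1.5, 0.4, 0.3], [1.2, 0.3, 0.2, 2.4]]

L_a, L_b = matmul(W_a, H_a), matmul(W_b, H_b)

# The gap between the two objectives depends on V only, not on the factorization.
offset_a = poisson_nll(V, L_a) - generalized_kl(V, L_a)
offset_b = poisson_nll(V, L_b) - generalized_kl(V, L_b)
print(abs(offset_a - offset_b) < 1e-9)
```

Because the offset is the same for every factorization, a KL-fitted NMF can be read as a Poisson maximum likelihood fit, which is what makes simulating counts from the fitted model (a parametric bootstrap) a natural way to test it.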