
arXiv: 1301.0858 · v1 · submitted 2013-01-05 · stat.ML


A New Geometric Approach to Latent Topic Modeling and Discovery

keywords: algorithm, approaches, discovery, latent, topic, applied, approach, approximations

A new geometrically-motivated algorithm for nonnegative matrix factorization is developed and applied to the discovery of latent "topics" for text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme points of empirical cross-document word-frequencies that correspond to novel "words" unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.
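The geometric idea in the abstract can be illustrated on a toy, noise-free example. The sketch below is not the paper's algorithm: the sizes `K`, `W`, `M`, the synthetic matrices `beta` and `theta`, the helper `extreme_rows`, and the use of random projections to locate extreme points are all assumptions chosen for illustration. It only shows that, when each topic has a word that appears in no other topic, the row-normalized word-by-document frequencies of those novel words are the extreme points of the set of all word rows.

```python
import numpy as np

rng = np.random.default_rng(0)
K, W, M = 3, 30, 200  # hypothetical toy sizes: topics, vocabulary, documents

# Topic matrix beta (W x K). Words 0..K-1 are "novel": each is nonzero in
# exactly one topic. The remaining words are shared across all topics.
novel = np.eye(K)
shared = rng.uniform(0.2, 1.0, size=(W - K, K))
beta = np.vstack([novel, shared])
beta /= beta.sum(axis=0, keepdims=True)  # each topic is a distribution over words

# Topic weights theta (K x M): one Dirichlet draw per document.
theta = rng.dirichlet(0.3 * np.ones(K), size=M).T

# Noise-free expected word frequencies, then row-normalized so that every
# word row sums to 1 across documents.
X = beta @ theta
X_bar = X / X.sum(axis=1, keepdims=True)


def extreme_rows(points, n_directions, rng):
    """Return indices of rows that maximize at least one random linear functional.

    A linear functional over a finite point set attains its maximum at an
    extreme point of the convex hull, so every returned index is a vertex.
    """
    directions = rng.standard_normal((points.shape[1], n_directions))
    winners = (points @ directions).argmax(axis=0)
    return np.unique(winners)


candidates = extreme_rows(X_bar, 200, rng)
print("candidate novel words:", candidates)
```

In this idealized setting every shared word's row is a strict convex combination of the novel-word rows, so only novel words can win a projection. With real, finite-length documents the frequencies are noisy and several novel words per topic may exist, which is where the robust detection and clustering steps mentioned in the abstract come in; this sketch does not attempt either.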
