Discovering topics in text datasets by visualizing relevant words

Franziska Horn; Grégoire Montavon; Klaus-Robert Müller; Leila Arras; Wojciech Samek

arXiv: 1707.06100 · v1 · submitted 2017-07-18 · cs.CL


classification: cs.CL

keywords: documents, topics, contents, discovering, relevant, texts, visualizing, words


When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper we show how this can be achieved by using a clustering algorithm to identify topics in the dataset and then selecting and visualizing relevant words, which distinguish a group of documents from the rest of the texts, to summarize the contents of the documents belonging to each topic. We demonstrate our approach by discovering trending topics in a collection of New York Times article snippets.
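The word-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the relevance score, so the example assumes cluster labels are already available from any clustering method, and it scores a word by a simple stand-in measure, namely the share of documents inside the cluster that contain the word minus the share of documents outside the cluster that contain it. The function names `tokenize` and `relevant_words` and the toy documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens longer than two letters."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2]


def relevant_words(docs, labels, top_n=3):
    """Return the top_n distinguishing words for each cluster label.

    Assumed score (not taken from the paper): fraction of in-cluster
    documents containing the word minus the fraction of out-of-cluster
    documents containing it.
    """
    doc_words = [set(tokenize(d)) for d in docs]
    result = {}
    for cluster in sorted(set(labels)):
        inside = [ws for ws, l in zip(doc_words, labels) if l == cluster]
        outside = [ws for ws, l in zip(doc_words, labels) if l != cluster]
        df_in = Counter(word for ws in inside for word in ws)
        df_out = Counter(word for ws in outside for word in ws)
        scores = {
            word: df_in[word] / len(inside) - df_out[word] / max(len(outside), 1)
            for word in df_in
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[cluster] = ranked[:top_n]
    return result


# Toy data: labels are assumed to come from a prior clustering step.
docs = [
    "Election results spark debate in the senate",
    "Senate votes on election reform bill",
    "Team wins championship after dramatic final game",
    "Star player injured before championship game",
]
labels = [0, 0, 1, 1]
topics = relevant_words(docs, labels, top_n=2)
print(topics)
```

The resulting word lists per cluster could then be passed to any visualization, such as a word cloud with font size proportional to the score. The scoring function is the part most likely to differ from the paper.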
