Document Visualization using Topic Clouds
Traditionally, a document is visualized by a word cloud. Recently, distributed representation methods for documents have been developed, which map a document to a set of topic embeddings. Visualizing such a representation is useful for presenting the semantics of a document at a finer granularity; it is also challenging, as there are multiple topics, each containing multiple words. We propose to visualize a set of topics using a Topic Cloud, which is a pie chart consisting of topic slices, where each slice contains the important words in that topic. To make important topics and words visually prominent, the sizes of topic slices and word fonts are proportional to their importance in the document. A topic cloud can help the user quickly evaluate the quality of derived document representations. For NLP practitioners, it can be used to qualitatively compare the topic quality of different document representation algorithms, or to inspect how model parameters impact the derived representations.
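The proportional sizing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `topic_cloud_layout`, the input format (a list of `(topic_weight, {word: word_weight})` pairs), and the `max_font` parameter are all hypothetical, and the sketch assumes the importance weights have already been computed by some document representation method. It only computes the layout numbers (slice angles and font sizes); drawing is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical input type: one (topic_weight, {word: word_weight}) pair per topic.
Topic = Tuple[float, Dict[str, float]]


def topic_cloud_layout(topics: List[Topic], max_font: float = 40.0) -> List[dict]:
    """Compute pie-slice angles and word font sizes for a topic cloud.

    Assumptions (not taken from the paper): weights are non-negative
    importance scores, slice angles are proportional to topic weights,
    and font sizes are proportional to word weights, scaled so that the
    most important word in the document is drawn at max_font.
    """
    total = sum(weight for weight, _ in topics)
    if total <= 0:
        raise ValueError("topic weights must sum to a positive value")

    # Largest word weight across all topics; it anchors the font scale.
    top_word = max(w for _, words in topics for w in words.values())
    if top_word <= 0:
        raise ValueError("at least one word weight must be positive")

    slices = []
    start = 0.0
    for weight, words in topics:
        sweep = 360.0 * weight / total  # slice angle in degrees
        fonts = {word: max_font * w / top_word for word, w in words.items()}
        slices.append({"start": start, "end": start + sweep, "fonts": fonts})
        start += sweep
    return slices


# Made-up weights, for illustration only.
example = [
    (2, {"model": 8, "topic": 4}),
    (1, {"cloud": 2}),
    (1, {"pie": 1}),
]
layout = topic_cloud_layout(example)
```

With these made-up weights, the first topic takes half of the pie and the word "model" is drawn at the largest font. Any plotting library with wedge and text primitives could render the result.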