An Introduction to Topological Data Analysis for Physicists: From LGM to FRBs
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Topological Data Analysis (TDA) is a relatively new approach to analysing high-dimensional data sets. It works by focusing on global properties like the shape and connectivity of the data, giving it a significant advantage over more conventional tools based on cluster analysis, a localised property of the data. However, some of its mathematical foundations, like algebraic topology and discrete Morse theory, are perceived as an intimidatingly steep on-ramp into the subject. Consequently, it has enjoyed much less popularity as a data-analysis tool than less abstract methods. This article aims to change this. By focusing on a small set of simple examples, chosen primarily for their pedagogical value, we introduce and explain TDA's two principal branches: persistent homology and the Mapper algorithm. We then illustrate the universality of the method by discussing its application to the intriguing data set of fast radio burst (FRB) observations. We close the article with a discussion of the resilience of topological data analysis to noise and some statistical and computational challenges faced by the method.
fields
cs.CL 1
years
2024 1
verdicts
ACCEPT 1
representative citing papers
Topological Data Analysis Applications in Natural Language Processing: A Survey
This survey compiles 137 papers on Topological Data Analysis in NLP, categorizing them into theoretical explanations of language and practical integrations into ML systems while noting open challenges.