arxiv: 0901.2924 · v1 · submitted 2009-01-19 · ⚛️ physics.soc-ph · cs.CL

Universal Complex Structures in Written Language

keywords: zipf, different, language, universal, words, complex, complexity, however

Quantitative linguistics has provided us with a number of empirical laws that characterise the evolution of languages and competition amongst them. In terms of language usage, one of the most influential results is Zipf's law of word frequencies. Zipf's law appears to be universal, and may not even be unique to human language. However, there is ongoing controversy over whether Zipf's law is a good indicator of complexity. Here we present an alternative approach that puts Zipf's law in the context of critical phenomena (the cornerstone of complexity in physics) and establishes the presence of a large scale "attraction" between successive repetitions of words. Moreover, this phenomenon is scale-invariant and universal -- the pattern is independent of word frequency and is observed in texts by different authors and written in different languages. There is evidence, however, that the shape of the scaling relation changes for words that play a key role in the text, implying the existence of different "universality classes" in the repetition of words. These behaviours exhibit striking parallels with complex catastrophic phenomena.
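The two quantities the abstract refers to, word rank-frequency counts (the input to Zipf's law) and the distances between successive repetitions of a word, can be illustrated with a minimal Python sketch. This is not the authors' method: the tokenizer, the function names, and the choice to rescale each gap by the word's mean gap (so that words of different frequency can be compared) are all assumptions made here for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    # Hypothetical tokenizer: lowercase runs of ASCII letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def rank_frequency(tokens):
    # Return (rank, word, count) tuples, most frequent word first.
    # Zipf's law concerns how count falls off with rank.
    counts = Counter(tokens)
    return [
        (rank, word, n)
        for rank, (word, n) in enumerate(counts.most_common(), start=1)
    ]


def normalized_gaps(tokens, word):
    # Distances (in tokens) between successive occurrences of `word`,
    # divided by the word's mean gap. The rescaling is an assumption made
    # here so that frequent and rare words share a common scale.
    positions = [i for i, t in enumerate(tokens) if t == word]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if not gaps:
        return []
    mean_gap = sum(gaps) / len(gaps)
    return [g / mean_gap for g in gaps]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird the end"
    toks = tokenize(sample)
    print(rank_frequency(toks)[:3])
    print(normalized_gaps(toks, "the"))
```

On a real corpus, one would collect the rescaled gaps for many words and compare their distributions. An excess of short gaps relative to a random shuffle of the same text would be one way to look for the "attraction" between repetitions that the abstract describes.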
