arxiv: 2406.05335 · v3 · pith:T57MNTLGnew · submitted 2024-06-08 · ❄️ cond-mat.dis-nn · cs.LG

Phase transition in large language models and the criticality of natural languages

classification ❄️ cond-mat.dis-nn cs.LG
keywords languages, natural, phase, llms, transition, texts, behavior, critical

Generation of text and speech in natural languages can be modeled as a stochastic process. This idea dates back to the seminal work of Markov and, later, to that of Shannon and also underlies the recent development of large language models (LLMs). The stochastic processes corresponding to natural languages should be distinct from those that generate nonlinguistic sequences. One of the features that discriminate linguistic and nonlinguistic sequences is power-law behavior, which is universally observed across different languages. In statistical physics, such behavior suggests that natural languages are critical: They lie near a phase transition point in a parametrized space of stochastic processes. However, testing this conjecture is not straightforward. A phase transition, even if it exists, cannot be directly observed in real-world natural languages because they do not have any controllable parameters. Here, we use LLMs as controllable effective models of natural languages. Through statistical analyses of texts generated by LLMs, we find that, when a parameter analogous to physical temperature is varied, LLMs undergo a phase transition. The transition separates a low-temperature phase with complex repetitive structures in generated texts from a high-temperature phase in which LLMs generate incomprehensible texts. At the critical point between these phases, generated texts display the power-law behavior similar to that of natural languages and most closely resemble natural languages as measured by a standard metric in natural language processing. These findings strongly suggest that natural languages are indeed critical.
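The "parameter analogous to physical temperature" can be illustrated with a minimal sketch. It assumes the standard softmax temperature used in LLM sampling, where next-token probabilities are proportional to exp(logit / T); the abstract does not spell out the exact procedure, so the function names and the toy logits below are hypothetical and only illustrate the general mechanism.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return probabilities p_i proportional to exp(logit_i / T).

    Assumption: this is the usual temperature-scaled softmax, not
    necessarily the exact sampling setup used in the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy logits for a three-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.0]
for T in (0.1, 1.0, 100.0):
    probs = softmax_with_temperature(toy_logits, T)
    print(T, [round(p, 4) for p in probs], round(entropy(probs), 4))
```

At low T the distribution concentrates on the most likely token, which is consistent with the repetitive low-temperature phase described above; at high T it approaches the uniform distribution, consistent with incomprehensible output. The sketch shows only this single-step effect, not the phase transition itself, which the abstract establishes through statistical analyses of generated texts.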

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Turbulence-like 5/3 spectral scaling in contextual representations of language as a complex system

    cs.CL 2026-04 unverdicted novelty 7.0

    Contextual language embeddings exhibit a robust 5/3 power-law spectrum in token-sequence fluctuations, analogous to Kolmogorov turbulence.

  2. Escaping Mode Collapse in LLM Generation via Geometric Regulation

    cs.CL 2026-05 unverdicted novelty 6.0

    Reinforced Mode Regulation (RMR) uses low-rank damping on the value cache to prevent geometric collapse and mode collapse in autoregressive LLM generation, supporting stable output down to 0.8 nats/step entropy.

  3. A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws

    cs.LG 2026-04 unverdicted novelty 6.0

    Emergent intelligence is recast as the existence of the limit of performance E(N,P,K) as N, P, K go to infinity, with necessary and sufficient conditions derived via nonlinear Lipschitz operator theory and scaling laws obt...

  4. A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws

    cs.LG 2026-04 unverdicted novelty 5.0

    Emergent intelligence corresponds to the limit of a performance function E(N,P,K) as N, P, K go to infinity, originating from a parameter-limit architecture whose existence is governed by Lipschitz conditions, with sc...