Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change
Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec) against known historical changes. We then use this methodology to reveal statistical laws of semantic evolution. Using six historical corpora spanning four languages and two centuries, we propose two quantitative laws of semantic change: (i) the law of conformity: the rate of semantic change scales with an inverse power-law of word frequency; (ii) the law of innovation: independent of frequency, words that are more polysemous have higher rates of semantic change.
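The law of conformity stated in the abstract can be illustrated with a minimal sketch of how a power-law exponent might be estimated. This is not the paper's procedure: the helper name `fit_power_law`, the synthetic data, and the exponent and constant used to generate it are all assumptions made for illustration. The sketch assumes a relation of the form rate ≈ c · freq^β with β < 0 and recovers β by ordinary least squares in log-log space.

```python
import numpy as np


def fit_power_law(freq, rate):
    """Fit rate ~ c * freq**beta by least squares on log-transformed values.

    Hypothetical helper for illustration; returns (beta, c).
    """
    slope, intercept = np.polyfit(np.log(freq), np.log(rate), deg=1)
    return slope, np.exp(intercept)


# Synthetic, illustrative data only (not taken from the paper).
rng = np.random.default_rng(0)
freq = np.logspace(-7, -3, 200)          # assumed relative word frequencies
true_beta, true_c = -0.5, 0.01           # assumed values used to generate data
noise = np.exp(rng.normal(0.0, 0.1, freq.size))
rate = true_c * freq**true_beta * noise  # simulated rates of semantic change

beta, c = fit_power_law(freq, rate)
print(f"estimated exponent beta = {beta:.3f}")
```

A negative estimated exponent corresponds to the claim that more frequent words change more slowly. Testing the law of innovation would additionally require a polysemy measure included as a covariate alongside frequency, which this sketch does not attempt.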
Forward citations
Cited by 2 Pith papers
- Diachronic Embedding for Temporal Knowledge Graph Completion
  Proposes a model-agnostic diachronic entity embedding function to extend static KG embedding models for temporal knowledge graph completion, with a proof that the SimplE combination is fully expressive.
- Survey in Characterizing Semantic Change
  The survey organizes prior work on semantic change characterization into three classes, summarizes selected publications in a table, and discusses research needs and trends.