An Empirical Study of Smoothing Techniques for Language Modeling
We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Brown versus Wall Street Journal), and n-gram order (bigram versus trigram) affect the relative performance of these methods, which we measure through the cross-entropy of test data. In addition, we introduce two novel smoothing techniques, one a variation of Jelinek-Mercer smoothing and one a very simple linear interpolation technique, both of which outperform existing methods.
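As a rough illustration of the two ideas the abstract names, interpolated smoothing and evaluation by test-set cross-entropy, here is a minimal Python sketch. It is not the paper's implementation. The function names (`train`, `interp_prob`, `cross_entropy`), the fixed weight `lam`, the add-one floor on the unigram estimate, and the toy corpus are all assumptions made for illustration. Jelinek-Mercer smoothing proper estimates its interpolation weights from held-out data rather than fixing them by hand.

```python
import math
from collections import Counter

def train(tokens, vocab):
    """Collect the counts needed for an interpolated bigram model."""
    return {
        "unigrams": Counter(tokens),
        "contexts": Counter(tokens[:-1]),           # times each word is a history
        "bigrams": Counter(zip(tokens, tokens[1:])),
        "total": len(tokens),
        "vocab_size": len(vocab),
    }

def interp_prob(prev, word, model, lam=0.7):
    """P(word | prev) as a linear mix of bigram and unigram estimates.

    lam is a hand-picked constant here (an assumption of this sketch).
    """
    # Add-one floor on the unigram estimate so unseen words get nonzero mass.
    p_uni = (model["unigrams"][word] + 1) / (model["total"] + model["vocab_size"])
    seen = model["contexts"][prev]
    if seen == 0:
        # History never observed: rely entirely on the unigram estimate.
        return p_uni
    p_bi = model["bigrams"][(prev, word)] / seen
    return lam * p_bi + (1 - lam) * p_uni

def cross_entropy(test_tokens, model, lam=0.7):
    """Average negative log2 probability per predicted word (bits per word)."""
    pairs = list(zip(test_tokens, test_tokens[1:]))
    log_sum = sum(math.log2(interp_prob(p, w, model, lam)) for p, w in pairs)
    return -log_sum / len(pairs)

if __name__ == "__main__":
    train_tokens = "the cat sat on the mat".split()
    vocab = set(train_tokens) | {"dog"}
    model = train(train_tokens, vocab)
    print(cross_entropy("the dog sat".split(), model))
```

Lower cross-entropy on held-out text means the smoothed model assigns that text higher probability, which is the basis on which the abstract compares methods.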
Forward citations
Cited by 1 Pith paper
- Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models across Modalities
A comprehensive survey of code-switched NLP research with LLMs across modalities, covering 327 studies, 15+ tasks, 30+ datasets, and 80+ languages while outlining challenges and a future roadmap.