Character n-gram Embeddings to Improve RNN Language Models

Jun Suzuki; Masaaki Nagata; Sho Takase

arXiv: 1906.05506 · v1 · pith:AP64NI7Knew · submitted 2019-06-13 · cs.CL

keywords character · embeddings · language · method · proposed · word · n-gram · tasks

abstract

This paper proposes a novel Recurrent Neural Network (RNN) language model that takes advantage of character information. We focus on character n-grams based on research in the field of word embedding construction (Wieting et al. 2016). Our proposed method constructs word embeddings from character n-gram embeddings and combines them with ordinary word embeddings. We demonstrate that the proposed method achieves the best perplexities on the language modeling datasets: Penn Treebank, WikiText-2, and WikiText-103. Moreover, we conduct experiments on application tasks: machine translation and headline generation. The experimental results indicate that our proposed method also positively affects these tasks.
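The core idea in the abstract, building a word embedding from character n-gram embeddings and combining it with an ordinary word embedding, can be illustrated with a minimal sketch. The abstract does not say how the n-gram embeddings are composed or how the two embeddings are combined, so this sketch assumes plain element-wise summation for both steps; the boundary markers `^` and `$`, the class name `NgramAugmentedEmbedding`, and all parameter values are hypothetical choices made for illustration, not the authors' implementation.

```python
import random


def char_ngrams(word, n=3):
    """Return the character n-grams of `word`, padded with boundary markers.

    The markers "^" and "$" are an assumption of this sketch.
    """
    padded = "^" + word + "$"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramAugmentedEmbedding:
    """Toy lookup tables for word and character n-gram embeddings.

    Vectors are randomly initialised on first use; in a real model they
    would be trained jointly with the RNN language model.
    """

    def __init__(self, dim=4, n=3, seed=0):
        self.dim = dim
        self.n = n
        self.rng = random.Random(seed)
        self.word_table = {}
        self.ngram_table = {}

    def _lookup(self, table, key):
        # Create a small random vector the first time a key is seen.
        if key not in table:
            table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return table[key]

    def embed(self, word):
        # Ordinary word embedding.
        word_vec = self._lookup(self.word_table, word)
        # Word embedding composed from character n-grams.
        # Assumption: composition is a plain sum of n-gram vectors.
        ngram_vec = [0.0] * self.dim
        for gram in char_ngrams(word, self.n):
            vec = self._lookup(self.ngram_table, gram)
            ngram_vec = [a + b for a, b in zip(ngram_vec, vec)]
        # Assumption: the two embeddings are combined by addition.
        return [w + c for w, c in zip(word_vec, ngram_vec)]


embedder = NgramAugmentedEmbedding(dim=4, n=3)
vector = embedder.embed("cat")  # would be the RNN input for the token "cat"
```

Because words such as "cat" and "cats" share n-grams like `^ca` and `cat`, their composed vectors share parameters, which is one plausible reason character information can help with rare words.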
