Reusing Weights in Subword-aware Neural Language Models

Rustem Takhanov; Zhenisbek Assylbekov

arxiv: 1802.08375 · v2 · pith:V5P5FHTTnew · submitted 2018-02-23 · cs.CL · cs.NE · stat.ML



keywords: model, models, weights, competitive, language, morpheme-aware, neural, reused


We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models. The proposed techniques do not benefit a competitive character-aware model, but some of them improve the performance of syllable- and morpheme-aware models while showing significant reductions in model sizes. We discover a simple hands-on principle: in a multi-layer input embedding model, layers should be tied consecutively bottom-up if reused at output. Our best morpheme-aware model with properly reused weights beats the competitive word-level model by a large margin across multiple languages and has 20%-87% fewer parameters.
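The general idea of reusing subword embeddings at the output can be illustrated with a toy sketch. It is not the authors' architecture: the vocabulary, the segmentation, the composition by summation, and all names (`WORD_TO_SUBWORDS`, `word_vector`, `output_distribution`, `embedding_param_count`) are assumptions made for illustration only. It shows a single subword embedding table used both to build input word vectors and to score output words, and why sharing the table reduces the embedding parameter count.

```python
import math
import random

random.seed(0)

# Hypothetical toy vocabulary: each word is split into morpheme-like subwords.
WORD_TO_SUBWORDS = {
    "unhappy": ["un", "happy"],
    "happy": ["happy"],
    "happily": ["happy", "ly"],
    "undo": ["un", "do"],
}
SUBWORDS = sorted({s for parts in WORD_TO_SUBWORDS.values() for s in parts})
DIM = 4  # assumed embedding size

# One shared subword embedding table, used at both input and output.
subword_emb = {s: [random.gauss(0.0, 0.1) for _ in range(DIM)] for s in SUBWORDS}


def word_vector(word):
    """Compose a word vector by summing its subword embeddings (an assumption)."""
    vec = [0.0] * DIM
    for s in WORD_TO_SUBWORDS[word]:
        for i, x in enumerate(subword_emb[s]):
            vec[i] += x
    return vec


def output_distribution(hidden):
    """Softmax over words, scoring each word with the same composed vectors."""
    words = list(WORD_TO_SUBWORDS)
    logits = [sum(h * w for h, w in zip(hidden, word_vector(word))) for word in words]
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return {word: e / total for word, e in zip(words, exps)}


def embedding_param_count(tied):
    """Embedding parameters with one shared table versus separate input/output tables."""
    table = len(SUBWORDS) * DIM
    return table if tied else 2 * table


# Stand-in for a hidden state; a real model would produce it with a recurrent layer.
probs = output_distribution(word_vector("unhappy"))
```

In this sketch the shared table halves the embedding parameters compared with separate input and output tables. The abstract's principle concerns deeper input models, where layers above the embedding table would also be reused at the output, tied consecutively from the bottom up; that part is not modeled here.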
