Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations

Andreas Rücklé; Iryna Gurevych; Maxime Peyrard; Steffen Eger

arxiv: 1803.01400 · v2 · pith:MCTQD5FNnew · submitted 2018-03-04 · 💻 cs.CL

classification 💻 cs.CL

keywords: embeddings, word, mean, power, average, baseline, complex, different

Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performance of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In addition, our proposed method outperforms different recently proposed baselines such as SIF and Sent2Vec by a solid margin, thus constituting a much harder-to-beat monolingual baseline. Our data and code are publicly available.
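
The abstract does not spell out the construction, so the following is only a minimal sketch under stated assumptions. It uses the standard power mean, (1/n · Σ x_i^p)^(1/p), applied separately to each embedding dimension across the words of a sentence; p = 1 gives the ordinary average, and the limits p = +∞ and p = −∞ give the per-dimension maximum and minimum. The function names, the toy vectors, and the default choice of powers are illustrative assumptions, not the authors' released code.

```python
import math
from typing import List, Sequence


def power_mean(values: Sequence[float], p: float) -> float:
    """Power mean of one embedding dimension across the words of a sentence."""
    if p == math.inf:
        return max(values)          # limit p -> +inf
    if p == -math.inf:
        return min(values)          # limit p -> -inf
    if p < 1 or p != int(p):
        # Embeddings contain negative values, so fractional powers would
        # leave the real numbers; this sketch restricts p accordingly.
        raise ValueError("sketch supports positive integers and +/-inf only")
    m = sum(v ** int(p) for v in values) / len(values)
    # Sign-preserving real p-th root, so odd p works with negative means.
    return math.copysign(abs(m) ** (1.0 / p), m)


def sentence_embedding(
    word_vectors: Sequence[Sequence[float]],
    powers: Sequence[float] = (-math.inf, 1, math.inf),  # assumed defaults
) -> List[float]:
    """Concatenate one power-mean vector per value of p."""
    dims = list(zip(*word_vectors))  # transpose: one tuple per dimension
    return [power_mean(dim, p) for p in powers for dim in dims]


# Toy example: two words, two-dimensional embeddings (made-up numbers).
toy = [[1.0, -2.0], [3.0, 4.0]]
print(sentence_embedding(toy))
```

With d-dimensional word vectors and k power values, the resulting sentence vector has k · d dimensions, which is how concatenation retains information that a single average would discard.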

Forward citations

Cited by 1 Pith paper

Survey on reinforcement learning for language processing
cs.CL 2021-04 unverdicted novelty 2.0

This survey reviews reinforcement learning applications to natural language processing problems, especially conversational systems, including problem descriptions, suitability of RL, advantages, limitations, and promi...