
arxiv: 1805.06648 · v1 · pith:EQ4ZXBMXnew · submitted 2018-05-17 · 💻 cs.CL

Extrapolation in NLP

classification 💻 cs.CL
keywords extrapolation, models, training, argue, attention, capture, data, decomposable

We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.

