
arxiv: 1903.05683 · v1 · pith:BHEA3YXSnew · submitted 2019-03-13 · 💻 cs.CL

Low-Resource Syntactic Transfer with Unsupervised Source Reordering

classification 💻 cs.CL
keywords languages, source, treebanks, transfer, accuracy, achieve, bible, corpus

We describe a cross-lingual transfer method for dependency parsing that accounts for word order differences between source and target languages. Our model relies only on the Bible, a considerably smaller parallel corpus than those commonly used in transfer methods. We use the concatenation of trees projected from the Bible corpus and the gold-standard treebanks of multiple source languages, along with cross-lingual word representations. We demonstrate that reordering the source treebanks before training on them for a target language improves accuracy for languages outside the European language family. Our experiments on 68 treebanks (38 languages) in the Universal Dependencies corpus achieve high accuracy for all languages. Among them, our experiments on 16 treebanks of 12 non-European languages achieve an average absolute UAS improvement of 3.3% over a state-of-the-art method.
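The abstract reports its gains as an absolute improvement in UAS (unlabeled attachment score), the percentage of tokens whose predicted head matches the gold head. A minimal sketch of that metric follows, assuming each sentence is given as a list of head indices (0 for the root). The function name `uas` and the toy data are hypothetical illustrations, not taken from the paper, and the sketch ignores evaluation details such as punctuation handling.

```python
from typing import Sequence


def uas(gold_heads: Sequence[Sequence[int]],
        pred_heads: Sequence[Sequence[int]]) -> float:
    """Unlabeled attachment score as a percentage over all tokens."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted corpora differ in sentence count")
    correct = 0
    total = 0
    for gold, pred in zip(gold_heads, pred_heads):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        # A token counts as correct when its predicted head index
        # equals the gold head index.
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)
    if total == 0:
        raise ValueError("cannot score an empty corpus")
    return 100.0 * correct / total


# Toy data (hypothetical): two sentences, head index per token, 0 = root.
gold = [[2, 0, 2], [0, 1]]
parser_a = [[2, 0, 1], [0, 0]]  # hypothetical output of one parser
parser_b = [[2, 0, 2], [0, 0]]  # hypothetical output of another parser

# An "absolute improvement" is the difference between two UAS values,
# expressed in percentage points.
absolute_gain = uas(gold, parser_b) - uas(gold, parser_a)
```

Under this reading, the 3.3% figure in the abstract is a difference of this kind, averaged over the 16 non-European treebanks.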
