arxiv: 1606.05759 · v1 · pith:6WMFZ76L · submitted 2016-06-18 · cs.CL

Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT'2015

Hassan Sajjad, Nadir Durrani, Francisco Guzman, Preslav Nakov, Ahmed Abdelali, Stephan Vogel, Wael Salloum, Ahmed El Kholy, Nizar Habash



keywords: model, system, arabic, competition, egyptian, features, focused, language


Abstract

This paper describes the Egyptian Arabic-to-English statistical machine translation (SMT) system that the QCRI-Columbia-NYUAD (QCN) group submitted to the NIST OpenMT'2015 competition. The competition focused on informal dialectal Arabic, as used in SMS, chat, and speech. Our efforts therefore focused on processing and standardizing Arabic, using tools such as 3arrib and MADAMIRA. We further trained a phrase-based SMT system using state-of-the-art features and components, including an operation sequence model, a class-based language model, sparse features, a neural network joint model, a genre-based hierarchically-interpolated language model, unsupervised transliteration mining, phrase-table merging, and hypothesis combination. Our system ranked second on all three genres.
