
arxiv: 1211.4488 · v1 · pith:FNBVW5YQnew · submitted 2012-11-19 · 💻 cs.CL · cs.AI

A Rule-Based Approach For Aligning Japanese-Spanish Sentences From A Comparable Corpora

keywords: japanese-spanish, parallel, approach, corpus, lack, languages, rule-based, sentences

The performance of a Statistical Machine Translation (SMT) system is directly proportional to the quality and size of the parallel corpus it uses. However, for some language pairs such corpora are severely lacking. Our long-term goal is to construct a Japanese-Spanish parallel corpus for use in SMT, since useful Japanese-Spanish parallel corpora are scarce. To address this problem, in this study we propose a method for extracting Japanese-Spanish parallel sentences from Wikipedia using POS tagging and a rule-based approach. The approach focuses mainly on the syntactic features of both languages. Human evaluation was performed on a sample and shows promising results in comparison with the baseline.
