
arxiv: 2508.17458 · v2 · submitted 2025-08-24 · 💻 cs.CL


Evaluating the Impact of Verbal Multiword Expressions on Machine Translation

keywords: translation, machine, expressions, multiword, verbal, constructions, datasets, impact

Verbal multiword expressions (VMWEs) remain difficult for machine translation because their meanings are often not recoverable from their component words. In this study, we analyze the impact of three VMWE categories (verbal idioms, verb-particle constructions, and light verb constructions) on machine translation quality from English to multiple languages. Using both established multiword expression datasets and standard machine translation datasets, we evaluate how state-of-the-art translation systems handle these expressions. Our experimental results consistently show that VMWEs negatively affect translation quality, with deeper analysis indicating that this degradation is primarily attributable to the VMWE itself rather than to general sentence-level difficulty. We release our code and evaluation framework so that the community can test new MT systems.
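
The comparison described in the abstract (translation quality for sentences with a VMWE versus sentences without one) can be illustrated with a minimal sketch. This is not the authors' released framework. The function name `vmwe_quality_gap`, the per-sentence score list, and the boolean VMWE flags are all hypothetical, and the toy numbers are placeholders, not results from the paper. The sketch assumes each sentence already has a quality score from some metric where higher is better.

```python
from statistics import mean


def vmwe_quality_gap(scores, has_vmwe):
    """Return mean score of sentences without a VMWE minus mean score with one.

    scores   : per-sentence quality scores (higher is better), any metric
    has_vmwe : parallel list of booleans, True if the source contains a VMWE
    A positive gap means sentences with VMWEs scored lower on average.
    """
    if len(scores) != len(has_vmwe):
        raise ValueError("scores and has_vmwe must have the same length")
    with_vmwe = [s for s, flag in zip(scores, has_vmwe) if flag]
    without_vmwe = [s for s, flag in zip(scores, has_vmwe) if not flag]
    if not with_vmwe or not without_vmwe:
        raise ValueError("both groups need at least one sentence")
    return mean(without_vmwe) - mean(with_vmwe)


# Toy placeholder values (hypothetical, not taken from the paper).
toy_scores = [0.75, 0.5, 0.25, 1.0]
toy_flags = [False, True, True, False]
gap = vmwe_quality_gap(toy_scores, toy_flags)
print(f"quality gap (no VMWE minus VMWE): {gap}")
```

A raw gap like this does not separate the effect of the VMWE from general sentence difficulty, which the abstract says the paper's deeper analysis addresses; a controlled comparison would be needed for that.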
