arXiv preprint arXiv:2310.13988

GEMBA-MQM: Detecting translation quality error spans with GPT-4 · arXiv 2310.13988

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Who Watches the Watchmen? Humans Disagree With Translation Metrics on Unseen Domains

cs.CL · 2026-04-19 · unverdicted · novelty 6.0

Automatic translation metrics show lower agreement with humans on unseen technical domains than humans show with each other, and their robustness claims weaken when benchmarked against inter-annotator agreement instead of raw scores.

citing papers explorer

Showing 1 of 1 citing paper.

Who Watches the Watchmen? Humans Disagree With Translation Metrics on Unseen Domains · cs.CL · 2026-04-19 · unverdicted · none · ref 26
