MENT benchmark plus RATE agentic evaluator raise combined system- and segment-level correlation with human judgments by at least 3.2 points over prior MT metrics and LLM judges.
Crucially, RATE consistently maintains superior performance over the baseline, confirming that our framework’s effectiveness holds across different backbone models.
Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation