Creativity Bias: How Machine Evaluation Struggles with Creativity in Literary Translations
2 Pith papers cite this work.
abstract
This article investigates the performance of automatic evaluation metrics (AEMs) and LLM-as-a-judge evaluation on literary translation across multiple languages, genres, and translation modalities. The aim is to assess how well these tools align with professionals when evaluating translation creativity (creative shifts and errors), and to see whether they can substitute for laborious manual annotation. A dataset of literary translations covering three modalities (human translation, machine translation, and post-editing), three genres, and three language pairs was created and annotated in detail for creativity by experienced professional literary translators. The results show that both AEMs and LLM-as-a-judge evaluations correlate poorly with professional evaluations of creativity, with LLM-as-a-judge showing a systematic bias in favour of machine-translated texts and penalising creative and culturally appropriate solutions. Moreover, performance is consistently worse for more literary genres such as poetry. These findings highlight fundamental limitations of current automatic evaluation tools for literary translation and the need for new tools that do not so often treat out-of-routine translations as errors.
fields
cs.CL (2)
years
2026 (2)
verdicts
UNVERDICTED (2)
representative citing papers
LitSeg segments literary texts using narrative analysis via multi-stage prompting and offers a distilled lightweight version for efficient use in RAG systems.
citing papers explorer
- AI translation of literary texts is "fine", but readers still prefer human translations
  Human readers prefer human literary translations over AI-generated ones for immersion and clarity despite finding MT adequate and struggling to identify the source.
- LitSeg: Narrative-Aware Document Segmentation for Literary RAG
  LitSeg segments literary texts using narrative analysis via multi-stage prompting and offers a distilled lightweight version for efficient use in RAG systems.