arxiv: 1906.08996 · v1 · pith:OYX2LEHUnew · submitted 2019-06-21 · 💻 cs.CL

Incremental Adaptation of NMT for Professional Post-editors: A User Study

Pith reviewed 2026-05-25 19:09 UTC · model grok-4.3

classification 💻 cs.CL
keywords neural machine translation · online learning · post-editing · user study · incremental adaptation · professional translators · adaptive systems

The pith

Incremental updates to neural machine translation models during post-editing reduce human effort and improve quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests an online learning approach for neural machine translation where the system updates its models using data from professional translators' post-edits in real time. Professional translators participated in a user study comparing the adaptive system to a static one. The study found that the adaptive system required less post-editing effort, produced higher quality translations, and received positive feedback from users. This setup matters because it shows how continuously generated revision data can be used to make machine translation more efficient in professional settings.

Core claim

By applying online learning to update the neural machine translation system with new bilingual data generated during the post-editing process, the system achieves a reduction in the amount of human effort required for post-editing, improvements in translation quality, and a positive perception by professional users.

What carries the argument

Online learning paradigm for incremental model updates using post-editing generated bilingual data.
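
The translate, post-edit, update cycle can be sketched with a toy stand-in. This is not the paper's NMT system: `ToyMT`, its word-by-word lexicon, the one-to-one alignment assumption, and the three-segment stream are all hypothetical, chosen only to show why a system that learns from each post-edit may need fewer corrections on later segments than a static one.

```python
from typing import Dict, List, Tuple


class ToyMT:
    """Word-by-word lexicon translator; a hypothetical stand-in for an NMT system."""

    def __init__(self, lexicon: Dict[str, str], adaptive: bool) -> None:
        self.lexicon = dict(lexicon)  # copy so systems do not share state
        self.adaptive = adaptive

    def translate(self, source: str) -> str:
        # Unknown words are copied through unchanged.
        return " ".join(self.lexicon.get(word, word) for word in source.split())

    def update(self, source: str, post_edit: str) -> None:
        """Learn from one post-edited segment (no-op for the static system)."""
        if not self.adaptive:
            return
        src, tgt = source.split(), post_edit.split()
        if len(src) == len(tgt):  # toy assumption: monotone one-to-one alignment
            self.lexicon.update(zip(src, tgt))


def run_session(system: ToyMT, stream: List[Tuple[str, str]]) -> List[int]:
    """Return the number of word corrections needed for each segment, in order."""
    corrections = []
    for source, post_edit in stream:
        hyp, ref = system.translate(source).split(), post_edit.split()
        mismatches = sum(h != r for h, r in zip(hyp, ref))
        corrections.append(mismatches + abs(len(hyp) - len(ref)))
        system.update(source, post_edit)  # online step: update before the next segment
    return corrections


# Made-up data for illustration only.
BASE_LEXICON = {"el": "the"}
STREAM = [
    ("el gato come", "the cat eats"),
    ("el perro come", "the dog eats"),
    ("el gato duerme", "the cat sleeps"),
]

static_effort = run_session(ToyMT(BASE_LEXICON, adaptive=False), STREAM)
adaptive_effort = run_session(ToyMT(BASE_LEXICON, adaptive=True), STREAM)
print(static_effort, adaptive_effort)
```

In this toy run the static system needs two corrections on every segment, while the adaptive one needs two on the first segment and one on each later segment. A real NMT system would instead take a gradient step on each (source, post-edit) pair.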

If this is right

  • Post-editors need to make fewer corrections when working with the incrementally adapted system.
  • Overall translation quality increases as the model learns from the edits.
  • Professional users view the adaptive system favorably compared to non-adaptive ones.
  • The approach demonstrates feasibility for real-world professional translation workflows.
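
"Fewer corrections" needs a concrete measure. This summary does not say which effort metric the paper uses, so the sketch below assumes a simple one: word-level edit distance between the MT output and its post-edit, divided by the post-edit length. This is a simplified, shift-free relative of TER (full TER also counts block shifts as edits); the function names are illustrative.

```python
from typing import List


def word_edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Levenshtein distance over words (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis to ref[:j]
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(mt_output: str, post_edit: str) -> float:
    """Edits per post-edited word; lower means less post-editing effort."""
    hyp, ref = mt_output.split(), post_edit.split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    return word_edit_distance(hyp, ref) / len(ref)


# One missing word in a six-word post-edit: one edit over six words.
print(edit_rate("the cat sat on mat", "the cat sat on the mat"))
```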

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Continuous adaptation could personalize translation models to specific translators or domains over time.
  • If the quality assumption holds, similar online learning might apply to other human-AI collaboration tasks like editing or annotation.
  • Long-term deployment might reduce the need for initial training data by bootstrapping from ongoing use.

Load-bearing premise

The data from post-edits is sufficiently accurate and plentiful to improve the model without causing it to degrade.
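
Whether the paper applies any safeguards against noisy post-edits is not stated in this summary. The sketch below shows three generic ones under that caveat: a length-ratio filter, a decaying learning rate, and a rollback check. Every name, threshold, and the scoring callback are hypothetical, not the authors' protocol.

```python
import copy
from typing import Any, Callable


def looks_clean(source: str, post_edit: str, max_ratio: float = 2.0) -> bool:
    """Reject empty pairs and pairs whose lengths differ by more than max_ratio."""
    n_src, n_tgt = len(source.split()), len(post_edit.split())
    if n_src == 0 or n_tgt == 0:
        return False
    return max(n_src, n_tgt) / min(n_src, n_tgt) <= max_ratio


def decayed_learning_rate(base_lr: float, step: int, decay: float = 0.01) -> float:
    """Inverse-time decay: later updates move the model less."""
    return base_lr / (1.0 + decay * step)


def guarded_update(state: Any,
                   update_fn: Callable[[Any], Any],
                   score_fn: Callable[[Any], float]) -> Any:
    """Apply update_fn to a copy; keep the result only if the score does not drop."""
    candidate = update_fn(copy.deepcopy(state))
    return candidate if score_fn(candidate) >= score_fn(state) else state


# Usage with made-up values: a harmful update is rolled back.
model = {"quality": 1.0}
worse = lambda m: {**m, "quality": m["quality"] - 0.5}
model = guarded_update(model, worse, lambda m: m["quality"])
print(model, looks_clean("a b c", "x y z"), decayed_learning_rate(0.1, 100))
```

The rollback check needs a held-out scoring set, which costs extra computation per update; that trade-off is one reason a deployed system might rely on filtering and decay alone.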

What would settle it

A controlled study in which model updates from post-editing data led to increased post-editing effort or lower quality scores would falsify the claimed benefits.
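
One way to check whether an observed difference in effort is more than chance is a paired sign-flip (randomization) test. The sketch below is an assumption about how such a check could be run, not the paper's analysis; the effort scores are made up, and the function name is illustrative. It enumerates every sign assignment exactly, so it is only practical for small samples.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_p(static: Sequence[float], adaptive: Sequence[float]) -> float:
    """Two-sided exact p-value for the paired difference static - adaptive.

    Under the null hypothesis the two systems are exchangeable, so the sign of
    each paired difference is equally likely to be positive or negative.
    """
    if len(static) != len(adaptive):
        raise ValueError("inputs must be paired")
    diffs = [s - a for s, a in zip(static, adaptive)]
    observed = abs(sum(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):  # 2**n assignments
        total += 1
        if abs(sum(sign * d for sign, d in zip(signs, diffs))) >= observed:
            hits += 1
    return hits / total


# Hypothetical per-document effort scores (lower is better), for illustration only.
static_scores = [40, 35, 50, 45, 38, 42, 47, 36]
adaptive_scores = [32, 30, 41, 40, 33, 35, 39, 31]
p_value = paired_sign_flip_p(static_scores, adaptive_scores)
print(p_value)
```

A small p-value with the adaptive system scoring worse would count against the claim; a small p-value with it scoring better would support it.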

Figures

Figures reproduced from arXiv: 1906.08996 by Alexandre Helle, Álvaro Peris, Amando Estela, Francisco Casacuberta, Laurent Bié, Manuel Herranz, Mercedes García-Martínez, Miguel Domingo.

Figure 1: User Interface from SDL Trados Studio. From top to bottom, the first row and the leftmost column correspond to the user menus. On the next row, the middle column contains information about the segment that is being translated: on the left, the source sentence and, on the right, the MT translation. The right column displays the content of the terminological dictionary (if any). The document that is being tr…
Figure 2: hBLEU per sentence of static and adaptive systems for both test sets (T1 and T2). Individual sentence scores are plotted for each system, static (red crosses) and adaptive (blue dots). The sentences were processed sequentially, hence, we can observe the progress of the system with its usage. To this end, we show a fit of the scores of each system, in dashed red and solid blue lines, for static and adaptive…
Original abstract

A common use of machine translation in the industry is providing initial translation hypotheses, which are later supervised and post-edited by a human expert. During this revision process, new bilingual data are continuously generated. Machine translation systems can benefit from these new data, incrementally updating the underlying models under an online learning paradigm. We conducted a user study on this scenario, for a neural machine translation system. The experimentation was carried out by professional translators, with a vast experience in machine translation post-editing. The results showed a reduction in the required amount of human effort needed when post-editing the outputs of the system, improvements in the translation quality and a positive perception of the adaptive system by the users.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript reports a user study in which professional translators post-edit outputs from a neural machine translation system that is incrementally updated online with the bilingual data generated during editing. The central empirical claim is that the adaptive system reduces post-editing effort, improves final translation quality, and is viewed positively by the participants relative to a non-adaptive baseline.

Significance. If the measured reductions in effort and gains in quality are robust, the work supplies direct evidence from domain professionals that online adaptation can be integrated into real post-editing workflows without degradation. The involvement of experienced translators and the focus on incremental rather than batch updates are concrete strengths that increase the result's relevance to industrial MT deployment.

minor comments (3)
  1. [Abstract] The abstract states positive outcomes but does not mention participant count, number of documents, or the precise metrics (e.g., TER, time, or edit distance) used to quantify effort reduction; adding one sentence with these quantities would allow readers to gauge the strength of the claims immediately.
  2. [Results / Experimental protocol] Section 4 (or the results section) should explicitly state whether any safeguards (e.g., learning-rate decay, data filtering, or rollback mechanisms) were applied during the online updates to prevent quality degradation from noisy post-edits; if none were used, a brief justification would strengthen the reproducibility of the protocol.
  3. [Figures] Figure 2 (or whichever figure shows per-user effort curves) would benefit from error bars or per-participant variance to indicate consistency across the professional cohort.
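
The per-participant variance requested in comment 3 could be computed along the following lines. This is a minimal sketch: the participant labels and scores are invented for illustration, and the choice of sample standard deviation and standard error as the error-bar quantities is an assumption, not something the paper or the report specifies.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarize(values: List[float]) -> Tuple[float, float, float]:
    """Return (mean, sample standard deviation, standard error of the mean)."""
    m = mean(values)
    sd = stdev(values) if len(values) > 1 else 0.0
    return m, sd, sd / sqrt(len(values))


# Hypothetical per-participant effort scores; not data from the study.
effort_by_participant: Dict[str, List[float]] = {
    "P1": [30.0, 34.0, 32.0],
    "P2": [41.0, 39.0, 43.0],
    "P3": [28.0, 33.0, 29.0],
}

for participant, scores in effort_by_participant.items():
    m, sd, sem = summarize(scores)
    # The error bar for each participant would span m - sem to m + sem.
    print(f"{participant}: mean={m:.2f} sd={sd:.2f} sem={sem:.2f}")
```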

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our user study on incremental online adaptation of NMT and for recommending minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

Empirical user study; no derivation chain present

full rationale

The paper reports results from a controlled user study with professional translators performing post-editing on an incrementally adapted NMT system. No equations, parameter fitting, predictions, or uniqueness theorems appear in the abstract or described experimental protocol. Central claims rest on observed metrics (effort, quality, user perception) rather than any reduction of outputs to inputs by construction. Self-citations, if present, are not load-bearing for any mathematical result. This matches the default non-circular case for empirical work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is an empirical user study and therefore rests on standard experimental assumptions rather than new mathematical entities or fitted parameters.

axioms (1)
  • domain assumption: Professional translators' post-editing behavior and the generated bilingual data are representative of real-world professional use cases.
    The central claim depends on the study participants and task reflecting actual industry conditions.

pith-pipeline@v0.9.0 · 5671 in / 1048 out tokens · 40330 ms · 2026-05-25T19:09:59.156599+00:00 · methodology

