Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
Pith reviewed 2026-05-22 21:36 UTC · model grok-4.3
The pith
LLM-based machine translation evolves traditional systems, with gains depending increasingly on data quality, preference alignment, and context use rather than on model scale alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LLM-based MT is an evolution of traditional MT systems, where gains increasingly depend on data quality, preference alignment, and context utilization rather than scale alone.
What carries the argument
Systematic categorization of methods by data regime, language setting, and technique type, including prompting, parameter-efficient tuning, synthetic data, and RL with human or weak feedback.
If this is right
- Low-resource translation improves when synthetic data quality, diversity, and preference signals are prioritized over volume.
- Document-level and discourse-aware MT mostly extends sentence pipelines via context selection, post-editing, or reranking.
- Mixture-of-Experts and MT-specialized LLMs create explicit trade-offs between scalability and task specialization.
- LLM evaluators complement learned metrics but, because of their own biases, do not replace them.
- Open challenges center on robust, inclusive, and controllable systems that work across languages and settings.
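The reranking route named in the list above for document-level MT can be illustrated with a minimal sketch. It assumes several candidate translations of one sentence are already available and prefers the candidate that keeps terminology fixed by earlier context. The names `term_consistency` and `rerank`, the glossary, and the example sentences are all hypothetical and do not come from the survey; a real system would more likely score candidates with a learned metric or an LLM judge.

```python
from typing import Dict, List


def term_consistency(source: str, candidate: str, glossary: Dict[str, str]) -> float:
    """Fraction of glossary source terms found in `source` whose agreed
    target term also appears in `candidate` (1.0 when no term applies)."""
    src = source.lower()
    cand = candidate.lower()
    relevant = [(s, t) for s, t in glossary.items() if s.lower() in src]
    if not relevant:
        return 1.0
    hits = sum(1 for _, t in relevant if t.lower() in cand)
    return hits / len(relevant)


def rerank(source: str, candidates: List[str], glossary: Dict[str, str]) -> List[str]:
    """Sort candidates by descending term consistency; ties keep input order."""
    return sorted(candidates, key=lambda c: -term_consistency(source, c, glossary))


# Hypothetical term choice carried over from earlier sentences of the document.
glossary = {"Vertrag": "contract"}
candidates = [
    "The agreement was signed yesterday.",
    "The contract was signed yesterday.",
]
best = rerank("Der Vertrag wurde gestern unterzeichnet.", candidates, glossary)[0]
```

Here `best` is the candidate that reuses "contract". The sentence-level system that produced the candidates is untouched, which is the sense in which such methods extend existing pipelines rather than replace them.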
Where Pith is reading between the lines
- Future work could test whether targeted preference optimization on translation pairs yields larger gains than general alignment.
- The emphasis on context utilization suggests experiments comparing structured discourse features against raw context windows.
- Trade-offs noted for accessibility may imply that smaller specialized models could match large general ones on specific language pairs.
- The survey's framing invites direct comparisons of data curation costs versus scaling costs on shared benchmarks.
Load-bearing premise
The selected papers and their groupings represent the full range of LLM MT work without systematic gaps or omitted counterexamples.
What would settle it
Identification of a substantial body of LLM MT research where translation quality improves primarily through increased model scale with no corresponding advances in data quality or preference signals.
Original abstract
Large Language Models (LLMs) are rapidly reshaping machine translation (MT), particularly by introducing instruction-following, in-context learning, and preference-based alignment into what has traditionally been a supervised encoder-decoder paradigm. This survey provides a comprehensive and up-to-date overview of how LLMs are being leveraged for MT across data regimes, languages, and application settings. We systematically analyze prompting-based methods, parameter-efficient and full fine-tuning strategies, synthetic data generation, preference-based optimization, and reinforcement learning with human and weakly supervised feedback. Special attention is given to low-resource translation, where we examine the roles of synthetic data quality, diversity, and preference signals, as well as the limitations of current RLHF pipelines. We further review recent advances in Mixture-of-Experts models, MT-focused LLMs, and multilingual alignment, highlighting trade-offs between scalability, specialization, and accessibility. Beyond sentence-level translation, we survey emerging document-level and discourse-aware MT methods with LLMs, showing that most approaches extend sentence-level pipelines through structured context selection, post-editing, or reranking rather than requiring fundamentally new data regimes or architectures. Finally, we discuss LLM-based evaluation, its strengths and biases, and its role alongside learned metrics. Overall, this survey positions LLM-based MT as an evolution of traditional MT systems, where gains increasingly depend on data quality, preference alignment, and context utilization rather than scale alone, and outlines open challenges for building robust, inclusive, and controllable translation systems.
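As a rough illustration of the prompting-based methods and context selection the abstract refers to, the sketch below picks few-shot examples by lexical similarity to the input and assembles a translation prompt. It is an assumption-laden toy, not the survey's method: the function names, the Jaccard similarity measure, the prompt template, and the example pool are all hypothetical, and published systems typically use stronger retrievers.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two sentences (0.0 when both are empty)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_examples(source: str, pool: List[Tuple[str, str]], k: int = 2) -> List[Tuple[str, str]]:
    """Return the k pool pairs whose source side is most similar to `source`."""
    return sorted(pool, key=lambda pair: -jaccard(source, pair[0]))[:k]


def build_prompt(source: str, examples: List[Tuple[str, str]], src_lang: str, tgt_lang: str) -> str:
    """Assemble a plain few-shot translation prompt that ends at the target slot."""
    blocks = [f"Translate from {src_lang} to {tgt_lang}."]
    for src, tgt in examples:
        blocks.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)


# Hypothetical parallel pool; a real pool would come from a curated corpus.
pool = [
    ("the cat sleeps", "le chat dort"),
    ("stock prices fell", "les cours ont chuté"),
    ("the dog sleeps", "le chien dort"),
]
shots = select_examples("the bird sleeps", pool, k=2)
prompt = build_prompt("the bird sleeps", shots, "English", "French")
```

The resulting `prompt` string would then be sent to whichever LLM is in use; that call is left out because it depends on the provider.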
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is a survey on leveraging Large Language Models for Machine Translation. It systematically reviews prompting-based methods, parameter-efficient and full fine-tuning, synthetic data generation, preference-based optimization and RL with human/weak feedback, low-resource translation (including synthetic data quality and RLHF limitations), Mixture-of-Experts and MT-focused LLMs, multilingual alignment, document-level and discourse-aware approaches (mostly extending sentence-level pipelines via context selection or post-editing), and LLM-based evaluation. The central positioning is that LLM-based MT evolves traditional systems, with gains now driven more by data quality, preference alignment, and context utilization than by scale alone.
Significance. As a timely synthesis of a fast-moving area, the survey can help researchers navigate trade-offs between scalability, specialization, and accessibility while identifying open challenges for robust and inclusive MT. Its value depends on balanced coverage across the cited regimes and settings.
Major comments (1)
- [Abstract] The positioning that 'gains increasingly depend on data quality, preference alignment, and context utilization rather than scale alone' is the survey's core synthesis claim. To make this load-bearing statement robust, the manuscript should include an explicit synthesis subsection (perhaps in the discussion or conclusion) that aggregates quantitative or comparative evidence from the reviewed works demonstrating diminishing returns to scale versus gains from the other factors.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comment. We agree that strengthening the core synthesis claim with an explicit subsection will improve the manuscript and will implement the suggested revision.
Point-by-point responses
- Referee: [Abstract] The positioning that 'gains increasingly depend on data quality, preference alignment, and context utilization rather than scale alone' is the survey's core synthesis claim. To make this load-bearing statement robust, the manuscript should include an explicit synthesis subsection (perhaps in the discussion or conclusion) that aggregates quantitative or comparative evidence from the reviewed works demonstrating diminishing returns to scale versus gains from the other factors.
  Authors: We agree that an explicit synthesis subsection would make the central claim more robust. In the revised manuscript we will add a dedicated subsection (placed in the Discussion) that aggregates comparative evidence from the reviewed literature. This subsection will cite specific studies showing (a) diminishing returns to further scaling in low-resource and domain-specific settings and (b) larger gains obtained from data-quality improvements, preference alignment, and context utilization. Where direct head-to-head comparisons exist in the cited works, we will highlight them; where they are indirect, we will note the limitation while still summarizing the overall pattern observed across the surveyed papers.
  Revision: yes
Circularity Check
No significant circularity
This is a survey paper with no derivations, equations, fitted parameters, or new empirical claims. The central positioning of LLM-based MT as an evolution of traditional systems is a high-level synthesis of external literature rather than a self-referential derivation. No load-bearing steps reduce to the paper's own inputs by construction, self-citation chains, or ansatz smuggling. The paper is self-contained as a review and carries no circularity burden.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 1 Pith paper
- One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging
  Merging fine-tuned models for multilingual translation fails because fine-tuning redistributes language-specific neurons rather than sharpening them, increasing representational divergence in output-generating layers.