2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Representative citing papers
Development of domain-specific scientific corpora for English-Spanish, English-French, and English-Portuguese and their application to fine-tuning NMT models.
citing papers explorer
- Reference-Free Reinforcement Learning Fine-Tuning for MT: A Seq2Seq Perspective
  GRPO with reference-free rewards improves NLLB-200 translation quality by up to +5.03 chrF++ across 13 languages and is competitive with supervised fine-tuning on complex languages without target data.
- Enhancing Scientific Discourse: Machine Translation for the Scientific Domain
  Development of domain-specific scientific corpora for English-Spanish, English-French, and English-Portuguese and their application to fine-tuning NMT models.