Yang, Sen; Cheng, Shanbo; Xu, Lu; Zhang, Jianbing; Huang, Shujian · arXiv 2602.14028

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Reference-Free Reinforcement Learning Fine-Tuning for MT: A Seq2Seq Perspective

cs.CL · 2026-05-15 · unverdicted · novelty 5.0 · ref 18

GRPO with reference-free rewards improves NLLB-200 translation quality on 13 languages by up to +5.03 chrF++, competing with supervised fine-tuning on complex languages without target data.
