Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Lexical frequency is a stronger predictor of metaphor novelty than LM surprisal, with the surprisal-novelty link peaking early in training before declining as surprisal becomes more aligned with frequency.
citing papers explorer
- ReflectMT: Internalizing Reflection for Efficient and High-Quality Machine Translation
  ReflectMT internalizes reflection via two-stage RL to enable direct high-quality machine translation that outperforms explicit reasoning models like DeepSeek-R1 on WMT24 while using 94% fewer tokens.
- The Frequency Confound in Language-Model Surprisal and Metaphor Novelty
  Lexical frequency is a stronger predictor of metaphor novelty than LM surprisal, with the surprisal-novelty link peaking early in training before declining as surprisal becomes more aligned with frequency.