Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation

Svetlana Kiritchenko, Saif Mohammad · 2017 · DOI 10.18653/v1/p17-2074

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ReflectMT: Internalizing Reflection for Efficient and High-Quality Machine Translation

cs.CL · 2026-04-21 · unverdicted · novelty 7.0

ReflectMT internalizes reflection via two-stage RL to enable direct high-quality machine translation that outperforms explicit reasoning models like DeepSeek-R1 on WMT24 while using 94% fewer tokens.

The Frequency Confound in Language-Model Surprisal and Metaphor Novelty

cs.CL · 2026-05-07 · unverdicted · novelty 5.0 · 2 refs

Lexical frequency is a stronger predictor of metaphor novelty than LM surprisal, with the surprisal-novelty link peaking early in training before declining as surprisal becomes more aligned with frequency.

citing papers explorer

Showing 2 of 2 citing papers.

ReflectMT: Internalizing Reflection for Efficient and High-Quality Machine Translation

cs.CL · 2026-04-21 · unverdicted · none · ref 22

The Frequency Confound in Language-Model Surprisal and Metaphor Novelty

cs.CL · 2026-05-07 · unverdicted · none · ref 16 · 2 links
