Reading the remainders from bottom to top, we get629 10 =1165 8

Divide 1 by 8:1÷8=0remainder1 · 2026

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Limits of Difficulty Scaling: Hard Samples Yield Diminishing Returns in GRPO-Tuned SLMs

cs.LG · 2026-04-07 · unverdicted · novelty 5.0

GRPO tuning on SLMs shows diminishing returns from hard math samples, with easier subsets matching full performance using 45% fewer steps and GSM8K training outperforming MATH training on numeric subsets.

citing papers explorer

Showing 1 of 1 citing paper.

Limits of Difficulty Scaling: Hard Samples Yield Diminishing Returns in GRPO-Tuned SLMs cs.LG · 2026-04-07 · unverdicted · none · ref 21
GRPO tuning on SLMs shows diminishing returns from hard math samples, with easier subsets matching full performance using 45% fewer steps and GSM8K training outperforming MATH training on numeric subsets.

Reading the remainders from bottom to top, we get629 10 =1165 8

fields

years

verdicts

representative citing papers

citing papers explorer