Let’s think step by step and output the final answer within \boxed

Dynamic Penalty Supports a Two-Phase Curriculum · 2000 · arXiv 3535.98865

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning

cs.LG · 2025-06-09 · unverdicted · novelty 5.0

Proposes token-significance and dynamic length rewards in RL to reduce LLM response length while preserving or improving reasoning correctness across benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning

cs.LG · 2025-06-09 · unverdicted · none · ref 56

Proposes token-significance and dynamic length rewards in RL to reduce LLM response length while preserving or improving reasoning correctness across benchmarks.
