Proposes token-significance and dynamic length rewards in RL to reduce LLM response length while preserving or improving reasoning correctness across benchmarks.
Training language models to reason efficiently, 2025
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2025 2verdicts
UNVERDICTED 2representative citing papers
Agents should invoke external tools only when epistemically necessary, per the introduced Theory of Agent framework that frames tool use as a decision under uncertainty.
citing papers explorer
-
Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning
Proposes token-significance and dynamic length rewards in RL to reduce LLM response length while preserving or improving reasoning correctness across benchmarks.
-
Position: Agent Should Invoke External Tools ONLY When Epistemically Necessary
Agents should invoke external tools only when epistemically necessary, per the introduced Theory of Agent framework that frames tool use as a decision under uncertainty.