Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
Flashthink: An early exit method for efficient reasoning
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
MSIFR stops faulty LLM generations early via staged rule-based checks, reducing token consumption 11-78% with no accuracy loss.
Retrieval-of-Thought organizes prior reasoning into a thought graph for retrieval and reward-guided recombination, reducing output tokens by up to 40% and latency by 82% while preserving accuracy on reasoning benchmarks.
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.
The paper delivers a unified survey of token economics for LLM agents, conceptualizing tokens as production factors, exchange mediums, and units of account across micro, meso, macro, and security dimensions using established economic theories.
citing papers explorer
-
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
-
Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection
MSIFR stops faulty LLM generations early via staged rule-based checks, reducing token consumption 11-78% with no accuracy loss.
-
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
Retrieval-of-Thought organizes prior reasoning into a thought graph for retrieval and reward-guided recombination, reducing output tokens by up to 40% and latency by 82% while preserving accuracy on reasoning benchmarks.
-
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.
-
Token Economics for LLM Agents: A Dual-View Study from Computing and Economics
The paper delivers a unified survey of token economics for LLM agents, conceptualizing tokens as production factors, exchange mediums, and units of account across micro, meso, macro, and security dimensions using established economic theories.