Flashthink: An early exit method for efficient reasoning

Guochao Jiang, Guofeng Quan, Zepeng Ding, Ziqin Luo, Dixuan Wang, Zheng Hu · 2025 · arXiv 2505.13949

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection

cs.AI · 2026-05-13 · unverdicted · novelty 6.0

MSIFR stops faulty LLM generations early via staged rule-based checks, reducing token consumption 11-78% with no accuracy loss.

Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts

cs.AI · 2025-09-26 · unverdicted · novelty 6.0

Retrieval-of-Thought organizes prior reasoning into a thought graph for retrieval and reward-guided recombination, reducing output tokens by up to 40% and latency by 82% while preserving accuracy on reasoning benchmarks.

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

cs.CL · 2025-03-20 · accept · novelty 5.0

A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

Token Economics for LLM Agents: A Dual-View Study from Computing and Economics

cs.AI · 2026-05-09 · unverdicted · novelty 4.0

The paper delivers a unified survey of token economics for LLM agents, conceptualizing tokens as production factors, exchange mediums, and units of account across micro, meso, macro, and security dimensions using established economic theories.

citing papers explorer

Showing 5 of 5 citing papers.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost cs.AI · 2026-05-07 · conditional · none · ref 239
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection cs.AI · 2026-05-13 · unverdicted · none · ref 44
MSIFR stops faulty LLM generations early via staged rule-based checks, reducing token consumption 11-78% with no accuracy loss.
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts cs.AI · 2025-09-26 · unverdicted · none · ref 12
Retrieval-of-Thought organizes prior reasoning into a thought graph for retrieval and reward-guided recombination, reducing output tokens by up to 40% and latency by 82% while preserving accuracy on reasoning benchmarks.
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models cs.CL · 2025-03-20 · accept · none · ref 73
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.
Token Economics for LLM Agents: A Dual-View Study from Computing and Economics cs.AI · 2026-05-09 · unverdicted · none · ref 49
The paper delivers a unified survey of token economics for LLM agents, conceptualizing tokens as production factors, exchange mediums, and units of account across micro, meso, macro, and security dimensions using established economic theories.

Flashthink: An early exit method for efficient reasoning

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer