hub

arXiv preprint arXiv:2502.18600 , year=

Chain of draft: Thinking faster by writing less , author= · 2025 · arXiv 2502.18600

15 Pith papers cite this work. Polarity classification is still indexing.

15 Pith papers citing it

read on arXiv browse 15 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Respecting Self-Uncertainty in On-Policy Self-Distillation for Efficient LLM Reasoning

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

EGRSD and CL-EGRSD advance the accuracy-length frontier in LLM reasoning by entropy-guided weighting of token-level distillation signals from the teacher.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

TrigReason: Trigger-Based Collaboration between Small and Large Reasoning Models

cs.AI · 2026-04-16 · unverdicted · novelty 7.0

TrigReason matches large reasoning model accuracy on math and science benchmarks by delegating most steps to small models and intervening selectively on three triggers, cutting latency by 43.9% and cost by 73.3%.

Unified Data Selection for LLM Reasoning

cs.CL · 2026-05-21 · unverdicted · novelty 6.0

High-Entropy Sum (HES) selects high-quality reasoning data for LLMs by summing entropy of the top highest-entropy tokens, matching full-dataset performance with top 20% in SFT and outperforming baselines in RFT and RL.

TCARD: Nearly Balanced Two-Level Designs with Treatment Cardinality Constraints with an Application to LLM Prompt Engineering

stat.ME · 2026-05-20 · unverdicted · novelty 6.0

Proposes nearly balanced TCARDs that minimize the first two generalized word-length pattern components, defines Φ_BCD criterion linked to classical optimality, and constructs designs via coordinate exchange with simulation-calibrated weights for LLM prompt engineering.

Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.

Nice Fold or Hero Call: Learning Budget-Efficient Thinking for Adaptive Reasoning

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

BET reduces reasoning tokens by about 55% on average while improving performance across benchmarks by learning to short-solve easy queries, fold early on unsolvable ones, and preserve budget for hard solvable queries.

Implicit Compression Regularization: Concise Reasoning via Internal Shorter Distributions in RL Post-Training

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

ICR creates a virtual shorter distribution from shortest correct on-policy responses to regularize RL post-training toward concise yet accurate reasoning, improving the accuracy-length Pareto frontier on math and knowledge benchmarks.

CRISP: Compressed Reasoning via Iterative Self-Policy Distillation

cs.LG · 2026-03-05 · conditional · novelty 6.0

CRISP achieves 57-59% token reduction on MATH-500 with 9-16 point accuracy gains on Qwen3 models via iterative self-distillation of concise reasoning behavior.

Reasoning Compression with Mixed-Policy Distillation

cs.AI · 2026-05-09 · unverdicted · novelty 5.0

Mixed-Policy Distillation transfers concise reasoning behavior from larger to smaller LLMs by having the teacher compress student-generated trajectories, cutting token usage up to 27% while raising benchmark scores.

SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking

cs.AI · 2026-04-09 · unverdicted · novelty 5.0

SAT reduces reasoning tokens by up to 40% across multiple large reasoning models and benchmarks by adaptively pruning steps based on difficulty while maintaining or improving accuracy.

Early Stopping Chain-of-thoughts in Large Language Models

cs.CL · 2025-09-17 · conditional · novelty 5.0

ES-CoT shortens LLM chain-of-thought generation by tracking runs of identical step answers after linguistic markers, cutting tokens 16% on average while keeping accuracy comparable to full CoT across six datasets and three models.

Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization

cs.AI · 2025-08-13 · unverdicted · novelty 5.0

LCPO reduces average LRM output length by over 50% across benchmarks via targeted preference optimization while preserving reasoning performance.

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

cs.CL · 2025-03-20 · accept · novelty 5.0

A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

Measuring AI Reasoning: A Guide for Researchers

cs.AI · 2026-05-04 · unverdicted · novelty 4.0

Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.

citing papers explorer

Showing 15 of 15 citing papers.

Respecting Self-Uncertainty in On-Policy Self-Distillation for Efficient LLM Reasoning cs.AI · 2026-05-13 · unverdicted · none · ref 13
EGRSD and CL-EGRSD advance the accuracy-length frontier in LLM reasoning by entropy-guided weighting of token-level distillation signals from the teacher.
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost cs.AI · 2026-05-07 · conditional · none · ref 108
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
TrigReason: Trigger-Based Collaboration between Small and Large Reasoning Models cs.AI · 2026-04-16 · unverdicted · none · ref 16
TrigReason matches large reasoning model accuracy on math and science benchmarks by delegating most steps to small models and intervening selectively on three triggers, cutting latency by 43.9% and cost by 73.3%.
Unified Data Selection for LLM Reasoning cs.CL · 2026-05-21 · unverdicted · none · ref 23
High-Entropy Sum (HES) selects high-quality reasoning data for LLMs by summing entropy of the top highest-entropy tokens, matching full-dataset performance with top 20% in SFT and outperforming baselines in RFT and RL.
TCARD: Nearly Balanced Two-Level Designs with Treatment Cardinality Constraints with an Application to LLM Prompt Engineering stat.ME · 2026-05-20 · unverdicted · none · ref 57
Proposes nearly balanced TCARDs that minimize the first two generalized word-length pattern components, defines Φ_BCD criterion linked to classical optimality, and constructs designs via coordinate exchange with simulation-calibrated weights for LLM prompt engineering.
Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models cs.CL · 2026-05-17 · unverdicted · none · ref 25
PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.
Nice Fold or Hero Call: Learning Budget-Efficient Thinking for Adaptive Reasoning cs.AI · 2026-05-12 · unverdicted · none · ref 49
BET reduces reasoning tokens by about 55% on average while improving performance across benchmarks by learning to short-solve easy queries, fold early on unsolvable ones, and preserve budget for hard solvable queries.
Implicit Compression Regularization: Concise Reasoning via Internal Shorter Distributions in RL Post-Training cs.AI · 2026-05-08 · unverdicted · none · ref 27
ICR creates a virtual shorter distribution from shortest correct on-policy responses to regularize RL post-training toward concise yet accurate reasoning, improving the accuracy-length Pareto frontier on math and knowledge benchmarks.
CRISP: Compressed Reasoning via Iterative Self-Policy Distillation cs.LG · 2026-03-05 · conditional · none · ref 23
CRISP achieves 57-59% token reduction on MATH-500 with 9-16 point accuracy gains on Qwen3 models via iterative self-distillation of concise reasoning behavior.
Reasoning Compression with Mixed-Policy Distillation cs.AI · 2026-05-09 · unverdicted · none · ref 7
Mixed-Policy Distillation transfers concise reasoning behavior from larger to smaller LLMs by having the teacher compress student-generated trajectories, cutting token usage up to 27% while raising benchmark scores.
SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking cs.AI · 2026-04-09 · unverdicted · none · ref 38
SAT reduces reasoning tokens by up to 40% across multiple large reasoning models and benchmarks by adaptively pruning steps based on difficulty while maintaining or improving accuracy.
Early Stopping Chain-of-thoughts in Large Language Models cs.CL · 2025-09-17 · conditional · none · ref 14
ES-CoT shortens LLM chain-of-thought generation by tracking runs of identical step answers after linguistic markers, cutting tokens 16% on average while keeping accuracy comparable to full CoT across six datasets and three models.
Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization cs.AI · 2025-08-13 · unverdicted · none · ref 13
LCPO reduces average LRM output length by over 50% across benchmarks via targeted preference optimization while preserving reasoning performance.
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models cs.CL · 2025-03-20 · accept · none · ref 204
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.
Measuring AI Reasoning: A Guide for Researchers cs.AI · 2026-05-04 · unverdicted · none · ref 133
Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.

arXiv preprint arXiv:2502.18600 , year=

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer