arXiv preprint arXiv:2503.05179

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching · 2025 · arXiv 2503.05179

9 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

Unified Data Selection for LLM Reasoning

cs.CL · 2026-05-21 · unverdicted · novelty 6.0

High-Entropy Sum (HES) selects high-quality reasoning data for LLMs by summing the entropy of the highest-entropy tokens, matching full-dataset performance with only the top 20% of the data in SFT and outperforming baselines in RFT and RL.

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

cs.AI · 2026-04-22 · unverdicted · novelty 6.0

HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.

ZoomR: Memory Efficient Reasoning through Multi-Granularity Key Value Retrieval

cs.LG · 2026-04-13 · unverdicted · novelty 6.0

ZoomR reduces KV cache memory by more than 4x during long-output reasoning by using summary keys for coarse indexing and dynamic fine-grained retrieval.

FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models

cs.AI · 2026-04-03 · unverdicted · novelty 6.0

Errors in large reasoning models form a forest structure that grows with more steps, making the first solution best; RED refines the first and prunes the rest for higher performance with less compute.

DeepPrune: Parallel Scaling without Inter-trace Redundancy

cs.CL · 2025-10-09 · conditional · novelty 5.0

DeepPrune prunes redundant parallel CoT traces via a judge model for equivalence prediction from partial traces plus online greedy clustering, delivering 65-88% token savings with accuracy within 3 points on AIME and GPQA benchmarks.

Exploring the System 1 Thinking Capability of Large Reasoning Models

cs.CL · 2025-04-14 · unverdicted · novelty 5.0

LRMs underperform on simple system 1 questions in both accuracy and efficiency, with problem difficulty implicitly encoded in early hidden states.

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

cs.CL · 2025-03-20 · accept · novelty 5.0

A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

Towards Efficient Large Language Reasoning Models via Extreme-Ratio Chain-of-Thought Compression

cs.LG · 2026-02-09

citing papers explorer

Showing 9 of 9 citing papers.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost · cs.AI · 2026-05-07 · conditional · none · ref 114
Unified Data Selection for LLM Reasoning · cs.CL · 2026-05-21 · unverdicted · none · ref 24
HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering · cs.AI · 2026-04-22 · unverdicted · none · ref 76
ZoomR: Memory Efficient Reasoning through Multi-Granularity Key Value Retrieval · cs.LG · 2026-04-13 · unverdicted · none · ref 1
FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models · cs.AI · 2026-04-03 · unverdicted · none · ref 1
DeepPrune: Parallel Scaling without Inter-trace Redundancy · cs.CL · 2025-10-09 · conditional · none · ref 2
Exploring the System 1 Thinking Capability of Large Reasoning Models · cs.CL · 2025-04-14 · unverdicted · none · ref 1
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models · cs.CL · 2025-03-20 · accept · none · ref 6
Towards Efficient Large Language Reasoning Models via Extreme-Ratio Chain-of-Thought Compression · cs.LG · 2026-02-09 · unreviewed · ref 2
