Rethinking thinking tokens: LLMs as improvement operators. arXiv preprint arXiv:2510.01123, 2025

4 Pith papers cite this work.

fields

cs.AI: 3 · cs.CL: 1

years

2026: 3 · 2025: 1

representative citing papers

OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation

cs.AI · 2026-05-14 · conditional · novelty 6.0 · 2 refs

OpenDeepThink uses Bradley-Terry aggregation of LLM pairwise judgments to rank and evolve parallel reasoning traces, improving Gemini 3.1 Pro Codeforces Elo by 405 points over eight rounds.

Stateful Reasoning via Insight Replay

cs.AI · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

InsightReplay improves long CoT reasoning by extracting critical insights from the trace and replaying them near the active frontier, delivering a +1.65 average accuracy gain across 24 model-benchmark settings.

DeepPrune: Parallel Scaling without Inter-trace Redundancy

cs.CL · 2025-10-09 · conditional · novelty 5.0

DeepPrune prunes redundant parallel CoT traces via a judge model for equivalence prediction from partial traces plus online greedy clustering, delivering 65-88% token savings with accuracy within 3 points on AIME and GPQA benchmarks.

citing papers explorer

  • CAPS: Cascaded Adaptive Pairwise Selection for Efficient Parallel Reasoning · cs.AI · 2026-05-15 · unverdicted · none · ref 28

    CAPS is a four-stage inference-only cascade that adapts how much of each solution the verifier sees and how comparisons are distributed, halving per-candidate verifier tokens while outperforming uniform pairwise verification on most benchmarks.

  • OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation · cs.AI · 2026-05-14 · conditional · none · ref 13 · 2 links

  • Stateful Reasoning via Insight Replay · cs.AI · 2026-05-14 · unverdicted · none · ref 16 · 2 links

  • DeepPrune: Parallel Scaling without Inter-trace Redundancy · cs.CL · 2025-10-09 · conditional · none · ref 8