LLMs learn self-regulated summarization of chain-of-thought steps via RL, allowing compressed Fold inference to reach the same accuracy as exhaustive Unfold mode with far lower token overhead.
The number of solutions is given by the combination formula \( \binom{8 + 3 - 1}{3 - 1} = \binom{10}{2} = 45 \).
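The formula above has the shape of a stars-and-bars count. As a minimal sketch, assuming the underlying problem is counting non-negative integer solutions of x1 + x2 + x3 = 8 (the source does not state the problem, so this reading is an assumption), the value 45 can be checked by brute force. The function names are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import comb


def count_by_formula(total: int, parts: int) -> int:
    """Stars-and-bars count: C(total + parts - 1, parts - 1)."""
    return comb(total + parts - 1, parts - 1)


def count_by_enumeration(total: int, parts: int) -> int:
    """Enumerate every tuple of non-negative integers and keep those summing to total."""
    return sum(
        1
        for xs in product(range(total + 1), repeat=parts)
        if sum(xs) == total
    )


# Assumed instance: three non-negative variables summing to 8.
formula_count = count_by_formula(8, 3)
brute_count = count_by_enumeration(8, 3)
print(formula_count, brute_count)
```

Both counts should agree if the assumed reading of the problem is right; the enumeration is only practical for small inputs like this one.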
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning