Sample more to think less: Group filtered policy optimization for concise reasoning. arXiv preprint arXiv:2508.09726

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 3 · baseline: 1

years

2026: 8 · 2025: 2

representative citing papers

MEMENTO: Teaching LLMs to Manage Their Own Context

cs.AI · 2026-04-10 · unverdicted · novelty 6.0

MEMENTO trains LLMs to segment reasoning into blocks, generate mementos as dense summaries, and reason forward using only mementos and KV states, cutting peak KV cache by ~2.5x while preserving benchmark accuracy.

Entropy After </Think> for reasoning model early exiting

cs.LG · 2025-09-30 · unverdicted · novelty 6.0

Entropy After </Think> (EAT) enables early exiting in reasoning LLMs by tracking entropy stabilization after a </think> token, cutting token use 12-22% on MATH500 and AIME2025 with no accuracy loss.
