Title resolution pending

URL: https://arxiv · arXiv 2402.02255

1 Pith paper cites this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability

cs.LG · 2026-06-17 · unverdicted · novelty 5.0

STARE applies surprisal-guided token-level advantage reweighting plus a target-entropy gate to stabilize entropy in GRPO RL for LLMs, yielding stable training and 4-8% gains on AIME24/25 over baselines.

citing papers explorer

Showing 1 of 1 citing paper after filters.

STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability

cs.LG · 2026-06-17 · unverdicted · none · ref 28

STARE applies surprisal-guided token-level advantage reweighting plus a target-entropy gate to stabilize entropy in GRPO RL for LLMs, yielding stable training and 4-8% gains on AIME24/25 over baselines.
