pith. sign in

Reshaping reason- ing in llms: A theoretical analysis of rl training dynamics through pattern selection.arXiv preprint arXiv:2506.04695, 2025

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields

cs.AI 1 cs.LG 1

years

2026 2

verdicts

UNVERDICTED 2

clear filters

representative citing papers

Not only where, But when: Temporal Scheduling for RLVR

cs.LG · 2026-05-25 · unverdicted · novelty 7.0

Temporal scheduling of credit allocation criteria over RLVR training, using trajectory percentiles to target heterogeneous behaviors, yields more stable policy entropy and better reasoning benchmark results than static allocation.

citing papers explorer

Showing 1 of 1 citing paper after filters.