The model exactly recovers the data distribution

Density-estimation pole ($q = 1$): $\theta^*_j(1) = \alpha_j$
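The fixed point quoted above can be checked numerically. The sketch below assumes that the q = 1 pole corresponds to the ordinary log loss (the q → 1 limit of the Tsallis q-logarithm), that θ is a categorical distribution parameterised by softmax logits, and that α is the data distribution. None of this is spelled out in the snippet, and the names `softmax`, `fit_log_loss`, `alpha`, and `theta_star` are illustrative only.

```python
import math

def softmax(logits):
    """Map unconstrained logits to a probability vector."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fit_log_loss(alpha, steps=5000, lr=0.5):
    """Minimise the cross-entropy -sum_j alpha_j * log(theta_j)
    by plain gradient descent on the logits (assumed setup)."""
    logits = [0.0] * len(alpha)
    for _ in range(steps):
        theta = softmax(logits)
        # For a softmax model and sum(alpha) == 1, the gradient of the
        # cross-entropy with respect to logit j is theta_j - alpha_j.
        logits = [z - lr * (t - a) for z, t, a in zip(logits, theta, alpha)]
    return softmax(logits)

# Hypothetical data distribution over three outcomes.
alpha = [0.6, 0.3, 0.1]
theta_star = fit_log_loss(alpha)
print(theta_star)
```

Under these assumptions the fitted `theta_star` should agree with `alpha` up to numerical tolerance, which is the sense in which the model "recovers the data distribution" at this pole.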

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum

cs.LG · 2026-04-28 · unverdicted · novelty 7.0 · 2 refs

A single-parameter Tsallis loss continuum unifies SFT and RLVR, derives time-to-escape bounds for cold start, and yields GARL and PAFT estimators that improve performance on QA reasoning tasks.
