com/p/28476703733

Jiacai Liu · arXiv p/2847670

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

cs.LG · 2026-04-29 · unverdicted · novelty 6.0 · 2 refs

Entrocraft uses rejection sampling to enforce precise entropy schedules in LLM RL by biasing advantages, enabling longer training, better generalization, and higher performance than baselines.

Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective

cs.LG · 2025-10-11 · unverdicted · novelty 5.0

Derives a token-level entropy change approximation revealing four factors, identifies limitations in prior entropy interventions, and proposes STEER, which adaptively reweights tokens to mitigate collapse and improve performance on math and coding benchmarks.

citing papers explorer

Showing 2 of 2 citing papers.

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

cs.LG · 2026-04-29 · unverdicted · none · ref 15 · 2 links

Entrocraft uses rejection sampling to enforce precise entropy schedules in LLM RL by biasing advantages, enabling longer training, better generalization, and higher performance than baselines.

Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective

cs.LG · 2025-10-11 · unverdicted · none · ref 13

Derives a token-level entropy change approximation revealing four factors, identifies limitations in prior entropy interventions, and proposes STEER, which adaptively reweights tokens to mitigate collapse and improve performance on math and coding benchmarks.
