Title resolution pending

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L · 2022

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

browse 3 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

cs.LG · 2025-12-05 · unverdicted · novelty 6.0

Entropy Ratio Clipping introduces a global entropy-ratio constraint that stabilizes RL policy updates in LLM post-training beyond local PPO clipping.

CE-GPPO: Coordinating Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

cs.LG · 2025-09-25 · unverdicted · novelty 6.0

CE-GPPO preserves bounded gradients from clipped tokens in PPO to regulate entropy evolution and improve performance on mathematical reasoning benchmarks.

How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework

cs.CL · 2026-05-22

citing papers explorer

Showing 3 of 3 citing papers.

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning cs.LG · 2025-12-05 · unverdicted · none · ref 13
Entropy Ratio Clipping introduces a global entropy-ratio constraint that stabilizes RL policy updates in LLM post-training beyond local PPO clipping.
CE-GPPO: Coordinating Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning cs.LG · 2025-09-25 · unverdicted · none · ref 15
CE-GPPO preserves bounded gradients from clipped tokens in PPO to regulate entropy evolution and improve performance on mathematical reasoning benchmarks.
How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework cs.CL · 2026-05-22 · unreviewed · ref 145

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer