Title resolution pending

Reinforcement learning fine-tuning enhances activation intensity and diversity in the internal circuitry of LLMs · arXiv 2509.21044

1 Pith paper cites this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

OISD: On-Policy Internal Self-Distillation of Language Models

cs.LG · 2026-05-27 · unverdicted · novelty 6.0

OISD improves mathematical reasoning in language models by using the final layer as an internal teacher to align logits and attention patterns in selected intermediate layers via signed advantage-weighted Jensen-Shannon divergence during GRPO optimization.

citing papers explorer

Showing 1 of 1 citing paper after filters.

OISD: On-Policy Internal Self-Distillation of Language Models cs.LG · 2026-05-27 · unverdicted · none · ref 2
