Sparse-RL: Breaking the memory wall in LLM reinforcement learning via stable sparse rollouts. arXiv preprint arXiv:2401.10079

Sijia Luo, Xiaokang Zhang, Yuxuan Hu, Bohan Zhang, Ke Wang, Jinbo Su, Mengshu Sun, Lei Liang, Jing Zhang · arXiv 2401.10079

1 Pith paper cites this work. Polarity classification is still being indexed.



representative citing papers

How to Compress KV Cache in RL Post-Training? Shadow Mask Distillation for Memory-Efficient Alignment

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

Shadow Mask Distillation enables KV cache compression in RL post-training of LLMs by mitigating amplified off-policy bias that defeats standard importance reweighting.

citing papers explorer

Showing 1 of 1 citing paper.

How to Compress KV Cache in RL Post-Training? Shadow Mask Distillation for Memory-Efficient Alignment · cs.LG · 2026-05-07 · unverdicted · none · ref 12
