Efficient Memory Management for Large Language Model Serving with PagedAttention

· 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

cs.LG · 2026-01-26 · unverdicted · novelty 6.0

FP8-RL delivers up to 44% faster rollouts in LLM RL by using blockwise FP8 quantization, KV-cache recalibration, and importance-sampling corrections while keeping learning behavior close to BF16 baselines.

citing papers explorer

Showing 1 of 1 citing paper.

FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

cs.LG · 2026-01-26 · unverdicted · none · ref 3

FP8-RL delivers up to 44% faster rollouts in LLM RL by using blockwise FP8 quantization, KV-cache recalibration, and importance-sampling corrections while keeping learning behavior close to BF16 baselines.
