Unified FP8: Moving beyond mixed precision for stable and accelerated MoE RL

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields

stat.ML 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

AIS: Adaptive Importance Sampling for Quantized RL

stat.ML · 2026-05-13 · unverdicted · novelty 7.0

AIS adaptively corrects non-stationary policy gradient bias in quantized LLM RL, matching BF16 performance while retaining 1.5-2.76x FP8 rollout speedup.

citing papers explorer

Showing 1 of 1 citing paper.

  • AIS: Adaptive Importance Sampling for Quantized RL stat.ML · 2026-05-13 · unverdicted · none · ref 17