arXiv preprint arXiv:2603.08747

Diagnosing FP4 inference: a layer-wise, block-wise sensitivity analysis of NVFP4, MXFP4 · arXiv 2603.08747

1 Pith paper cites this work. Polarity classification is still being indexed.

Representative citing papers

cs.LG · 2026-05-11 · unverdicted · novelty 6.0 · 3 refs

Weight gradient FP4 quantization drives LLM pretraining divergence, which deterministic Hadamard rotations can stabilize on native MXFP4 hardware.

Pretraining large language models with MXFP4 on Native FP4 Hardware · cs.LG · 2026-05-11 · unverdicted · none · ref 12 · 3 links