arXiv: 2605.09825 · 3 revisions
Pretraining large language models with MXFP4 on Native FP4 Hardware