MxGLUT introduces a reconfigurable, LUT-centric broadcast-dataflow accelerator whose mixed-precision LUT-based PEs unify FP8-INT4 and FP8-FP8 GEMM without separate FP datapaths. It reports up to 2.16x prefill speedup and an area efficiency of 0.492 TFLOPS/mm² in 28 nm synthesis.
A survey of low-bit large language models: Basics, systems, and algorithms
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Empirical tests show 8-bit weight-only quantization is lossless on both models while 4-bit works for the 7B but harms the 1B on reasoning/math/code tasks, and 2-bit or lower settings collapse performance.
citing papers explorer
- An Empirical Study of OpenPangu Quantization on Ascend NPUs