A 28nm digital CIM accelerator for FP8 uses on-the-fly shift-aware bitwidth prediction, FIFO alignment, and scalable MACs to reach 20.4 TFLOPS/W and 2.8x better efficiency than prior work while supporting variable mantissa widths.
16.4 an 89tops/w and 16.3 tops/mm 2 all-digital sram-based full-precision compute-in memory macro in 22nm for machine-learning edge applications
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Balancing FP8 Computation Accuracy and Efficiency on Digital CIM via Shift-Aware On-the-fly Aligned-Mantissa Bitwidth Prediction
A 28nm digital CIM accelerator for FP8 uses on-the-fly shift-aware bitwidth prediction, FIFO alignment, and scalable MACs to reach 20.4 TFLOPS/W and 2.8x better efficiency than prior work while supporting variable mantissa widths.