A 28-nm 64-kb 31.6-TFLOPS/W Digital-Domain Floating-Point-Computing-Unit and Double-Bit 6T-SRAM Computing-in-Memory Macro for Floating-Point CNNs
Cited by 1 Pith paper (field: cs.AR; year: 2026; verdict: unverdicted):
Balancing FP8 Computation Accuracy and Efficiency on Digital CIM via Shift-Aware On-the-fly Aligned-Mantissa Bitwidth Prediction
A 28nm digital CIM accelerator for FP8 uses on-the-fly shift-aware bitwidth prediction, FIFO alignment, and scalable MACs to reach 20.4 TFLOPS/W and 2.8x better efficiency than prior work while supporting variable mantissa widths.