Title resolution pending
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.AR (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs
A distributed arithmetic algorithm for constant matrix-vector multiplication (CMVM) on FPGAs reduces area by up to one third while also lowering latency for quantized neural networks; it is integrated into hls4ml.