A distributed arithmetic algorithm for constant matrix-vector multiplication (CMVM) operations on FPGAs reduces area by up to one third and also lowers latency for quantized neural networks; it is integrated into hls4ml.
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
Gives poly-time algorithms for MINSUM capacity augmentation to ensure strong stability in HR with ties, proves NP-hardness for MINMAX, and bounded-increase results when ties are short.
citing papers explorer
da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs
A distributed arithmetic algorithm for constant matrix-vector multiplication (CMVM) operations on FPGAs reduces area by up to one third and also lowers latency for quantized neural networks; it is integrated into hls4ml.