Ten-Four delivers a fused mixed-precision dot-product unit for open-source GPGPUs that runs in 4 cycles at 262 MHz, matches NVIDIA numerical accuracy, and uses less than 60% of the area of a prior open implementation while delivering 3.1x higher throughput.
1 Pith paper cites this work. Polarity classification is still being indexed.
fields: cs.AR (1)
years: 2025 (1)
verdicts: CONDITIONAL (1)
representative citing papers
Ten-Four: An Open-Source Fused Dot Product Unit for Mixed-Precision GPGPU Tensor Cores