Template-Based Posit Multiplication for Training and Inferring in Neural Networks
Pith reviewed 2026-05-25 00:31 UTC · model grok-4.3
The pith
The paper reports what it describes as the first neural-network training carried out in posit arithmetic, with promising accuracy on a binary classification problem.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims to present, to the best of the authors' knowledge, the first instance of training a neural network using the posit number format, along with a new multiplication algorithm that handles the zero-exponent case. It also reports that 8-bit posits achieve floating-point-level accuracy on MNIST inference while losing some accuracy on CIFAR-10.
What carries the argument
The template-based posit multiplication algorithm integrated into FloPoCo, which supports configurations with zero exponent bits to allow fast sigmoid computation.
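To make the zero-exponent case concrete, here is a minimal Python sketch of decoding a posit bit pattern and of the bit-level sigmoid approximation described by Gustafson and Yonemoto for posits with no exponent bits (flip the sign bit, then shift right by two). It is presumably the shortcut the paper refers to, but that is an assumption. The function names `decode_posit` and `fast_sigmoid_bits` are hypothetical and are not taken from the paper or from FloPoCo.

```python
from fractions import Fraction

def decode_posit(bits, n=8, es=0):
    """Decode an n-bit posit pattern into an exact Fraction (None for NaR)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # NaR ("not a real")
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask  # two's complement yields the magnitude pattern
    body = format(bits & (mask >> 1), f"0{n - 1}b")  # the n-1 bits after the sign
    first = body[0]
    run = len(body) - len(body.lstrip(first))  # length of the regime run
    k = run - 1 if first == "1" else -run
    tail = body[run + 1:]  # skip the bit that terminates the regime
    exp_str, frac_str = tail[:es], tail[es:]
    # Exponent bits that fall off the end of the word count as zeros.
    exponent = int(exp_str.ljust(es, "0"), 2) if es else 0
    frac = Fraction(int(frac_str, 2), 1 << len(frac_str)) if frac_str else Fraction(0)
    value = (1 + frac) * Fraction(2) ** (k * (1 << es) + exponent)
    return -value if negative else value

def fast_sigmoid_bits(bits, n=8):
    """Approximate sigmoid on a posit<n,0> pattern: flip the sign bit, shift right by 2."""
    mask = (1 << n) - 1
    return ((bits ^ (1 << (n - 1))) & mask) >> 2

if __name__ == "__main__":
    for pattern in (0xC0, 0x00, 0x40):  # -1, 0 and 1 in posit<8,0>
        x = decode_posit(pattern)
        y = decode_posit(fast_sigmoid_bits(pattern))
        print(f"x = {float(x):+.2f}  approx sigmoid = {float(y):.4f}")
```

For inputs -1, 0 and 1 the approximation gives 0.25, 0.5 and 0.75, against exact sigmoid values of roughly 0.269, 0.5 and 0.731. The trick depends on there being no exponent field, which is why a multiplier that supports that configuration matters here.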
If this is right
- Posit multipliers can be synthesized and their resource usage compared to floating point multipliers.
- Neural network training can proceed using posit format for all arithmetic operations including gradients.
- 8-bit posit numbers are sufficient to match floating point accuracy for MNIST classification.
- Smaller posit configurations still allow effective training for simple binary classification tasks.
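As a rough illustration of what substituting 8-bit posits at inference time involves, the sketch below rounds ordinary floats to the nearest posit value and compares a dot product before and after. It assumes the `<8,0>` configuration (8 bits, no exponent bits); the page does not say which 8-bit configuration the paper used for MNIST, and the helper names `posit8_values` and `quantize` are hypothetical.

```python
import math
from bisect import bisect_left

def posit8_values():
    """Sorted non-negative posit<8,0> values; index i is the value of bit pattern i."""
    values = [0.0]
    for pattern in range(1, 128):
        body = format(pattern, "07b")  # the 7 bits after the sign bit
        first = body[0]
        run = len(body) - len(body.lstrip(first))  # regime run length
        k = run - 1 if first == "1" else -run
        frac_bits = body[run + 1:]  # skip the regime terminator
        frac = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
        values.append((1.0 + frac) * 2.0 ** k)
    return values

TABLE = posit8_values()

def quantize(x):
    """Round a float to the nearest posit<8,0> value (value-level model)."""
    if x == 0.0:
        return 0.0
    # Posits saturate: nonzero inputs never round to zero or beyond maxpos.
    mag = min(max(abs(x), TABLE[1]), TABLE[-1])
    i = bisect_left(TABLE, mag)
    lo, hi = TABLE[i - 1], TABLE[i]
    d_lo, d_hi = mag - lo, hi - mag
    # Ties go to the neighbour whose bit pattern is even.
    pick_lo = d_lo < d_hi or (d_lo == d_hi and (i - 1) % 2 == 0)
    return math.copysign(lo if pick_lo else hi, x)

if __name__ == "__main__":
    weights = [0.3, -1.27, 0.051, 2.6]   # made-up numbers, not from the paper
    inputs = [0.9, 0.4, -0.75, 0.125]
    exact = sum(w * a for w, a in zip(weights, inputs))
    approx = sum(quantize(w) * quantize(a) for w, a in zip(weights, inputs))
    print(f"float dot product: {exact:.6f}")
    print(f"posit<8,0> inputs: {approx:.6f}")
```

This sketch quantizes only the operands and accumulates in floating point; a full posit pipeline would also round the products and sums, which it does not model.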
Where Pith is reading between the lines
- Posit training could potentially lower the energy cost of machine learning models on specialized hardware.
- The approach might be tested on deeper networks or other datasets to see if the accuracy advantages hold.
- Hardware designers could explore using the zero-exponent posit mode for other non-linear functions beyond sigmoid.
Load-bearing premise
That the posit format can be directly substituted for floating point in existing neural network training code without any modifications to the optimization process or loss functions.
What would settle it
Repeating the binary classification training experiment with the posit configurations described in the paper and checking whether the accuracy reaches the reported levels.
Original abstract
The posit number system is arguably the most promising and discussed topic in Arithmetic nowadays. The recent breakthroughs claimed by the format proposed by John L. Gustafson have put posits in the spotlight. In this work, we first describe an algorithm for multiplying two posit numbers, even when the number of exponent bits is zero. This configuration, scarcely tackled in literature, is particularly interesting because it allows the deployment of a fast sigmoid function. The proposed multiplication algorithm is then integrated as a template into the well-known FloPoCo framework. Synthesis results are shown to compare with the floating point multiplication offered by FloPoCo as well. Second, the performance of posits is studied in the scenario of Neural Networks in both training and inference stages. To the best of our knowledge, this is the first time that training is done with posit format, achieving promising results for a binary classification problem even with reduced posit configurations. In the inference stage, 8-bit posits are as good as floating point when dealing with the MNIST dataset, but lose some accuracy with CIFAR-10.
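The abstract does not spell out the multiplication algorithm, so the following is only a software reference model for the zero-exponent-bit case, not the paper's template or its FloPoCo implementation. It splits each operand into sign, regime and fraction, multiplies the significands as integers, adds the regime values, and then re-encodes by a brute-force search for the nearest representable value. All function names are hypothetical, and the value-level rounding is an assumption that may differ from a hardware rounding scheme in corner cases.

```python
from fractions import Fraction

def fields(bits, n=8):
    """Split a nonzero, non-NaR posit<n,0> pattern into (sign, k, frac, frac_len)."""
    mask = (1 << n) - 1
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # two's complement gives the magnitude pattern
    body = format(bits & (mask >> 1), f"0{n - 1}b")
    first = body[0]
    run = len(body) - len(body.lstrip(first))  # regime run length
    k = run - 1 if first == "1" else -run
    frac_str = body[run + 1:]  # skip the regime terminator
    return sign, k, (int(frac_str, 2) if frac_str else 0), len(frac_str)

def value(bits, n=8):
    """Exact value of a nonzero, non-NaR posit<n,0> pattern."""
    sign, k, frac, flen = fields(bits, n)
    v = Fraction((1 << flen) | frac, 1 << flen) * Fraction(2) ** k
    return -v if sign else v

def encode_nearest(x, n=8):
    """Positive pattern closest to x > 0; ties go to the even bit pattern."""
    # Searching only minpos..maxpos makes out-of-range results saturate.
    return min(range(1, 1 << (n - 1)),
               key=lambda p: (abs(value(p, n) - x), p & 1))

def multiply(a, b, n=8):
    """Multiply two posit<n,0> bit patterns and return the result pattern."""
    mask, nar = (1 << n) - 1, 1 << (n - 1)
    if a == nar or b == nar:
        return nar
    if a == 0 or b == 0:
        return 0
    sa, ka, fa, la = fields(a, n)
    sb, kb, fb, lb = fields(b, n)
    # Integer significands include the hidden leading 1; the product lies in [1, 4).
    sig = ((1 << la) | fa) * ((1 << lb) | fb)
    exact = Fraction(sig, 1 << (la + lb)) * Fraction(2) ** (ka + kb)
    mag = encode_nearest(exact, n)
    return (-mag) & mask if sa ^ sb else mag

if __name__ == "__main__":
    # 1.5 * 2.0 in posit<8,0>: 0x50 * 0x60
    print(hex(multiply(0x50, 0x60)))
```

A hardware multiplier would normalize the product and round directly on the bit pattern instead of searching a table; the search is used here only to keep the model short and easy to check.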
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a template-based algorithm for multiplying posit numbers (including the zero-exponent-bit case to enable a fast sigmoid), integrated into the FloPoCo framework with synthesis comparisons to floating-point multipliers. It further reports the first use of posit arithmetic for neural-network training, claiming promising accuracy on a binary classification task even with reduced posit configurations, and shows that 8-bit posits match floating-point accuracy on MNIST inference while losing some accuracy on CIFAR-10.
Significance. If the experimental claims hold after details are supplied, the work would provide the first concrete evidence that posit arithmetic can be dropped into unmodified training loops (forward pass, gradients, weight updates) while preserving convergence, plus a reusable FloPoCo template for posit hardware. These are potentially high-impact contributions for low-precision NN accelerators.
major comments (2)
- [Neural-network experiments] Neural-network experiments section: the central claim that posit arithmetic was successfully substituted into standard training (including gradient computation and updates) and produced 'promising results' for binary classification rests on unspecified network topologies, exact posit formats (total bits and exponent bits, e.g. <8,0>), optimizer/loss settings, and quantitative accuracy tables with error bars or training curves. Without these, reproducibility and verification of the 'first time training with posit' result are impossible.
- [Inference results] Inference results paragraph: the statements that '8-bit posits are as good as floating point' on MNIST and 'lose some accuracy' on CIFAR-10 are presented without the underlying network sizes, posit parameter settings, or side-by-side accuracy numbers, making it impossible to assess whether the comparison is load-bearing for the overall posit-for-NN thesis.
minor comments (1)
- [Synthesis results] The synthesis results table would benefit from explicit column headers clarifying which rows correspond to the zero-exponent-bit posit configuration.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. We agree that the neural-network experiments and inference results sections lack the necessary specifics for reproducibility and will revise the manuscript to address this.
Point-by-point responses
- Referee: [Neural-network experiments] Neural-network experiments section: the central claim that posit arithmetic was successfully substituted into standard training (including gradient computation and updates) and produced 'promising results' for binary classification rests on unspecified network topologies, exact posit formats (total bits and exponent bits, e.g. <8,0>), optimizer/loss settings, and quantitative accuracy tables with error bars or training curves. Without these, reproducibility and verification of the 'first time training with posit' result are impossible.
  Authors: We agree that these details are essential and were omitted. In the revised manuscript we will specify the network topology for the binary classification task, the exact posit formats (total bits and exponent bits), the optimizer and loss function, and provide quantitative accuracy tables with error bars or training curves from multiple runs.
  Revision: yes
- Referee: [Inference results] Inference results paragraph: the statements that '8-bit posits are as good as floating point' on MNIST and 'lose some accuracy' on CIFAR-10 are presented without the underlying network sizes, posit parameter settings, or side-by-side accuracy numbers, making it impossible to assess whether the comparison is load-bearing for the overall posit-for-NN thesis.
  Authors: We acknowledge the omission. The revision will describe the network architectures and sizes for MNIST and CIFAR-10, the precise 8-bit posit configurations, and include a table with side-by-side accuracy numbers versus floating-point baselines.
  Revision: yes
Circularity Check
No circularity: implementation and empirical tests build on external Gustafson posit definition
The paper presents a template-based posit multiplier (including the zero-exponent case) derived from the external posit format definition of Gustafson, integrates it into the independent FloPoCo framework, and reports empirical NN training/inference results. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or self-definition; the training claim is an experimental substitution of posit arithmetic into standard loops, not a derived quantity forced by the paper's own inputs. The work is self-contained against external benchmarks.