
arxiv: 1907.04091 · v1 · pith:ZOC3F2RZnew · submitted 2019-07-09 · 💻 cs.CV · cs.LG

Template-Based Posit Multiplication for Training and Inferring in Neural Networks

Pith reviewed 2026-05-25 00:31 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords posit arithmetic · neural network training · posit multiplication · reduced precision computing · MNIST dataset · CIFAR-10 · hardware implementation · sigmoid function

The pith

Posit arithmetic is used, reportedly for the first time, to train a neural network, with promising results on a binary classification problem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a multiplication algorithm for posit numbers that works even when there are no exponent bits, which enables a fast sigmoid function. This is implemented as a template in the FloPoCo framework for generating hardware. The authors then use posits for neural network training and inference. They find that training with reduced posit configurations gives promising results on a binary classification problem, and that 8-bit posits match floating point performance on MNIST inference while losing some accuracy on CIFAR-10. A sympathetic reader would care because this suggests posits could replace floating point in machine learning hardware for better efficiency.

Core claim

The paper claims to present the first instance of training a neural network using the posit number format, along with a new multiplication algorithm that handles the zero-exponent case, and shows that 8-bit posits achieve floating-point level accuracy on MNIST inference.
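
To make the multiplication claim concrete, here is a minimal software reference model. It is not the paper's hardware algorithm or its FloPoCo template. It decodes two posit bit patterns exactly, multiplies the values, and picks the closest representable pattern by brute force. The function names (`decode_posit`, `encode_nearest`, `posit_mul`) are hypothetical, and rounding to the nearest value by absolute difference is a simplification of the posit standard's rounding rule.

```python
from fractions import Fraction


def decode_posit(bits, n=8, es=0):
    """Exact value of an n-bit posit pattern as a Fraction (None for NaR)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # NaR: the single exception pattern
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are decoded via two's complement
    body = bits & ((1 << (n - 1)) - 1)
    # Regime: run of identical bits right after the sign bit.
    first = (body >> (n - 2)) & 1
    run, pos = 0, n - 2
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first else -run
    remaining = max(pos, 0)  # bits left after the regime terminator
    tail = body & ((1 << remaining) - 1)
    # Exponent: up to es bits; truncated exponent bits count as zeros.
    exp_bits = min(es, remaining)
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)
    frac_bits = remaining - exp_bits
    frac = tail & ((1 << frac_bits) - 1)
    scale = k * (1 << es) + exponent
    value = (1 + Fraction(frac, 1 << frac_bits)) * Fraction(2) ** scale
    return -value if sign else value


def encode_nearest(value, n=8, es=0):
    """Brute-force search for the closest pattern (assumption: simplified rounding)."""
    best_dist, best_bits = None, 0
    for p in range(1 << n):
        v = decode_posit(p, n, es)
        if v is None:
            continue  # never round to NaR
        if p == 0 and value != 0:
            continue  # a nonzero result is never rounded to zero
        dist = abs(v - value)
        if best_dist is None or dist < best_dist:
            best_dist, best_bits = dist, p
    return best_bits


def posit_mul(a, b, n=8, es=0):
    """Reference product of two posit patterns; NaR propagates."""
    va, vb = decode_posit(a, n, es), decode_posit(b, n, es)
    if va is None or vb is None:
        return 1 << (n - 1)
    return encode_nearest(va * vb, n, es)
```

With n = 8 and es = 0, `posit_mul(0x50, 0x60)` multiplies 1.5 by 2 and returns `0x68`, the pattern for 3. Products outside the representable range saturate at the largest posit instead of overflowing.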

What carries the argument

The template-based posit multiplication algorithm integrated into FloPoCo, which supports configurations with zero exponent bits to allow fast sigmoid computation.
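
The zero-exponent-bit case matters because of a known bit-level trick described by Gustafson and Yonemoto: with es = 0, flipping the sign bit of a posit and shifting the pattern right by two places approximates the logistic sigmoid. The sketch below illustrates that trick in software for 8-bit posits. The helper names (`decode_es0`, `fast_sigmoid`) are hypothetical, and the code is not taken from the paper's implementation.

```python
import math


def decode_es0(bits, n=8):
    """Value of an n-bit posit with zero exponent bits (NaN stands in for NaR)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return math.nan
    negative = bits >> (n - 1)
    if negative:
        bits = (-bits) & mask  # two's complement before reading the fields
    first = (bits >> (n - 2)) & 1
    run, pos = 0, n - 2
    while pos >= 0 and ((bits >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first else -run
    frac_bits = max(pos, 0)  # bits below the regime terminator
    frac = bits & ((1 << frac_bits) - 1)
    value = (1 + frac / (1 << frac_bits)) * 2.0 ** k
    return -value if negative else value


def fast_sigmoid(bits, n=8):
    """Approximate sigmoid on the raw pattern: flip the sign bit, shift right by 2."""
    mask = (1 << n) - 1
    return ((bits ^ (1 << (n - 1))) & mask) >> 2


if __name__ == "__main__":
    for pattern in (0xC0, 0x00, 0x40, 0x60):  # -1, 0, 1, 2
        x = decode_es0(pattern)
        approx = decode_es0(fast_sigmoid(pattern))
        exact = 1.0 / (1.0 + math.exp(-x))
        print(f"x={x:+.2f}  fast={approx:.4f}  exact={exact:.4f}")
```

The appeal for hardware is that the approximation needs only an inverter and wiring, with no multiplier or lookup table. The price is a coarse result, so whether it is accurate enough for a given network is an empirical question.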

If this is right

  • Posit multipliers can be synthesized and their resource usage compared to floating point multipliers.
  • Neural network training can proceed using posit format for all arithmetic operations including gradients.
  • 8-bit posit numbers are sufficient to match floating point accuracy for MNIST classification.
  • Smaller posit configurations still allow effective training for simple binary classification tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Posit training could potentially lower the energy cost of machine learning models on specialized hardware.
  • The approach might be tested on deeper networks or other datasets to see if the reported accuracy holds.
  • Hardware designers could explore using the zero-exponent posit mode for other non-linear functions beyond sigmoid.

Load-bearing premise

That the posit format can be directly substituted for floating point in existing neural network training code without any modifications to the optimization process or loss functions.

What would settle it

Repeating the binary classification training experiment with the posit configurations described and checking whether the accuracy reaches the levels reported in the paper.

Figures

Figures reproduced from arXiv: 1907.04091 by Alberto A. Del Barrio, Guillermo Botella, Raúl Murillo Montero.

Figure 1. Layout of an ⟨n, es⟩ posit number.
Figure 2. Generation of synthesizable VHDL from C++ code with FloPoCo.
Figure 3. Distributions of posit values and NN weights.
Figure 4. Classification problem for posit training.
Figure 5. Loss function along the NN training.
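
For readers without the figures at hand, the ⟨n, es⟩ layout named in the Figure 1 caption corresponds to the standard posit value formula, restated here from the general posit definition and not from this page. The symbols s (sign), k (regime value), e (exponent field), and f (fraction) are notation assumed for this sketch. For negative posits the fields are read after taking the two's complement of the pattern.

```latex
\[
  \mathit{useed} = 2^{2^{es}}, \qquad
  x = (-1)^{s} \cdot \mathit{useed}^{\,k} \cdot 2^{e} \cdot (1 + f),
  \qquad 0 \le f < 1
\]
\[
  es = 0:\quad x = (-1)^{s}\, 2^{k}\, (1 + f), \qquad
  \mathit{maxpos} = \mathit{useed}^{\,n-2}, \quad
  \mathit{minpos} = \mathit{useed}^{-(n-2)}
\]
```

For the ⟨8, 0⟩ configuration used in the inference experiments, this gives useed = 2, so nonzero magnitudes span 1/64 to 64.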
Original abstract

The posit number system is arguably the most promising and discussed topic in Arithmetic nowadays. The recent breakthroughs claimed by the format proposed by John L. Gustafson have put posits in the spotlight. In this work, we first describe an algorithm for multiplying two posit numbers, even when the number of exponent bits is zero. This configuration, scarcely tackled in literature, is particularly interesting because it allows the deployment of a fast sigmoid function. The proposed multiplication algorithm is then integrated as a template into the well-known FloPoCo framework. Synthesis results are shown to compare with the floating point multiplication offered by FloPoCo as well. Second, the performance of posits is studied in the scenario of Neural Networks in both training and inference stages. To the best of our knowledge, this is the first time that training is done with posit format, achieving promising results for a binary classification problem even with reduced posit configurations. In the inference stage, 8-bit posits are as good as floating point when dealing with the MNIST dataset, but lose some accuracy with CIFAR-10.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents a template-based algorithm for multiplying posit numbers (including the zero-exponent-bit case to enable a fast sigmoid), integrated into the FloPoCo framework with synthesis comparisons to floating-point multipliers. It further reports the first use of posit arithmetic for neural-network training, claiming promising accuracy on a binary classification task even with reduced posit configurations, and shows that 8-bit posits match floating-point accuracy on MNIST inference while losing some accuracy on CIFAR-10.

Significance. If the experimental claims hold after details are supplied, the work would provide the first concrete evidence that posit arithmetic can be dropped into unmodified training loops (forward pass, gradients, weight updates) while preserving convergence, plus a reusable FloPoCo template for posit hardware. These are potentially high-impact contributions for low-precision NN accelerators.

major comments (2)
  1. [Neural-network experiments] Neural-network experiments section: the central claim that posit arithmetic was successfully substituted into standard training (including gradient computation and updates) and produced 'promising results' for binary classification rests on unspecified network topologies, exact posit formats (total bits and exponent bits, e.g. <8,0>), optimizer/loss settings, and quantitative accuracy tables with error bars or training curves. Without these, reproducibility and verification of the 'first time training with posit' result are impossible.
  2. [Inference results] Inference results paragraph: the statements that '8-bit posits are as good as floating point' on MNIST and 'lose some accuracy' on CIFAR-10 are presented without the underlying network sizes, posit parameter settings, or side-by-side accuracy numbers, making it impossible to assess whether the comparison is load-bearing for the overall posit-for-NN thesis.
minor comments (1)
  1. [Synthesis results] The synthesis results table would benefit from explicit column headers clarifying which rows correspond to the zero-exponent-bit posit configuration.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We agree that the neural-network experiments and inference results sections lack the necessary specifics for reproducibility and will revise the manuscript to address this.

Point-by-point responses
  1. Referee: [Neural-network experiments] Neural-network experiments section: the central claim that posit arithmetic was successfully substituted into standard training (including gradient computation and updates) and produced 'promising results' for binary classification rests on unspecified network topologies, exact posit formats (total bits and exponent bits, e.g. <8,0>), optimizer/loss settings, and quantitative accuracy tables with error bars or training curves. Without these, reproducibility and verification of the 'first time training with posit' result are impossible.

    Authors: We agree that these details are essential and were omitted. In the revised manuscript we will specify the network topology for the binary classification task, the exact posit formats (total bits and exponent bits), the optimizer and loss function, and provide quantitative accuracy tables with error bars or training curves from multiple runs.
    Revision: yes

  2. Referee: [Inference results] Inference results paragraph: the statements that '8-bit posits are as good as floating point' on MNIST and 'lose some accuracy' on CIFAR-10 are presented without the underlying network sizes, posit parameter settings, or side-by-side accuracy numbers, making it impossible to assess whether the comparison is load-bearing for the overall posit-for-NN thesis.

    Authors: We acknowledge the omission. The revision will describe the network architectures and sizes for MNIST and CIFAR-10, the precise 8-bit posit configurations, and include a table with side-by-side accuracy numbers versus floating-point baselines.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: implementation and empirical tests build on external Gustafson posit definition

Full rationale

The paper presents a template-based posit multiplier (including the zero-exponent case) derived from the external posit format definition of Gustafson, integrates it into the independent FloPoCo framework, and reports empirical NN training/inference results. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or self-definition; the training claim is an experimental substitution of posit arithmetic into standard loops, not a derived quantity forced by the paper's own inputs. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no free parameters, axioms, or invented entities are introduced; the work relies on the pre-existing posit format definition.

