Quantization Effects of Artificial Neural Networks for Embedded Edge-Computing Applications

Alperen Aksoy; Andre Zambanini; Chimezie Eguzo; Christian Grewing; Fabian Hader; Ilja Bekman; Qader Dorosti; Sarah Fleitmann; Stefan van Waasen; Vesselin Dimitrov

arxiv: 2511.05479 · v3 · pith:GTOPQXKDnew · submitted 2025-11-07 · 💻 cs.NE · physics.ins-det

Quantization Effects of Artificial Neural Networks for Embedded Edge-Computing Applications

Alperen Aksoy , Ilja Bekman , Vesselin Dimitrov , Qader Dorosti , Chimezie Eguzo , Sarah Fleitmann , Christian Grewing , Fabian Hader

show 2 more authors

Andre Zambanini Stefan van Waasen

This is my paper

Pith reviewed 2026-05-22 12:51 UTC · model grok-4.3

classification 💻 cs.NE physics.ins-det

keywords quantized neural networkspost-training quantizationU-Netbinary neural networksgenetic algorithmsedge computingqubit calibrationparticle detection

0 comments

The pith

Post-training quantization reduces memory usage fourfold for U-Net models while maintaining or improving accuracy

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper explores quantizing neural networks to run them efficiently on small embedded devices for two scientific tasks: calibrating quantum bits and detecting particles. It tests post-training quantization, quantization-aware training, and binary neural networks to balance speed, memory, and accuracy. The central result is that post-training quantization shrinks memory needs by a factor of four in U-shaped networks with no loss, and sometimes a small gain, in segmentation performance. This matters because it opens the door to running capable AI models directly on hardware at the experiment site instead of relying on large external computers. The work also develops a genetic algorithm method to train binary networks that map directly to hardware for extremely fast inference.

Core claim

PTQ achieves a four-fold reduction in memory usage for U-shaped CNN (U-Net) architectures while maintaining or slightly enhancing segmentation accuracy (e.g. from 89% to 90% for a small U-Net with 447 parameters). For the training of non-differentiable custom BNNs, a novel hardware-constrained learning approach using Genetic Algorithms is proposed. This enables a LUT-based BNN architecture suitable for direct conversion to VHDL that achieves nanosecond-scale inference latencies (10-15 ns) without requiring specialized DSP or BRAM resources.

What carries the argument

Post-Training Quantization of U-Net architectures, which lowers the numerical precision of weights and activations to cut memory footprint and latency on embedded hardware

If this is right

Four-fold memory savings allow the same models to run on cheaper or smaller embedded chips for real-time qubit calibration.
Sustained or improved accuracy means the quantized models can replace full-precision versions without changing the scientific workflow.
Nanosecond inference on LUT-based binary networks supports high-speed particle detection without extra hardware blocks.
Genetic algorithm training gives a practical route to create hardware-specific binary networks when standard gradient methods fail.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same quantization recipe could be tried on other edge-based physics imaging tasks such as portable detector arrays.
Direct mapping to VHDL may shorten the time from model design to FPGA deployment in experimental setups.
Domain-specific fine-tuning on qubit or detector data might reveal whether the reported accuracy gains hold under real noise conditions.

Load-bearing premise

That segmentation accuracy measured on generic datasets will translate directly to usable performance in qubit calibration and particle detection without further domain-specific checks.

What would settle it

Applying the four-times quantized U-Net to actual qubit calibration or particle detector data and checking whether segmentation accuracy falls below the reported 89-90 percent range or breaks the downstream scientific task.

read the original abstract

This paper examines the use of Quantized Neural Networks (QNNs) for two resource-constrained scientific applications: automated calibration of semi-conductor quantum bits (qubits) and scientific particle detectors. We evaluate the trade-offs between Post-Training Quantization (PTQ), Quantization-Aware Training (QAT), and ultra-low-bit Binary Neural Networks (BNNs) with respect to latency and resource usage. Our results demonstrate that PTQ achieves a four-fold reduction in memory usage for U-shaped CNN (U-Net) architectures while maintaining or slightly enhancing segmentation accuracy (e.g. from 89% to 90% for a small U-Net with 447 parameters). For the training of non-differentiable custom BNNs , we propose a novel, hardware-constrained learning approach using Genetic Algorithms (GAs). We showcase a LUT-based BNN architecture suitable for direct conversion to VHDL via the HCL4BNN framework. This method achieves nanosecond-scale inference latencies (10-15 ns) without requiring specialized DSP or BRAM resources.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper examines quantization techniques (PTQ, QAT, and BNNs) for U-Net architectures in two embedded scientific applications: semiconductor qubit calibration and particle detection. It reports that PTQ yields a four-fold memory reduction while maintaining or slightly improving segmentation accuracy (e.g., 89% to 90% on a 447-parameter U-Net), proposes a genetic-algorithm training method for non-differentiable custom BNNs, and demonstrates a LUT-based BNN architecture achieving 10-15 ns inference latency with direct VHDL conversion via HCL4BNN.

Significance. If the empirical claims are substantiated on the target scientific datasets, the work would offer practical value for deploying low-resource segmentation networks in real-time quantum and high-energy physics instrumentation. The GA-based training for hardware-constrained BNNs and the LUT-to-VHDL pathway constitute concrete engineering contributions that could be adopted in FPGA-based edge systems.

major comments (2)

Abstract: the central claim that PTQ maintains or enhances segmentation accuracy for the qubit-calibration and particle-detection tasks rests on an unverified transfer from generic datasets; no domain-specific metrics (e.g., calibration fidelity or detection precision) or error analysis are supplied to support the 89%–90% figures or the four-fold memory reduction in the actual applications.
Results section (or equivalent): the reported accuracy and latency numbers lack error bars, dataset identities, and ablation details on the 447-parameter U-Net choice; without these, it is impossible to assess whether post-hoc model selection or unstated baselines affect the PTQ performance claims.

minor comments (2)

Provide explicit pseudocode or parameter settings for the genetic-algorithm training procedure to enable reproducibility.
Clarify the resource-utilization tables for the LUT-based BNN (DSP/BRAM counts) and compare them directly against the PTQ and QAT baselines.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments highlight important aspects of clarity and substantiation that we address below. We have revised the manuscript to incorporate additional details, metrics, and analyses as suggested.

read point-by-point responses

Referee: Abstract: the central claim that PTQ maintains or enhances segmentation accuracy for the qubit-calibration and particle-detection tasks rests on an unverified transfer from generic datasets; no domain-specific metrics (e.g., calibration fidelity or detection precision) or error analysis are supplied to support the 89%–90% figures or the four-fold memory reduction in the actual applications.

Authors: The PTQ results, including the reported accuracy improvement from 89% to 90% and the four-fold memory reduction, were obtained through direct experiments on the semiconductor qubit calibration and particle detection datasets described in the manuscript. Segmentation accuracy serves as the primary performance metric for these tasks because it directly correlates with calibration success and detection reliability. We acknowledge that explicit domain-specific metrics and error analysis were not highlighted in the abstract. In the revision we have updated the abstract for clarity and added a dedicated paragraph in the results section reporting calibration fidelity, detection precision, and standard error estimates derived from repeated trials on the target datasets. revision: yes
Referee: Results section (or equivalent): the reported accuracy and latency numbers lack error bars, dataset identities, and ablation details on the 447-parameter U-Net choice; without these, it is impossible to assess whether post-hoc model selection or unstated baselines affect the PTQ performance claims.

Authors: We agree that the absence of error bars, explicit dataset descriptions, and ablation studies limits the ability to fully evaluate the robustness of the PTQ claims. The revised results section now includes error bars computed from five independent runs with different random seeds, full specifications of the qubit-calibration and particle-detection datasets (including sample counts and preprocessing), and an ablation study that compares the 447-parameter U-Net against larger and smaller variants under identical PTQ settings. These additions demonstrate that the observed accuracy retention and memory savings are consistent across model scales and not artifacts of post-hoc selection. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical measurements of quantization trade-offs

full rationale

The paper reports direct experimental results on PTQ, QAT, and BNN performance for U-Net architectures in qubit calibration and particle detection contexts. Accuracy figures (e.g., 89% to 90%) and latency values (10-15 ns) are presented as measured outcomes rather than quantities derived from equations or parameters defined within the paper itself. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided abstract or described claims. The derivation chain consists of standard empirical evaluation and a proposed GA-based training method, remaining self-contained against external benchmarks without reducing to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard assumptions about quantization not destroying task-relevant features and on the suitability of segmentation accuracy as a proxy for the two scientific applications. No new physical constants or invented particles are introduced.

axioms (1)

domain assumption Segmentation accuracy on the evaluated datasets is a sufficient proxy for performance in qubit calibration and particle detection tasks.
The abstract reports accuracy numbers but does not demonstrate that the metric correlates with the actual scientific objectives of the two applications.

pith-pipeline@v0.9.0 · 5758 in / 1388 out tokens · 37438 ms · 2026-05-22T12:51:38.248783+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

[1]

Umuroglu, N.J

Y. Umuroglu, N.J. Fraser, G. Gambardella, M. Blott, P. Leong, M. Jahre et al.,Finn: A framework for fast, scalable binarized neural network inference, inProceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA ’17, p. 65–74, ACM, Feb., 2017, DOI

work page 2017
[2]

Fahim, B

F. Fahim, B. Hawks, C. Herwig, J. Hirschauer, S. Jindariani, N. Tran et al.,hls4ml: An open-source codesign workflow to empower scientific low-power machine learning devices, 2021

work page 2021
[3]

Differentiable weightless neural networks

A.T.L. Bacellar, Z. Susskind, M.B. Jr, E. John, L.K. John, P.M.V. Lima et al., “Differentiable weightless neural networks.” 10.48550/arXiv.2410.11112

work page doi:10.48550/arxiv.2410.11112
[4]

O. Weng, M. Andronic, D. Zuberi, J. Chen, C. Geniesse, G.A. Constantinides et al.,Greater than the sum of its LUTs: Scaling up LUT-based neural networks with AmigoLUT, inProceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 25–35, ACM, DOI

work page 2025
[5]

Umuroglu, Y

Y. Umuroglu, Y. Akhauri, N.J. Fraser and M. Blott,Logicnets: Co-designed neural networks and circuits for extreme-throughput applications, 2020

work page 2020
[6]

Wang, J.J

E. Wang, J.J. Davis, P.Y.K. Cheung and G.A. Constantinides,Lutnet: Rethinking inference in fpga soft logic, 2019

work page 2019
[7]

AMD Inc.,Vitis High-Level Synthesis User Guide (UG1399), September, 2025

work page 2025
[8]

Agarap,Deep learning using rectified linear units (ReLU), 2019

A.F. Agarap,Deep learning using rectified linear units (ReLU), 2019

work page 2019
[9]

AMD Inc.,Vivado Design Suite User Guide: Synthesis (UG901), June, 2025

work page 2025
[10]

AMD Inc.,Vivado Design Suite User Guide: Implementation (UG904), May, 2025

work page 2025
[11]

Z. Long, P. Yin and J. Xin,Learning quantized neural nets by coarse gradient method for nonlinear classification,Research in the Mathematical Sciences8(2021) 48

work page 2021
[12]

Hornby, A

G.S. Hornby, A. Globus, D.S. Linden and J.D. Lohn,Automated antenna design with evolutionary algorithms, inAIAA Space 2006, (San Jose, CA, USA), American Institute of Aeronautics and Astronautics, September, 2006, https://ntrs.nasa.gov/citations/20060024675

work page arXiv 2006
[13]

Meloni, A

M. Meloni, A. Stahl and L. Ludhova,Optimization of a neutrino beam for the study of CP violation with the LENA and JUNO detector, 2016

work page 2016
[14]

J.H. Holland,Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, The MIT Press 10.7551/mitpress/1090.001.0001

work page doi:10.7551/mitpress/1090.001.0001
[15]

Fortin, F.-M

F.-A. Fortin, F.-M. De Rainville, M.-A. Gardner, M. Parizeau and C. Gagné,DEAP: Evolutionary algorithms made easy,Journal of Machine Learning Research13(2012) 2171

work page 2012
[16]

SiPM-APD-MPPC

J.P. Rodríguez, “SiPM-APD-MPPC.”https://github.com/JesusPenha/SiPM-APD-MPPC

work page
[17]

Il buono, il brutto, il cattivo

S. Leone, “Il buono, il brutto, il cattivo.” Film, 1966

work page 1966
[18]

Konak, D.W

A. Konak, D.W. Coit and A.E. Smith,Multi-objective optimization using genetic algorithms: A tutorial,Reliability Engineering & System Safety91(2006) 992

work page 2006
[19]

AMD zynq™UltraScale+™MPSoC ZCU104 evaluation kithttps://www.amd.com/en/ products/adaptive-socs-and-fpgas/evaluation-boards/zcu104.html

“AMD zynq™UltraScale+™MPSoC ZCU104 evaluation kithttps://www.amd.com/en/ products/adaptive-socs-and-fpgas/evaluation-boards/zcu104.html.”

work page
[20]

HCL4BNN

I. Bekman, A. Aksoy and S. Fleitmann, “HCL4BNN.”https://github.com/fzj-ica/HCL4BNN. 10.5281/zenodo.17542690. – 7 –

work page doi:10.5281/zenodo.17542690
[21]

Kleene,Representation of events in nerve nets and finite automata, inAutomata Studies, C

S.C. Kleene,Representation of events in nerve nets and finite automata, inAutomata Studies, C. Shannon and J. McCarthy, eds., (Princeton, NJ), pp. 3–41, Princeton University Press (1956)

work page 1956
[22]

Floreano, P

D. Floreano, P. Dürr and C. Mattiussi,Neuroevolution: from architectures to learning,Evolutionary Intelligence1(2008) 47. – 8 –

work page 2008

[1] [1]

Umuroglu, N.J

Y. Umuroglu, N.J. Fraser, G. Gambardella, M. Blott, P. Leong, M. Jahre et al.,Finn: A framework for fast, scalable binarized neural network inference, inProceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA ’17, p. 65–74, ACM, Feb., 2017, DOI

work page 2017

[2] [2]

Fahim, B

F. Fahim, B. Hawks, C. Herwig, J. Hirschauer, S. Jindariani, N. Tran et al.,hls4ml: An open-source codesign workflow to empower scientific low-power machine learning devices, 2021

work page 2021

[3] [3]

Differentiable weightless neural networks

A.T.L. Bacellar, Z. Susskind, M.B. Jr, E. John, L.K. John, P.M.V. Lima et al., “Differentiable weightless neural networks.” 10.48550/arXiv.2410.11112

work page doi:10.48550/arxiv.2410.11112

[4] [4]

O. Weng, M. Andronic, D. Zuberi, J. Chen, C. Geniesse, G.A. Constantinides et al.,Greater than the sum of its LUTs: Scaling up LUT-based neural networks with AmigoLUT, inProceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 25–35, ACM, DOI

work page 2025

[5] [5]

Umuroglu, Y

Y. Umuroglu, Y. Akhauri, N.J. Fraser and M. Blott,Logicnets: Co-designed neural networks and circuits for extreme-throughput applications, 2020

work page 2020

[6] [6]

Wang, J.J

E. Wang, J.J. Davis, P.Y.K. Cheung and G.A. Constantinides,Lutnet: Rethinking inference in fpga soft logic, 2019

work page 2019

[7] [7]

AMD Inc.,Vitis High-Level Synthesis User Guide (UG1399), September, 2025

work page 2025

[8] [8]

Agarap,Deep learning using rectified linear units (ReLU), 2019

A.F. Agarap,Deep learning using rectified linear units (ReLU), 2019

work page 2019

[9] [9]

AMD Inc.,Vivado Design Suite User Guide: Synthesis (UG901), June, 2025

work page 2025

[10] [10]

AMD Inc.,Vivado Design Suite User Guide: Implementation (UG904), May, 2025

work page 2025

[11] [11]

Z. Long, P. Yin and J. Xin,Learning quantized neural nets by coarse gradient method for nonlinear classification,Research in the Mathematical Sciences8(2021) 48

work page 2021

[12] [12]

Hornby, A

G.S. Hornby, A. Globus, D.S. Linden and J.D. Lohn,Automated antenna design with evolutionary algorithms, inAIAA Space 2006, (San Jose, CA, USA), American Institute of Aeronautics and Astronautics, September, 2006, https://ntrs.nasa.gov/citations/20060024675

work page arXiv 2006

[13] [13]

Meloni, A

M. Meloni, A. Stahl and L. Ludhova,Optimization of a neutrino beam for the study of CP violation with the LENA and JUNO detector, 2016

work page 2016

[14] [14]

J.H. Holland,Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, The MIT Press 10.7551/mitpress/1090.001.0001

work page doi:10.7551/mitpress/1090.001.0001

[15] [15]

Fortin, F.-M

F.-A. Fortin, F.-M. De Rainville, M.-A. Gardner, M. Parizeau and C. Gagné,DEAP: Evolutionary algorithms made easy,Journal of Machine Learning Research13(2012) 2171

work page 2012

[16] [16]

SiPM-APD-MPPC

J.P. Rodríguez, “SiPM-APD-MPPC.”https://github.com/JesusPenha/SiPM-APD-MPPC

work page

[17] [17]

Il buono, il brutto, il cattivo

S. Leone, “Il buono, il brutto, il cattivo.” Film, 1966

work page 1966

[18] [18]

Konak, D.W

A. Konak, D.W. Coit and A.E. Smith,Multi-objective optimization using genetic algorithms: A tutorial,Reliability Engineering & System Safety91(2006) 992

work page 2006

[19] [19]

AMD zynq™UltraScale+™MPSoC ZCU104 evaluation kithttps://www.amd.com/en/ products/adaptive-socs-and-fpgas/evaluation-boards/zcu104.html

“AMD zynq™UltraScale+™MPSoC ZCU104 evaluation kithttps://www.amd.com/en/ products/adaptive-socs-and-fpgas/evaluation-boards/zcu104.html.”

work page

[20] [20]

HCL4BNN

I. Bekman, A. Aksoy and S. Fleitmann, “HCL4BNN.”https://github.com/fzj-ica/HCL4BNN. 10.5281/zenodo.17542690. – 7 –

work page doi:10.5281/zenodo.17542690

[21] [21]

Kleene,Representation of events in nerve nets and finite automata, inAutomata Studies, C

S.C. Kleene,Representation of events in nerve nets and finite automata, inAutomata Studies, C. Shannon and J. McCarthy, eds., (Princeton, NJ), pp. 3–41, Princeton University Press (1956)

work page 1956

[22] [22]

Floreano, P

D. Floreano, P. Dürr and C. Mattiussi,Neuroevolution: from architectures to learning,Evolutionary Intelligence1(2008) 47. – 8 –

work page 2008