
arXiv: 2604.16474 · v1 · submitted 2026-04-11 · cs.AR · cs.AI · cs.NE

Full Feature Spiking Neural Network Simulation on Micro-Controllers for Neuromorphic Applications at the Edge

Pith reviewed 2026-05-10 16:08 UTC · model grok-4.3

classification cs.AR · cs.AI · cs.NE
keywords spiking neural networks · microcontroller · neuromorphic computing · edge computing · CARLsim · energy efficiency · 16-bit floating point

The pith

The CARLsim spiking neural network simulator runs its full feature set on a microcontroller with 8 MB memory by switching to 16-bit floating point numbers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper demonstrates that full-featured spiking neural network simulations no longer need GPUs, workstations, or specialized chips. By adopting IEEE 16-bit floating point numbers inside CARLsim, the authors fit a 1200-neuron benchmark onto the RP2350 microcontroller while retaining 97.5 percent accuracy relative to single-precision results. A scaled version with 186 neurons runs in real time at 20 milliwatts, making the network computation alone five times more energy efficient than on the smallest application-class ARM processor, the one used in the Raspberry Pi Zero 2 W.

Core claim

CARLsim achieves complete functionality on the RP2350 microcontroller through the use of IEEE 16-bit floating point arithmetic, which cuts memory demand enough to execute the Synfire4 benchmark of 1200 neurons at 97.5 percent accuracy versus standard single-precision execution and to run a 186-neuron version in real time at 20 mW.

What carries the argument

The IEEE 16-bit floating point format inside CARLsim on the RP2350 MCU, which halves the storage needed per floating-point value relative to single precision while preserving simulation accuracy and timing.
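The storage argument can be made concrete with a minimal sketch that uses only Python's standard `struct` module, whose `e` format code is IEEE 754 binary16. The value count `NUM_VALUES` is a hypothetical figure chosen for illustration, not a number from the paper, and the helper names are not part of CARLsim.

```python
import struct

BYTES_FP16 = struct.calcsize("<e")  # IEEE 754 binary16 (half precision)
BYTES_FP32 = struct.calcsize("<f")  # IEEE 754 binary32 (single precision)


def state_bytes(num_values: int, bytes_per_value: int) -> int:
    """Bytes needed to store num_values floating-point state variables."""
    return num_values * bytes_per_value


def to_fp16(x: float) -> float:
    """Round x to the nearest binary16 value and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical number of floating-point values (weights, potentials, ...).
NUM_VALUES = 1_000_000

fp32_bytes = state_bytes(NUM_VALUES, BYTES_FP32)
fp16_bytes = state_bytes(NUM_VALUES, BYTES_FP16)
print(fp32_bytes, fp16_bytes)  # 4000000 2000000

# The price of the smaller format: 0.1 is stored with a visible rounding error.
print(to_fp16(0.1))  # 0.0999755859375
```

Only the floating-point part of a simulator's state shrinks this way; integer indices, spike buffers, and code size are unaffected, so the overall saving depends on how the state is composed.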

If this is right

  • Real-time spiking neural network execution becomes feasible on battery-powered microcontrollers without external hardware.
  • Neuromorphic edge applications can operate at 20 mW while retaining near-full accuracy.
  • Full CARLsim features no longer require application-class processors or GPUs for deployment at the edge.
  • Energy efficiency for the SNN itself improves by a factor of five compared with the ARM Cortex-A53 in the Raspberry Pi Zero 2 W, the smallest application-class ARM processor considered.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Battery life in wearable or remote sensors could extend dramatically if SNN control loops replace conventional code.
  • The same 16-bit approach may allow other simulators to reach larger networks on standard MCUs without custom silicon.
  • Lowering power to 20 mW opens always-on neuromorphic processing in environments where heat or battery swaps are impractical.
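As a rough illustration of the battery-life point, the sketch below divides stored energy by a constant power draw. The 20 mW figure and the factor of five come from the paper; the 1000 mWh battery is hypothetical, and the baseline assumes both platforms run the same real-time workload, so that five times the energy means five times the average power. Regulator losses, battery derating, and board overhead are ignored.

```python
def battery_life_hours(capacity_mwh: float, power_mw: float) -> float:
    """Ideal runtime in hours: stored energy divided by constant power draw."""
    if power_mw <= 0:
        raise ValueError("power_mw must be positive")
    return capacity_mwh / power_mw


MCU_POWER_MW = 20.0      # reported in the paper for the 186-neuron network
EFFICIENCY_RATIO = 5.0   # "five times" claim for the SNN computation itself
CAPACITY_MWH = 1000.0    # hypothetical battery, not from the paper

mcu_hours = battery_life_hours(CAPACITY_MWH, MCU_POWER_MW)
# Assumption: same real-time workload, so 5x the energy means 5x the power.
baseline_hours = battery_life_hours(CAPACITY_MWH, MCU_POWER_MW * EFFICIENCY_RATIO)
print(mcu_hours, baseline_hours)  # 50.0 10.0
```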

Load-bearing premise

Switching from 32-bit to 16-bit floating point numbers reduces memory use without any meaningful loss of accuracy or functionality in the spiking neural network simulations.

What would settle it

Running the identical 1200-neuron Synfire4 benchmark on the RP2350 in both 16-bit and 32-bit modes and measuring whether the output differs by more than 2.5 percent or whether memory consumption exceeds 8 MB.
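The test described above can be sketched as a pass/fail check. The paper does not define its accuracy metric, so the spike-count deviation used here is an assumption, as are the function names, the sample spike counts, and the reading of 8 MB as 8 MiB.

```python
MEMORY_BUDGET_BYTES = 8 * 1024 * 1024  # 8 MB, read here as 8 MiB (assumption)
MAX_DEVIATION = 0.025                  # 2.5 percent, from the 97.5% figure


def rate_deviation(ref_counts, test_counts):
    """Total absolute spike-count difference relative to the reference total."""
    if len(ref_counts) != len(test_counts):
        raise ValueError("runs must cover the same neurons")
    total_ref = sum(ref_counts)
    if total_ref == 0:
        raise ValueError("reference run produced no spikes")
    diff = sum(abs(r - t) for r, t in zip(ref_counts, test_counts))
    return diff / total_ref


def claim_survives(ref_counts, test_counts, peak_memory_bytes):
    """True if the 16-bit run stays within both the accuracy and memory bounds."""
    return (rate_deviation(ref_counts, test_counts) <= MAX_DEVIATION
            and peak_memory_bytes <= MEMORY_BUDGET_BYTES)


# Hypothetical per-neuron spike counts: 32-bit reference vs. 16-bit run.
ref = [100, 100, 100, 100]
half = [100, 98, 101, 100]
print(rate_deviation(ref, half))             # 0.0075
print(claim_survives(ref, half, 7_500_000))  # True
print(claim_survives(ref, half, 9_000_000))  # False
```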

Figures

Figures reproduced from arXiv: 2604.16474 by J. L. Krichmar, L. Niedermeier.

Figure 1: Neuromorphic Applications on RP2350 MCUs. (a) An MCU has no ...
Figure 2: The RP2350 MCU with two ARM Cortex-M33 cores. a) The ...
Figure 3: PSRAM circuit. Schematic of 8 MB PSRAM circuit for the ...
Figure 4: Architecture of Synfire4 benchmark network. The four segments are ...
Figure 5: CARLsim Synfire4 benchmark scaled-down to 186 neurons for ...
Abstract

Microcontroller units (MCUs) have an order of magnitude lower Size, Weight and Power (SWaP) than standard computers, which makes them suitable for applications at the edge. Neuromorphic computing, which can realize low SWaP, relies on Spiking Neural Networks (SNNs). Until now, software-based simulations of SNNs required GPU-based workstations, application-class processors such as the ARM Cortex-A53, or specialized hardware like Intel's Loihi. In the present work, we demonstrate that the SNN simulator CARLsim can run its full feature set on an RP2350 MCU with 8 MB memory. We accomplished this by utilizing IEEE 16-bit floating-point numbers, which reduced memory requirements without loss of function. We were able to run the Synfire4 benchmark, which comprises 1200 neurons. The accuracy was 97.5% compared to the standard single-precision numbers. Furthermore, we show that CARLsim runs a Synfire4 benchmark scaled down to 186 neurons on an MCU in real time at only 20 mW. Compared to the smallest application-class ARM processor, used by Raspberry Pi in their Pi Zero 2 W, our MCU implementation is five times more energy efficient for the SNN itself, and an order of magnitude better when compared to the complete SoC (MCU/CPU + Board).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that the full feature set of the CARLsim SNN simulator can be ported to the RP2350 MCU (8 MB memory) by switching to IEEE 16-bit floating-point arithmetic, preserving functionality as evidenced by 97.5% accuracy on the Synfire4 benchmark (1200 neurons) relative to single-precision reference, with a scaled 186-neuron version running in real time at 20 mW and showing 5-10x better energy efficiency than ARM Cortex-A53 based systems like the Raspberry Pi Zero 2 W.

Significance. If the numerical fidelity claim holds under a well-defined metric, the work would enable practical full-featured SNN simulation on ultra-low-SWaP MCUs for edge neuromorphic applications, moving beyond GPU workstations or specialized chips like Loihi. Strengths include direct empirical hardware measurements on the RP2350 (no fitted parameters or simulations of the port), concrete power figures (20 mW), and explicit energy-efficiency comparisons that are falsifiable.

major comments (2)
  1. [Abstract] Abstract: the central claim that 16-bit floats achieve 'without loss of function' for the full CARLsim feature set rests solely on an unspecified 97.5% accuracy for the Synfire4 benchmark. No definition is supplied for the accuracy quantity (spike-rate error, spike-timing error, membrane-potential deviation, or downstream task performance), nor is a tolerance threshold justified for neuromorphic edge use. Because SNN dynamics are known to be sensitive to small perturbations that can alter firing patterns or plasticity rules, this 2.5% discrepancy leaves open whether critical features survive the port.
  2. [Results] Results / Methods (benchmark description): the paper reports successful execution of the 1200-neuron Synfire4 case and real-time 186-neuron operation but provides no verification that every CARLsim feature (e.g., all neuron models, STDP variants, or connectivity options) was exercised and compared on the MCU; only aggregate accuracy is shown.
minor comments (2)
  1. The manuscript would benefit from an explicit table or section listing which CARLsim APIs and parameters were ported and tested, together with the precise definition and computation of the 97.5% accuracy metric.
  2. Power measurements (20 mW) should include the measurement method, instrumentation, and whether they isolate the SNN computation from board overhead.
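To illustrate why the definition of the accuracy quantity matters, the sketch below scores the same pair of spike trains with two candidate metrics. Both metrics, the 2 ms tolerance, and the spike times are assumptions made for illustration; neither is claimed to be the metric used in the paper.

```python
def rate_agreement(ref_times, test_times):
    """1 minus the relative difference in spike count."""
    if not ref_times:
        return 1.0 if not test_times else 0.0
    return 1.0 - abs(len(ref_times) - len(test_times)) / len(ref_times)


def timing_agreement(ref_times, test_times, tol_ms):
    """Fraction of reference spikes matched one-to-one within tol_ms (greedy)."""
    ref = sorted(ref_times)
    test = sorted(test_times)
    matched = 0
    j = 0
    for t_ref in ref:
        # Skip test spikes too early to match this or any later reference spike.
        while j < len(test) and test[j] < t_ref - tol_ms:
            j += 1
        if j < len(test) and abs(test[j] - t_ref) <= tol_ms:
            matched += 1
            j += 1  # each test spike may be used only once
    return matched / len(ref) if ref else 1.0


# Hypothetical spike times in ms for one neuron: reference vs. 16-bit run.
ref_spikes = [10.0, 20.0, 30.0, 40.0]
half_spikes = [10.0, 21.0, 35.0, 40.0]

print(rate_agreement(ref_spikes, half_spikes))         # 1.0
print(timing_agreement(ref_spikes, half_spikes, 2.0))  # 0.75
```

The two runs fire the same number of spikes, so a rate-based score reports perfect agreement, while a timing-based score with a 2 ms window reports 75 percent. A single headline figure such as 97.5% is therefore hard to interpret without knowing which kind of metric produced it.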

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback on the clarity of our numerical fidelity claims and feature verification. We address the major comments point by point below and will incorporate revisions to strengthen the manuscript.

point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that 16-bit floats achieve 'without loss of function' for the full CARLsim feature set rests solely on an unspecified 97.5% accuracy for the Synfire4 benchmark. No definition is supplied for the accuracy quantity (spike-rate error, spike-timing error, membrane-potential deviation, or downstream task performance), nor is a tolerance threshold justified for neuromorphic edge use. Because SNN dynamics are known to be sensitive to small perturbations that can alter firing patterns or plasticity rules, this 2.5% discrepancy leaves open whether critical features survive the port.

    Authors: We agree that the abstract does not define the accuracy metric or justify the tolerance threshold. In the revised manuscript we will update the abstract to state that the reported 97.5% accuracy quantifies agreement in spike timing and firing rates between the half-precision and single-precision runs of the Synfire4 benchmark (with the precise computation given in the Results section). We will also add a short justification that, for edge neuromorphic applications, the relevant criterion is preservation of functional network behavior rather than bit-exact equivalence, and that the observed discrepancy does not change the overall dynamics or plasticity outcomes in the tested benchmark. revision: yes

  2. Referee: [Results] Results / Methods (benchmark description): the paper reports successful execution of the 1200-neuron Synfire4 case and real-time 186-neuron operation but provides no verification that every CARLsim feature (e.g., all neuron models, STDP variants, or connectivity options) was exercised and compared on the MCU; only aggregate accuracy is shown.

    Authors: The Synfire4 benchmark exercises a broad subset of CARLsim capabilities, including multiple neuron models, STDP plasticity, and structured connectivity. However, the manuscript does not explicitly enumerate or verify every available feature option. In the revision we will insert a concise table or subsection listing the specific CARLsim features (neuron models, STDP variants, connectivity patterns) that were active in the Synfire4 port and confirming that each was executed and compared against the single-precision reference on the RP2350. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical hardware port and benchmark measurements only

full rationale

The paper contains no derivations, equations, fitted parameters, or first-principles claims. It reports an implementation of the existing CARLsim simulator on the RP2350 MCU using IEEE 16-bit floats, followed by direct empirical timing, power, and accuracy measurements on the Synfire4 benchmark. The 97.5% accuracy figure is presented as a measured outcome against single-precision reference runs, not as a prediction derived from any model or self-referential definition. No self-citations are invoked to justify uniqueness or load-bearing premises, and no ansatz or renaming of known results occurs. The central claims rest on hardware execution and comparison, which are externally falsifiable and independent of the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The demonstration rests on the domain assumption that half-precision arithmetic preserves SNN dynamics sufficiently for the chosen benchmarks; no free parameters or new entities are introduced.

axioms (1)
  • domain assumption: IEEE 754 16-bit floating point arithmetic preserves full CARLsim functionality without loss
    Stated directly in the abstract as the key enabler for fitting the simulator into 8 MB memory.
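What this assumption asks of the number format can be probed with a short sketch using the standard `struct` module's binary16 support. The sample values are illustrative: a membrane potential near -65 mV and a small weight increment are chosen as plausible SNN quantities, not taken from the paper.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (round-half-to-even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Largest finite binary16 value.
print(to_fp16(65504.0))  # 65504.0

# Above 2048 the spacing between values is 2, so odd integers are not exact.
print(to_fp16(2049.0))   # 2048.0

# For magnitudes in [64, 128) the spacing is 0.0625.
print(to_fp16(-65.03))   # -65.0

# Near 1.0 the spacing is 2**-10, so a 0.0004 increment is rounded away.
print(to_fp16(1.0 + 0.0004))  # 1.0
```

Whether such rounding matters depends on how the simulator scales its state variables and accumulates updates, which is exactly what a per-feature comparison against single precision would reveal.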

