Scalable Construction of Spiking Neural Networks using up to thousands of GPUs
Pith reviewed 2026-05-21 17:55 UTC · model grok-4.3
The pith
Each MPI process builds only its local part of a spiking neural network to support efficient spike exchange on large GPU clusters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A novel method for building spiking neural networks on multi-GPU systems allows each process to construct its local connectivity and prepare data structures for efficient spike exchange during simulation, achieving good scaling performance on two cortical models with point-to-point and collective communication.
What carries the argument
The per-process local connectivity construction and preparation of spike exchange data structures using MPI.
If this is right
- Large cortical models with billions of neurons can be simulated without a central construction bottleneck.
- Both point-to-point and collective MPI communication patterns support efficient scaling.
- Memory and communication overheads are managed locally per process for sparse networks.
- The technique is suitable for exascale systems with thousands of GPUs.
Where Pith is reading between the lines
- Local construction may allow simulations to start faster by avoiding a global network assembly step.
- This approach could extend to other types of large-scale network simulations in physics or biology.
- Load balancing during construction might need additional techniques for models with uneven connectivity.
Load-bearing premise
That constructing connectivity locally per process will result in communication patterns that stay efficient at large scales without causing bottlenecks or uneven workloads in realistic brain models.
What would settle it
If tests on larger GPU counts or more detailed cortical models show that spike exchange time dominates and scaling efficiency falls below linear, the scalability claim would be falsified.
read the original abstract
Diverse scientific and engineering research areas deal with discrete, time-stamped changes in large systems of interacting delay differential equations. Simulating such complex systems at scale on high-performance computing clusters demands efficient management of communication and memory. Inspired by the human cerebral cortex -- a sparsely connected network of $\mathcal{O}(10^{10})$ neurons, each forming $\mathcal{O}(10^{3})$--$\mathcal{O}(10^{4})$ synapses and communicating via short electrical pulses called spikes -- we study the simulation of large-scale spiking neural networks for computational neuroscience research. This work presents a novel network construction method for multi-GPU clusters and upcoming exascale supercomputers using the Message Passing Interface (MPI), where each process builds its local connectivity and prepares the data structures for efficient spike exchange across the cluster during state propagation. We demonstrate scaling performance of two cortical models using point-to-point and collective communication, respectively.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a novel MPI-based network construction method for large-scale spiking neural networks on multi-GPU clusters and exascale systems. Each process independently builds its local connectivity and prepares data structures for spike exchange during simulation. Scaling performance is demonstrated for two cortical models, one using point-to-point communication and the other collective communication.
Significance. If the scaling claims hold with detailed verification, the work could enable efficient simulation of cortical-scale networks (O(10^10) neurons) on thousands of GPUs, addressing key challenges in communication and memory for discrete-event systems in computational neuroscience.
major comments (2)
- [Abstract] Abstract: The claim that scaling performance was demonstrated for two cortical models provides no quantitative metrics (e.g., speedup, communication volume, or wall-clock times), error analysis, or description of the performance measurement methodology. This is load-bearing for the central scalability claim up to thousands of GPUs.
- [Methods] Network construction description: The method states that each process builds local connectivity independently, but does not specify the partitioning of global adjacency information or whether any collective MPI operations occur during the build phase itself. This detail is required to confirm absence of hidden global coordination costs or load imbalance for structured, distance-dependent cortical topologies.
minor comments (2)
- [Results] Figure captions and axis labels in the scaling results could more explicitly state the model sizes, number of processes, and exact communication primitives used to improve reproducibility.
- The abstract mentions O(10^10) neurons and O(10^3)--O(10^4) synapses per neuron; a brief comparison table to the two specific cortical models tested would clarify how the demonstration relates to these scales.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We address each major comment in detail below and have made revisions to improve clarity and completeness.
read point-by-point responses
-
Referee: [Abstract] Abstract: The claim that scaling performance was demonstrated for two cortical models provides no quantitative metrics (e.g., speedup, communication volume, or wall-clock times), error analysis, or description of the performance measurement methodology. This is load-bearing for the central scalability claim up to thousands of GPUs.
Authors: We agree that the abstract would be strengthened by including quantitative indicators of the demonstrated scaling. In the revised version we will add specific metrics (e.g., wall-clock times and speedup factors on up to thousands of GPUs for both models) together with a concise reference to the performance-measurement approach used in the experiments. revision: yes
-
Referee: [Methods] Network construction description: The method states that each process builds local connectivity independently, but does not specify the partitioning of global adjacency information or whether any collective MPI operations occur during the build phase itself. This detail is required to confirm absence of hidden global coordination costs or load imbalance for structured, distance-dependent cortical topologies.
Authors: We thank the referee for noting this omission. The global network is partitioned by a spatial decomposition that assigns neurons to MPI processes according to their cortical coordinates; each process then generates its local outgoing connections from the distance-dependent probability rules without ever materializing the full global adjacency matrix. No collective MPI operations are invoked during construction—all inter-process communication is confined to the subsequent simulation phase. We will insert an explicit paragraph describing this partitioning strategy and confirming the absence of collectives in the build phase. revision: yes
Circularity Check
No circularity: methods paper with independent algorithmic description and scaling results
full rationale
The paper presents a novel MPI-based method for constructing spiking neural networks on multi-GPU clusters, with each process building local connectivity and preparing spike-exchange data structures, followed by empirical scaling demonstrations on two cortical models. No mathematical derivations, equations, fitted parameters, or self-referential claims are indicated in the provided abstract or description. The work is self-contained as a methods and performance report; claims rest on the described construction algorithm and observed scaling behavior rather than reducing to inputs by construction or via load-bearing self-citation chains.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The chosen cortical models are representative and their connectivity patterns can be partitioned without loss of essential dynamics.
Reference graph
Works this paper leans on
-
[1]
arXiv, 2505–21185 (2025) https://doi.org/10.48550/arXiv.2505.21185
Senk, J., Kurth, A.C., Furber, S., Gemmeke, T., Golosio, B., Heittmann, A., Knight, J.C., M¨ uller, E., Noll, T., Nowotny, T., Coppola, G.P., Peres, L., Rhodes, O., Rowley, A., Schemmel, J., Stadtmann, T., Tetzlaff, T., Tiddia, G., Albada, S.J., Villamar, J., Diesmann, M.: Constructive community race: full-density spik- ing neural network model drives neu...
-
[2]
Frontiers in Neuroinformatics 17(2023) https://doi.org/10.3389/fninf.2023.1157418
Aimone, J.B., Awile, O., Diesmann, M., Knight, J.C., Nowotny, T., Sch¨ urmann, F.: Editorial: Neuroscience, computing, performance, and benchmarks: Why it matters to neuroscience how fast we can compute. Frontiers in Neuroinformatics 17(2023) https://doi.org/10.3389/fninf.2023.1157418
-
[3]
Journal of large-scale research facilities JLSRF9(1) (2024) https://doi.org/10.17815/jlsrf-8-186
Turisini, M., Cestari, M., Amati, G.: Leonardo: A pan-european pre-exascale supercomputer for hpc and ai applications. Journal of large-scale research facilities JLSRF9(1) (2024) https://doi.org/10.17815/jlsrf-8-186
-
[4]
Technical report, Knoxville, Tennessee (2009)
Message Passing Interface Forum: MPI: A message-passing interface stan- dard, version 2.2. Technical report, Knoxville, Tennessee (2009). http://www. mpi-forum.org/docs
work page 2009
-
[5]
Nature Computational Science4(12), 890–898 (2024) https://doi.org/10.1038/s43588-024-00731-3
Lu, W., Du, X., Wang, J., Zeng, L., Ye, L., Xiang, S., Zheng, Q., Zhang, J., Xu, N., Feng, J., Bao, Y., Chen, B., Chen, S., Chen, Z., Dai, F., Ding, W., Du, X., Feng, J., Hou, Y., Ji, M., Ji, P., Li, C., Li, C., Li, X., Liu, Y., Lu, W., Lv, Z., Ma, H., Qi, Y., Rolls, E., Wang, H., Wang, H., Wang, S., Wang, Z., Xia, Y., Xie, C., Xue, X., Zeng, T., Zhang, C...
-
[6]
Du, X., Wang, M., Lu, Z., Duan, Q., Liu, Y., Feng, J., Wang, H.: Hrcm: A hierarchical regularizing mechanism for sparse and imbalanced communication in whole human brain simulations. IEEE Transactions on Parallel and Distributed Systems35(6), 1056–1073 (2024) https://doi.org/10.1109/TPDS.2024.3387720
-
[7]
Scientific Reports6(1) (2016) https://doi.org/10
Yavuz, E., Turner, J., Nowotny, T.: GeNN: a code generation framework for accelerated brain simulations. Scientific Reports6(1) (2016) https://doi.org/10. 25 1038/srep18854
work page 2016
-
[8]
Nature Computational Science1(2), 136–142 (2021) https: //doi.org/10.1038/s43588-020-00022-7
Knight, J.C., Nowotny, T.: Larger GPU-accelerated brain simulations with pro- cedural connectivity. Nature Computational Science1(2), 136–142 (2021) https: //doi.org/10.1038/s43588-020-00022-7
-
[9]
Beyeler, M., Carlson, K.D., Chou, T.S., Dutt, N.D., Krichmar, J.L.: Carlsim 3: A user-friendly and highly optimized library for the creation of neurobiologically detailed spiking neural networks. Proceedings of the International Joint Confer- ence on Neural Networks (IJCNN), 1–8 (2015) https://doi.org/10.1109/IJCNN. 2015.7280694
-
[10]
Niedermeier, L., Chen, K., Xing, J., Das, A., Kopsick, J., Scott, E., Sutton, N., Weber, K., Dutt, N., Krichmar, J.L.: Carlsim 6: An open source library for large-scale, biologically detailed spiking neural network simulation. In: 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1–10 (2022). https://doi.org/10.1109/IJCNN55064.2022.9892644
-
[11]
Cambridge Uni- versity Press, ??? (2006)
Carnevale, N.T., Hines, M.L.: The NEURON Book. Cambridge Uni- versity Press, ??? (2006). https://doi.org/10.1017/cbo9780511541612 . https://doi.org/10.1017/CBO9780511541612
-
[12]
Frontiers in Neuroinformatics13, 63 (2019) https://doi.org/10.3389/ fninf.2019.00063
Kumbhar, P., Hines, M., Fouriaux, J., Ovcharenko, A., King, J., Delalondre, F., Sch¨ urmann, F.: Coreneuron: An optimized compute engine for the neuron simulator. Frontiers in Neuroinformatics13, 63 (2019) https://doi.org/10.3389/ fninf.2019.00063
-
[13]
Frontiers in Neuroinformatics16(2022) https://doi.org/10.3389/fninf.2022.884046
Awile, O., Kumbhar, P., Cornu, N., Dura-Bernal, S., King, J.G., Lupton, O., Magkanaris, I., McDougal, R.A., Newton, A.J.H., Pereira, F., S˘ avulescu, A., Carnevale, N.T., Lytton, W.W., Hines, M.L., Sch¨ urmann, F.: Modernizing the NEURON simulator for sustainability, portability, and performance. Frontiers in Neuroinformatics16(2022) https://doi.org/10.33...
-
[14]
Frontiers in Neuroinformatics15(2022) https://doi.org/10.3389/ fninf.2021.785068
Pronold, J., Jordan, J., Wylie, B.J.N., Kitayama, I., Diesmann, M., Kunkel, S.: Routing brain traffic through the von neumann bottleneck: Parallel sorting and refactoring. Frontiers in Neuroinformatics15(2022) https://doi.org/10.3389/ fninf.2021.785068
-
[15]
Frontiers in Neuroinformatics16(2022) https://doi.org/10.3389/fninf.2022.883333
Tiddia, G., Golosio, B., Albers, J., Senk, J., Simula, F., Pronold, J., Fanti, V., Pastorelli, E., Paolucci, P.S., Albada, S.J.: Fast simulation of a multi-area spiking network model of macaque cortex on an mpi-gpu cluster. Frontiers in Neuroinformatics16(2022) https://doi.org/10.3389/fninf.2022.883333
-
[16]
Scholarpedia 2(4), 1430 (2007) 26
Gewaltig, M.-O., Diesmann, M.: NEST (NEural Simulation Tool). Scholarpedia 2(4), 1430 (2007) 26
work page 2007
-
[17]
http://www.openmp.org/mp-documents/spec30.pdf
OpenMP Architecture Review Board: OpenMP Application Program Inter- face. http://www.openmp.org/mp-documents/spec30.pdf. Accessed: 2016-09-27 (2008)
work page 2016
-
[18]
Frontiers in Neuroinformatics8, 78 (2014) https://doi.org/10.3389/ fninf.2014.00078
Kunkel, S., Eppler, J.M., Plesser, H.E., Pyka, A., Courcol, J.-D., Potjans, T.C., Diesmann, M., Morrison, A.,et al.: Spiking network simulation code for petascale computers. Frontiers in Neuroinformatics8, 78 (2014) https://doi.org/10.3389/ fninf.2014.00078
-
[19]
Frontiers in Neuroinformatics12(2018) https://doi.org/10.3389/fninf.2018.00002
Jordan, J., Ippen, T., Helias, M., Kitayama, I., Sato, M., Igarashi, J., Diesmann, M., Kunkel, S.: Extremely scalable spiking neuronal network simulation code: From laptops to exascale computers. Frontiers in Neuroinformatics12(2018) https://doi.org/10.3389/fninf.2018.00002
-
[20]
PLOS Computational Biology14(10), 1006359 (2018) https://doi.org/10.1371/journal.pcbi.1006359
Schmidt, M., Bakker, R., Shen, K., Bezgin, G., Diesmann, M., Albada, S.J.: A multi-scale layer-resolved spiking network model of resting-state dynamics in macaque visual cortical areas. PLOS Computational Biology14(10), 1006359 (2018) https://doi.org/10.1371/journal.pcbi.1006359
-
[21]
Cerebral Cortex34(10), 409 (2024) https://doi.org/10.1093/cercor/ bhae409
Pronold, J., Meegen, A., Shimoura, R.O., Vollenbr¨ oker, H., Senden, M., Hilgetag, C.C., Bakker, R., Albada, S.J.: Multi-scale spiking network model of human cere- bral cortex. Cerebral Cortex34(10), 409 (2024) https://doi.org/10.1093/cercor/ bhae409
-
[22]
Cerebral Cortex34(10), 405 (2024) https://doi.org/10.1093/cercor/bhae405
Senk, J., Hagen, E., Albada, S.J., Diesmann, M.: Reconciliation of weak pairwise spike–train correlations and highly coherent local field potentials across space. Cerebral Cortex34(10), 405 (2024) https://doi.org/10.1093/cercor/bhae405
-
[23]
Nature Computational Science3(3), 264–276 (2023) https://doi.org/10.1038/ s43588-023-00417-2
Gandolfi, D., Mapelli, J., Solinas, S.M.G., Triebkorn, P., D’Angelo, E., Jirsa, V., Migliore, M.: Full-scale scaffold model of the human hippocampus ca1 area. Nature Computational Science3(3), 264–276 (2023) https://doi.org/10.1038/ s43588-023-00417-2
work page 2023
-
[24]
Nature637(8047), 801–812 (2025) https://doi.org/10.1038/s41586-024-08253-8
Kudithipudi, D., Schuman, C., Vineyard, C.M., Pandit, T., Merkel, C., Kuben- dran, R., Aimone, J.B., Orchard, G., Mayr, C., Benosman, R., Hays, J., Young, C., Bartolozzi, C., Majumdar, A., Cardwell, S.G., Payvand, M., Buckley, S., Kulkarni, S., Gonzalez, H.A., Cauwenberghs, G., Thakur, C.S., Subramoney, A., Furber, S.: Neuromorphic computing at scale. Nat...
-
[25]
Frontiers in Neuroinformatics16(2022) https://doi.org/10.3389/fninf
Albers, J., Pronold, J., Kurth, A.C., Vennemo, S.B., Mood, K.H., Patronis, A., Terhorst, D., Jordan, J., Kunkel, S., Tetzlaff, T., Diesmann, M., Senk, J.: A modular workflow for performance benchmarking of neuronal network simu- lations. Frontiers in Neuroinformatics16(2022) https://doi.org/10.3389/fninf. 2022.837549
-
[26]
Frontiers in Computational Neuroscience15(2021) https://doi.org/10.3389/fncom.2021
Golosio, B., Tiddia, G., De Luca, C., Pastorelli, E., Simula, F., Paolucci, P.S.: 27 Fast simulations of highly-connected spiking cortical models using gpus. Frontiers in Computational Neuroscience15(2021) https://doi.org/10.3389/fncom.2021. 627620
-
[27]
Applied Sciences13(17), 9598 (2023) https://doi.org/10.3390/app13179598
Golosio, B., Villamar, J., Tiddia, G., Pastorelli, E., Stapmanns, J., Fanti, V., Paolucci, P.S., Morrison, A., Senk, J.: Runtime construction of large-scale spiking neuronal network models on GPU devices. Applied Sciences13(17), 9598 (2023) https://doi.org/10.3390/app13179598
-
[28]
Herten, A., Achilles, S., Alvarez, D., Badwaik, J., Behle, E., Bode, M., Breuer, T., Caviedes-Voulli` eme, D., Cherti, M., Dabah, A., Sayed, S.E., Frings, W., Gonzalez-Nicolas, A., Gregory, E.B., Mood, K.H., Hater, T., Jitsev, J., John, C.M., Meinke, J.H., Meyer, C.I., Mezentsev, P., Mirus, J.-O., Nassyr, S., Penke, C., R¨ ommer, M., Sinha, U., Vieth, B.v...
work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/sc41406.2024.00038 2024
-
[29]
Brain Structure and Function223(3), 1409–1435 (2017) https://doi.org/10.1007/s00429-017-1554-4
Schmidt, M., Bakker, R., Hilgetag, C.C., Diesmann, M., Albada, S.J.: Multi-scale account of the network structure of macaque visual cortex. Brain Structure and Function223(3), 1409–1435 (2017) https://doi.org/10.1007/s00429-017-1554-4
-
[30]
Journal of large-scale research facilities JLSRF 7(2021) https://doi.org/10.17815/jlsrf-7-179
Vieth, B.V.S.: JUSUF: Modular tier-2 supercomputing and cloud infrastructure at j¨ ulich supercomputing centre. Journal of large-scale research facilities JLSRF 7(2021) https://doi.org/10.17815/jlsrf-7-179
-
[31]
Brunel, N.: Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. Journal of Computational Neuroscience8(3), 183–208 (2000) https://doi.org/10.1023/a:1008925309027
-
[32]
Frontiers in Neurosciencevolume 5 - 2011(2011) https: //doi.org/10.3389/fnins.2011.00032
Boucsein, C., Nawrot, M., Schnepel, P., Aertsen, A.: Beyond the cortical column: Abundance and physiology of horizontal connections imply a strong role for inputs from the surround. Frontiers in Neurosciencevolume 5 - 2011(2011) https: //doi.org/10.3389/fnins.2011.00032
-
[33]
PLOS Biology20(3), 3001575 (2022) https://doi.org/10.1371/journal.pbio.3001575
Rosen, B.Q., Halgren, E.: An estimation of the absolute number of axons indicates that human cortical areas are sparsely connected. PLOS Biology20(3), 3001575 (2022) https://doi.org/10.1371/journal.pbio.3001575
-
[34]
PLOS Computational Biology18(9), 1010086 (2022) https://doi.org/10.1371/journal.pcbi.1010086
Senk, J., Kriener, B., Djurfeldt, M., Voges, N., Jiang, H.-J., Sch¨ uttler, L., Gramelsberger, G., Diesmann, M., Plesser, H.E., Albada, S.J.: Connectivity concepts in neuronal network modeling. PLOS Computational Biology18(9), 1010086 (2022) https://doi.org/10.1371/journal.pcbi.1010086
-
[35]
NVIDIA Corporation: CUDA Toolkit Documentation. (2024). Version 12.5. https: //developer.nvidia.com/cuda-toolkit 28
work page 2024
-
[36]
Cerebral Cortex 24(3), 785–806 (2014) https://doi.org/10.1093/cercor/bhs358
Potjans, T.C., Diesmann, M.: The cell-type specific cortical microcircuit: Relating structure and activity in a full-scale spiking network model. Cerebral Cortex 24(3), 785–806 (2014) https://doi.org/10.1093/cercor/bhs358
-
[37]
Multi-Scale Modeling in Morphogenesis: A Critical Analysis of the Cellular Potts Model
Schuecker, J., Schmidt, M., Albada, S.J., Diesmann, M., Helias, M.: Fundamen- tal activity constraints lead to specific interpretations of the connectome. PLOS Computational Biology13(2), 1005179 (2017) https://doi.org/10.1371/journal. pcbi.1005179
-
[38]
John Wiley & Sons, Inc., USA (1990)
Martello, S., Toth, P.: Knapsack Problems: Algorithms and Computer Implemen- tations. John Wiley & Sons, Inc., USA (1990)
work page 1990
-
[39]
Frontiers in Neuroinformatics6(2012) https: //doi.org/10.3389/fninf.2012.00026
Helias, M., Kunkel, S., Masumoto, G., Igarashi, J., Eppler, J.M., Ishii, S., Fukai, T., Morrison, A., Diesmann, M.: Supercomputers ready for use as dis- covery machines for neuroscience. Frontiers in Neuroinformatics6(2012) https: //doi.org/10.3389/fninf.2012.00026
-
[40]
Brain98(1), 81–90 (1975) https://doi.org/10.1093/ brain/98.1.81
CRAGG, B.G.: The density of synapses and neurons in normal, mentally defective and ageing human brains. Brain98(1), 81–90 (1975) https://doi.org/10.1093/ brain/98.1.81
work page 1975
-
[41]
Alonso-Nanclares, L., Gonzalez-Soriano, J., Rodriguez, J.R., DeFelipe, J.: Gen- der differences in human cortical synaptic density. Proceedings of the National Academy of Sciences105(38), 14615–14619 (2008) https://doi.org/10.1073/pnas. 0803652105
-
[42]
Frontiers in Neuroscience15(2021) https://doi.org/10.3389/fnins.2021.757790
Dasbach, S., Tetzlaff, T., Diesmann, M., Senk, J.: Dynamical characteristics of recurrent neuronal networks are robust against low synaptic weight resolution. Frontiers in Neuroscience15(2021) https://doi.org/10.3389/fnins.2021.757790
-
[43]
Waskom, M.L.: seaborn: statistical data visualization. Journal of Open Source Software6(60), 3021 (2021) https://doi.org/10.21105/joss.03021
-
[44]
The Annals of Mathematical Statistics27(3), 832–837 (1956) https://doi.org/10
Rosenblatt, M.: Remarks on some nonparametric estimates of a density function. The Annals of Mathematical Statistics27(3), 832–837 (1956) https://doi.org/10. 1214/aoms/1177728190
-
[45]
Parzen, E.: On estimation of a probability density function and mode. The Annals of Mathematical Statistics33(3), 1065–1076 (1962) https://doi.org/10. 1214/aoms/1177704472
-
[46]
Chapman and Hall, London (1986)
Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Chapman and Hall, London (1986)
work page 1986
-
[47]
Virtanen, P., Gommers, R., Oliphant, T.E., Haberland, M., Reddy, T., Courna- peau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S.J., Brett, M., Wilson, J., Millman, K.J., Mayorov, N., Nelson, A.R.J., Jones, 29 E., Kern, R., Larson, E., Carey, C.J., Polat, ˙I., Feng, Y., Moore, E.W., Vander- Plas, J., Laxalde, D., Perktold, J.,...
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.