arXiv: 2209.03358 · v5 · pith:XBBHUPYUnew · submitted 2022-09-07 · cs.NE · cs.AI · cs.CR · cs.CV · cs.LG

Attacking the Spike: On the Transferability and Security of Spiking Neural Networks to Adversarial Examples

Pith reviewed 2026-05-24 10:37 UTC · model grok-4.3

classification: cs.NE, cs.AI, cs.CR, cs.CV, cs.LG
keywords: adversarial attacks, spiking neural networks, surrogate gradient estimators, transferability, model robustness, ensemble attacks, MDSE

The pith

MDSE dynamically mixes surrogate gradient estimators to create adversarial examples that transfer across spiking neural networks and other models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that white-box attacks on spiking neural networks succeed or fail depending on the surrogate gradient estimator used during the attack, even when the networks have received adversarial training. It identifies two gaps: existing attacks do not combine multiple estimators, and no single-model attack reliably crosses from SNNs to CNNs or vision transformers. MDSE fills these gaps with a dynamic switching scheme that draws on several estimators at once. A sympathetic reader would care because SNNs are promoted for low-energy hardware, so weaknesses in their adversarial robustness affect whether they can be deployed safely. If the claim holds, current attack benchmarks and defense evaluations for SNNs must incorporate mixed-estimator strategies.

Core claim

Successful white-box adversarial attacks on SNNs are highly dependent on the underlying surrogate gradient estimator, even for adversarially trained SNNs. No existing white-box attack exploits multiple surrogate gradient estimators for SNNs, and no single-model attack reliably generates adversarial examples that simultaneously fool both SNN and non-SNN models. MDSE uses a dynamic gradient estimation scheme to exploit multiple surrogate gradient estimator functions and generates adversarial examples capable of fooling SNN and non-SNN models simultaneously. Compared to conventional white-box attacks such as Auto-PGD, it is reported to be up to 91.4 percent more effective on SNN/ViT ensembles and to deliver a 3x boost on adversarially trained SNN ensembles.

What carries the argument

The Mixed Dynamic Spiking Estimation (MDSE) attack, which applies a dynamic scheme to switch among multiple surrogate gradient estimators while crafting adversarial examples.
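
To make the idea of switching among estimators concrete, here is a minimal sketch in plain Python. It is not the paper's Algorithm 1: the toy single-layer spiking scorer, the three surrogate shapes and their constants, and the greedy "keep whichever estimator's step lowers the true score most" rule are all assumptions chosen for illustration, and the names `SURROGATES` and `mixed_estimator_attack` are hypothetical.

```python
import math

# Surrogate derivatives standing in for d(spike)/d(membrane potential).
# Shapes and constants are illustrative assumptions, not the paper's settings.
def sigmoid_grad(u, k=4.0):
    s = 1.0 / (1.0 + math.exp(-k * u))
    return k * s * (1.0 - s)

def arctan_grad(u, a=2.0):
    return (a / 2.0) / (1.0 + (math.pi / 2.0 * a * u) ** 2)

def rect_grad(u, width=1.0):
    return 1.0 / width if abs(u) < width / 2.0 else 0.0

SURROGATES = {"sigmoid": sigmoid_grad, "arctan": arctan_grad, "rect": rect_grad}

def membrane(x, V, theta):
    # Membrane potential of each hidden neuron, relative to threshold theta.
    return [sum(v * xi for v, xi in zip(row, x)) - theta for row in V]

def score(x, V, w, theta):
    # Forward pass with the true, non-differentiable spike (Heaviside step).
    return sum(wj for wj, u in zip(w, membrane(x, V, theta)) if u > 0)

def input_grad(x, V, w, theta, dspike):
    # Backward pass: the step's derivative is replaced by the surrogate dspike.
    u = membrane(x, V, theta)
    return [sum(wj * dspike(uj) * row[i] for wj, uj, row in zip(w, u, V))
            for i in range(len(x))]

def sign(g):
    return (g > 0) - (g < 0)

def mixed_estimator_attack(x0, V, w, theta, eps=0.1, alpha=0.02, steps=10):
    """Greedy switching (an assumed rule): at every step, take one signed
    gradient step per surrogate and keep the candidate with the lowest
    true score. Ties go to the first surrogate in SURROGATES."""
    x = list(x0)
    best_x, best_score = list(x0), score(x0, V, w, theta)
    used = []
    for _ in range(steps):
        candidates = []
        for name, dspike in SURROGATES.items():
            g = input_grad(x, V, w, theta, dspike)
            # Step against the score, then project into the L-inf ball and [0, 1].
            cand = [min(max(xi - alpha * sign(gi), max(x0i - eps, 0.0)),
                        min(x0i + eps, 1.0))
                    for xi, gi, x0i in zip(x, g, x0)]
            candidates.append((score(cand, V, w, theta), name, cand))
        s, name, x = min(candidates, key=lambda c: c[0])
        used.append(name)
        if s < best_score:
            best_x, best_score = list(x), s
    return best_x, best_score, used
```

The returned `used` list records which surrogate was selected at each step, which is the kind of trace one would inspect to see whether switching actually happens on a given model.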

If this is right

  • White-box attacks on SNNs must combine multiple surrogate gradient estimators to reach peak effectiveness.
  • Adversarially trained SNNs remain vulnerable when the attacker can switch estimators during example generation.
  • Adversarial examples produced by MDSE transfer to both SNNs and non-SNN models on CIFAR-10, CIFAR-100, and ImageNet.
  • Single-estimator attacks leave measurable gaps in coverage when models from different families are grouped into ensembles.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Training SNNs with randomized or hidden surrogate gradient choices could reduce the advantage of dynamic attacks.
  • Shared vulnerabilities between SNNs and vision transformers may point to common decision-boundary features that future defenses could target.
  • The dynamic scheme could be tested on additional spiking architectures or neuromorphic chips to check whether the transfer gains persist beyond the nineteen models studied.

Load-bearing premise

An attacker can implement the dynamic switching among surrogate gradient estimators without knowing in advance which estimator the target SNN used during training.

What would settle it

The claim would be undercut if MDSE showed no improvement over single-estimator attacks such as Auto-PGD when evaluated on an SNN trained with a surrogate gradient estimator outside the set used to build MDSE.
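
The bookkeeping behind such a test is simple and can be sketched as follows. This is an assumed evaluation harness, not something taken from the paper or its repository: the function names, model names, and estimator names are hypothetical placeholders.

```python
def split_targets(training_surrogate, attack_estimators):
    """Partition target models by whether the attacker's estimator set
    already contains the surrogate each model was trained with.

    training_surrogate: dict mapping model name -> training surrogate name
    attack_estimators:  iterable of surrogate names available to the attack
    """
    known = set(attack_estimators)
    in_set = sorted(m for m, s in training_surrogate.items() if s in known)
    held_out = sorted(m for m, s in training_surrogate.items() if s not in known)
    return in_set, held_out

def mean_gain(mixed_rates, baseline_rates, models):
    """Average attack-success gain of the mixed attack over a baseline,
    restricted to the given models. Returns NaN for an empty group."""
    gains = [mixed_rates[m] - baseline_rates[m] for m in models]
    return sum(gains) / len(gains) if gains else float("nan")

# Hypothetical model and estimator names, for illustration only.
targets = {"snn_a": "arctan", "snn_b": "sigmoid", "snn_c": "triangle"}
in_set, held_out = split_targets(targets, ["arctan", "sigmoid"])
print(in_set, held_out)
```

Reporting `mean_gain` separately for the two groups would show whether any advantage survives on the held-out models, which is the condition the premise above depends on.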

Figures

Figures reproduced from arXiv: 2209.03358 by Caiwen Ding, Ethan Rathbun, Haowen Fang, Kaleel Mahmood, Nuo Xu, Wujie Wen.

Figure 1: White-box attack on SNN models using different surrogate gradients. (a) (b) (c) indicate results on CIFAR-10, (d) ...
Figure 2: Visual representation of transferability re...
Figure 3: Different surrogate gradient functions.

SNN Energy Efficiency (table text captured alongside Figure 3)

Architecture | Dataset | Normalized ANN #OP | Normalized SNN #OP | ANN/SNN Energy
SEW-ResNet | CIFAR 10 | 1 | 0.4052 | 12.61
SEW-ResNet | CIFAR 100 | 1 | 0.5788 | 8.83
SEW-ResNet | ImageNet | 1 | 0.5396 | 9.47
Vanilla Spiking ResNet | ImageNet | 1 | 0.6776 | 7.54
Transfer Spiking VGG 16 | ImageNet | 1 | 2.868 | 1.78

Figure 4: Accuracy of SEW ResNet on CIFAR 100 with re...
Figure 5: Accuracy of Vanilla Spiking ResNet on ImageNet
Figure 6: Visual representation of Table 20 for CIFAR-100. The x-axis corresponds to the model used to generate the adver...
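
The ANN/SNN Energy column in the SNN Energy Efficiency table captured with Figure 3 can be checked arithmetically. Every listed ratio is consistent with dividing a fixed per-operation energy ratio of about 5.11 by the normalized SNN operation count. That constant would match the commonly cited 45 nm estimates of 4.6 pJ per multiply-accumulate (ANN) and 0.9 pJ per accumulate (SNN), but this excerpt does not state which constants the authors used, so the two values below are an inference from the numbers, not a quoted setting.

```python
# Assumed per-operation energies (pJ); inferred from the table, not stated in the excerpt.
E_MAC_PJ = 4.6  # multiply-accumulate, used by the ANN
E_AC_PJ = 0.9   # accumulate only, used by the SNN

# (architecture, dataset, normalized SNN #OP, reported ANN/SNN energy)
ROWS = [
    ("SEW-ResNet", "CIFAR 10", 0.4052, 12.61),
    ("SEW-ResNet", "CIFAR 100", 0.5788, 8.83),
    ("SEW-ResNet", "ImageNet", 0.5396, 9.47),
    ("Vanilla Spiking ResNet", "ImageNet", 0.6776, 7.54),
    ("Transfer Spiking VGG 16", "ImageNet", 2.868, 1.78),
]

def ann_to_snn_energy(snn_ops, ann_ops=1.0):
    """Energy ratio = (ANN ops * energy per MAC) / (SNN ops * energy per AC)."""
    return (ann_ops * E_MAC_PJ) / (snn_ops * E_AC_PJ)

for arch, dataset, snn_ops, reported in ROWS:
    computed = ann_to_snn_energy(snn_ops)
    print(f"{arch:24s} {dataset:10s} computed={computed:.2f} reported={reported}")
```

Under this reading, the last row shows that an SNN can perform more operations than its ANN counterpart (2.868 vs. 1) and still come out ahead on energy, because each accumulate is assumed to cost far less than a multiply-accumulate.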
Original abstract

Spiking neural networks (SNNs) have attracted much attention for their high energy efficiency and recent advances in classification performance. However, unlike traditional deep learning approaches, the study of SNN robustness to adversarial examples remains relatively underdeveloped. In this work, we advance the adversarial attack side of SNNs through three contributions. First, we show that successful white-box adversarial attacks on SNNs are highly dependent on the underlying surrogate gradient estimator, even for adversarially trained SNNs. Second, using the best single surrogate gradient estimator, we analyze the transferability of adversarial attacks across SNNs, Vision Transformers (ViTs) and CNNs. Our analysis reveals two key gaps: no existing white-box attack exploits multiple surrogate gradient estimators for SNNs, and no single-model attack reliably generates adversarial examples that simultaneously fool both SNN and non-SNN models. For our third contribution, we develop the Mixed Dynamic Spiking Estimation (MDSE) attack to address these issues. MDSE uses a dynamic gradient estimation scheme to fully exploit multiple surrogate gradient estimator functions and generates adversarial examples capable of fooling SNN and non-SNN models simultaneously. MDSE is up to 91.4% more effective on SNN/ViT model ensembles and provides a 3x boost on adversarially trained SNN ensembles compared to conventional white-box attacks like Auto-PGD. Experiments cover three datasets (CIFAR-10, CIFAR-100, ImageNet) and nineteen classifier models (seven per CIFAR dataset, five for ImageNet). Our implementation of MDSE and the evaluated models is publicly available at https://github.com/nuoxuxxx/attacking-the-spike-mdse.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that successful white-box adversarial attacks on spiking neural networks (SNNs) depend strongly on the choice of surrogate gradient estimator (even for adversarially trained models), that existing attacks show limited transferability across SNNs, Vision Transformers (ViTs), and CNNs, and introduces the Mixed Dynamic Spiking Estimation (MDSE) attack. MDSE employs a dynamic switching scheme over multiple surrogate estimators to generate examples that simultaneously fool SNN and non-SNN models, reporting up to 91.4% higher effectiveness on SNN/ViT ensembles and a 3x boost on adversarially trained SNN ensembles relative to Auto-PGD. Experiments span CIFAR-10, CIFAR-100, and ImageNet with 19 models; code is released.

Significance. If the MDSE results hold under the condition of unknown surrogates, the work would meaningfully advance SNN security research by demonstrating a practical multi-estimator attack that bridges SNN and non-SNN architectures, with the public code release providing a clear reproducibility strength.

major comments (2)
  1. [§3.3, Algorithm 1] §3.3 and Algorithm 1: The dynamic switching scheme selects among a fixed set of estimators (e.g., SLAYER, STBP). The manuscript must explicitly state, for each target SNN in Tables 2–4, whether the MDSE instance used during attack already contained that model’s exact training surrogate; otherwise the headline transferability claims (91.4 % gain, 3× boost) do not demonstrate robustness to an attacker lacking prior knowledge of the target’s training estimator.
  2. [Abstract, §4] Abstract and §4 (results tables): The reported quantitative improvements lack error bars, standard deviations across random seeds, or statistical significance tests. Without these, it is impossible to assess whether the stated gains are reliable given the known variability of adversarial success rates.
minor comments (2)
  1. The exact composition of the 19 models (seven per CIFAR dataset, five for ImageNet) and the precise attack hyperparameters (step size, number of iterations, surrogate set size) should be tabulated in the main text rather than left to the appendix or code.
  2. [§4] Figure captions in §4 could more explicitly describe the ensemble construction and the precise metric used for “effectiveness.”

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and constructive suggestions. Below we address each major comment in detail. We plan to incorporate revisions as indicated to improve the clarity and rigor of our work.

  1. Referee: [§3.3, Algorithm 1] §3.3 and Algorithm 1: The dynamic switching scheme selects among a fixed set of estimators (e.g., SLAYER, STBP). The manuscript must explicitly state, for each target SNN in Tables 2–4, whether the MDSE instance used during attack already contained that model’s exact training surrogate; otherwise the headline transferability claims (91.4 % gain, 3× boost) do not demonstrate robustness to an attacker lacking prior knowledge of the target’s training estimator.

    Authors: We agree that this clarification is essential for properly interpreting the transferability claims. In the revised manuscript, we will update §3.3 and the description of Algorithm 1 to explicitly state, for each target SNN in Tables 2–4, whether the fixed set of estimators used by MDSE includes that model's training surrogate. This will allow readers to assess the extent to which the results demonstrate robustness to unknown surrogates. revision: yes

  2. Referee: [Abstract, §4] Abstract and §4 (results tables): The reported quantitative improvements lack error bars, standard deviations across random seeds, or statistical significance tests. Without these, it is impossible to assess whether the stated gains are reliable given the known variability of adversarial success rates.

    Authors: We appreciate the referee's emphasis on statistical rigor. We will revise the abstract and §4 (including the results tables) to include error bars representing standard deviations across multiple random seeds for the key quantitative results. We will also report the outcomes of statistical significance tests to substantiate the reported gains. revision: yes
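
The kind of reporting promised in the second response can be sketched with the standard library alone. The rebuttal does not say which significance test would be used; the exact paired sign-flip permutation test below is one assumed choice, the function names are hypothetical, and the success rates are synthetic placeholders rather than results from the paper.

```python
from itertools import product
from statistics import mean, stdev

def summarize(rates):
    """Mean and sample standard deviation of success rates across seeds."""
    return mean(rates), stdev(rates)

def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on per-seed differences.

    Under the null hypothesis that both attacks perform equally, each
    per-seed difference is equally likely to carry either sign.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = [s * d for s, d in zip(signs, diffs)]
        if abs(mean(flipped)) >= observed - 1e-12:
            hits += 1
    return hits / total

# Synthetic placeholder success rates for five seeds (NOT from the paper).
mixed = [0.90, 0.91, 0.92, 0.90, 0.93]
baseline = [0.50, 0.52, 0.49, 0.51, 0.50]

m, s = summarize(mixed)
print(f"mixed: {m:.3f} +/- {s:.3f}")
print("p-value:", paired_sign_flip_pvalue(mixed, baseline))
```

With only five seeds the smallest attainable two-sided p-value is 2/32, so a handful of seeds cannot by itself support a strong significance claim, however large the gap looks.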

Circularity Check

0 steps flagged

No significant circularity; empirical attack evaluation is independent of fitted results.

The paper proposes MDSE as a dynamic attack exploiting multiple surrogate gradient estimators and evaluates its effectiveness via direct measurement of attack success rates on 19 external models across three datasets. No derivation chain, fitted parameters renamed as predictions, self-definitional relations, or load-bearing self-citations appear in the abstract or described contributions. The central claims rest on experimental transferability results rather than any reduction to the paper's own inputs by construction. This is the expected outcome for an empirical security paper whose evaluation benchmarks are external to the method definition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical attack paper with no mathematical axioms, free parameters beyond standard adversarial ML hyperparameters, or invented entities.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 5 internal anchors

  1. [1]

    Abnar, S.; and Zuidema, W. 2020. Quantifying Attention Flow in Transformers. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 4190--4197

  2. [2]

    Athalye, A.; Carlini, N.; and Wagner, D. 2018. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. In Proceedings of the 35th International Conference on Machine Learning, 274--283

  3. [3]

    Bellec, G.; Salaj, D.; Subramoney, A.; Legenstein, R.; and Maass, W. 2018. Long short-term memory and learning-to-learn in networks of spiking neurons. Advances in neural information processing systems, 31

  4. [4]

    Bengio, Y.; Léonard, N.; and Courville, A. 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432

  5. [5]

    Carlini, N.; and Wagner, D. 2017. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), 39--57. IEEE

  6. [6]

    Davies, M.; Srinivasa, N.; Lin, T.-H.; Chinya, G.; Cao, Y.; Choday, S. H.; Dimou, G.; Joshi, P.; Imam, N.; Jain, S.; et al. 2018. Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro, 38(1): 82--99

  7. [7]

    Davies, M.; Wild, A.; Orchard, G.; Sandamirskaya, Y.; Guerra, G. A. F.; Joshi, P.; Plank, P.; and Risbud, S. R. 2021. Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE, 109(5): 911--934

  8. [8]

    Diehl, P. U.; Neil, D.; Binas, J.; Cook, M.; Liu, S.-C.; and Pfeiffer, M. 2015. Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In 2015 International Joint Conference on Neural Networks (IJCNN), 1--8. IEEE

  9. [9]

    Dong, Y.; Liao, F.; Pang, T.; Su, H.; Zhu, J.; Hu, X.; and Li, J. 2018. Boosting adversarial attacks with momentum. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), 9185--9193

  10. [10]

    Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. 2020. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations

  11. [11]

    El-Allami, R.; Marchisio, A.; Shafique, M.; and Alouani, I. 2021. Securing deep spiking neural networks against adversarial attacks through inherent structural parameters. In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), 774--779. IEEE

  12. [12]

    Fang, H.; Shrestha, A.; Zhao, Z.; and Qiu, Q. 2020. Exploiting neuron and synapse filter dynamics in spatial temporal learning of deep spiking neural network. In 29th International Joint Conference on Artificial Intelligence, IJCAI 2020, 2799--2806. International Joint Conferences on Artificial Intelligence

  13. [13]

    Fang, W.; Yu, Z.; Chen, Y.; Huang, T.; Masquelier, T.; and Tian, Y. 2021a. Deep residual learning in spiking neural networks. Advances in Neural Information Processing Systems, 34: 21056--21069

  14. [14]

    Fang, W.; Yu, Z.; Chen, Y.; Masquelier, T.; Huang, T.; and Tian, Y. 2021b. Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2661--2671

  15. [15]

    Goodfellow, I.; Shlens, J.; and Szegedy, C. 2015. Explaining and Harnessing Adversarial Examples. In International Conference on Learning Representations

  16. [16]

    Goodfellow, I. J.; Shlens, J.; and Szegedy, C. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572

  17. [17]

    He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770--778

  18. [18]

    Kolesnikov, A.; Beyer, L.; Zhai, X.; Puigcerver, J.; Yung, J.; Gelly, S.; and Houlsby, N. 2020. Big Transfer (BiT): General Visual Representation Learning. Lecture Notes in Computer Science, 491--507

  19. [19]

    Kugele, A.; Pfeil, T.; Pfeiffer, M.; and Chicca, E. 2020. Efficient processing of spatio-temporal data streams with spiking neural networks. Frontiers in Neuroscience, 14: 439

  20. [20]

    Kundu, S.; Pedram, M.; and Beerel, P. A. 2021. Hire-snn: Harnessing the inherent robustness of energy-efficient deep spiking neural networks by training with crafted input noise. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 5209--5218

  21. [21]

    Liu, Y.; Chen, X.; Liu, C.; and Song, D. 2016. Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770

  22. [22]

    Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; and Vladu, A. 2018. Towards Deep Learning Models Resistant to Adversarial Attacks. In International Conference on Learning Representations

  23. [23]

    Mahmood, K.; Gurevin, D.; van Dijk, M.; and Nguyen, P. 2021. Beware the Black-Box: On the Robustness of Recent Defenses to Adversarial Examples. Entropy, 23: 1359

  24. [24]

    Mahmood, K.; Mahmood, R.; and Van Dijk, M. 2021. On the robustness of vision transformers to adversarial examples. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 7838--7847

  25. [25]

    Neftci, E. O.; Mostafa, H.; and Zenke, F. 2019. Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine, 36(6): 51--63

  26. [26]

    Rathi, N.; and Roy, K. 2021. DIET-SNN: A low-latency spiking neural network with direct input encoding and leakage and threshold optimization. IEEE Transactions on Neural Networks and Learning Systems

  27. [27]

    Rueckauer, B.; Bybee, C.; Goettsche, R.; Singh, Y.; Mishra, J.; and Wild, A. 2022. NxTF: An API and compiler for deep spiking neural networks on Intel Loihi. ACM Journal on Emerging Technologies in Computing Systems (JETC), 18(3): 1--22

  28. [28]

    Sharmin, S.; Panda, P.; Sarwar, S. S.; Lee, C.; Ponghiran, W.; and Roy, K. 2019. A comprehensive analysis on adversarial robustness of spiking neural networks. In 2019 International Joint Conference on Neural Networks (IJCNN), 1--8. IEEE

  29. [29]

    Sharmin, S.; Rathi, N.; Panda, P.; and Roy, K. 2020. Inherent adversarial robustness of deep spiking neural networks: Effects of discrete input encoding and non-linear activations. In European Conference on Computer Vision, 399--414. Springer

  30. [30]

    Shrestha, A.; Fang, H.; Mei, Z.; Rider, D. P.; Wu, Q.; and Qiu, Q. 2022. A Survey on Neuromorphic Computing: Models and Hardware. IEEE Circuits and Systems Magazine, 22(2): 6--35

  31. [31]

    Shrestha, S. B.; and Orchard, G. 2018. Slayer: Spike layer error reassignment in time. Advances in neural information processing systems, 31

  32. [32]

    Simonyan, K.; and Zisserman, A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

  33. [33]

    Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; and Fergus, R. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199

  34. [34]

    Tang, G.; Kumar, N.; and Michmizos, K. P. 2020. Reinforcement co-learning of deep and spiking neural networks for energy-efficient mapless navigation with neuromorphic hardware. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6090--6097. IEEE

  35. [35]

    Tramer, F.; Carlini, N.; Brendel, W.; and Madry, A. 2020. On adaptive attacks to adversarial example defenses. Advances in Neural Information Processing Systems, 33: 1633--1645

  36. [36]

    Wu, Y.; Deng, L.; Li, G.; Zhu, J.; and Shi, L. 2018. Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in neuroscience, 12: 331

  37. [37]

    Wu, Y.; Deng, L.; Li, G.; Zhu, J.; Xie, Y.; and Shi, L. 2019. Direct training for spiking neural networks: Faster, larger, better. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 1311--1318

  38. [38]

    Zenke, F.; and Ganguli, S. 2018. Superspike: Supervised learning in multilayer spiking neural networks. Neural computation, 30(6): 1514--1541

  39. [39]

    Zenke, F.; and Vogels, T. P. 2021. The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural computation, 33(4): 899--925

  40. [40]

    Zhang, J.; Xu, X.; Han, B.; Niu, G.; Cui, L.; Sugiyama, M.; and Kankanhalli, M. 2020. Attacks which do not kill training make adversarial learning stronger. In International conference on machine learning, 11278--11287. PMLR
