SiLIF: Structured State Space Model Dynamics and Parametrization for Spiking Neural Networks

Emre Neftci; Lyubov Dudchenko; Maxime Fabre; Younes Bouhadjar

arxiv: 2506.06374 · v4 · submitted 2025-06-04 · 💻 cs.NE

Pith reviewed 2026-05-19 11:01 UTC · model grok-4.3

keywords: spiking neural networks, state space models, leaky integrate-and-fire, speech recognition, event-based processing, gradient stability, neuromorphic computing, synaptic delays

The pith

Two SiLIF neuron models inspired by state space models achieve new state-of-the-art performance among spiking models on speech recognition tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces two variants of leaky integrate-and-fire neurons that borrow discretization and initialization techniques from state space models to stabilize gradient flow during training. The first variant adds a learnable timestep with logarithmic reparametrization to a two-state neuron, while the second incorporates complex-state structures to support oscillatory dynamics. These changes target the instability that has limited scaling of multi-state spiking neurons on long sequences. A sympathetic reader would care because the resulting models deliver superior accuracy on event-based and raw-audio datasets while using less computation than non-spiking alternatives.

Core claim

The authors show that extending two-state spiking neurons with a learnable discretization timestep and logarithmic reparametrization, and further embedding the initialization and structure of complex-state SSMs to enable oscillatory regimes, produces stable gradients and new state-of-the-art results among spiking neuron models on both event-based and raw-audio speech recognition datasets.
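As a rough illustration of what an oscillatory regime means for a complex state, the sketch below samples eigenvalues λ = r·exp(iθ) inside the unit disk. The function name `init_complex_eigenvalues`, the radius bounds, and the ring-style sampling are assumptions made for illustration, borrowed loosely from common SSM initialization practice; they are not taken from the paper.

```python
import cmath
import math
import random


def init_complex_eigenvalues(n, r_min=0.9, r_max=0.999, seed=0):
    """Sample n eigenvalues lambda = r * exp(i * theta) inside the unit disk.

    Assumption: ring-style initialization with radius in [r_min, r_max]
    and phase uniform in [0, 2*pi). Defaults are hypothetical.
    """
    rng = random.Random(seed)
    eigs = []
    for _ in range(n):
        # Sampling r^2 uniformly gives points uniform in area on the ring.
        r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
        theta = rng.uniform(0.0, 2.0 * math.pi)
        eigs.append(cmath.rect(r, theta))
    return eigs


eigs = init_complex_eigenvalues(8)
for lam in eigs:
    # |lambda| < 1 keeps the linear recurrence u_t = lambda * u_{t-1} decaying;
    # the phase sets how fast the state rotates, i.e. the oscillation frequency.
    print(f"|lambda| = {abs(lam):.4f}   phase = {cmath.phase(lam):+.3f} rad")
```

A phase close to zero behaves like a plain leaky integrator, while larger phases give a state that rings at a characteristic frequency before decaying.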

What carries the argument

The SiLIF models, which apply SSM-style learnable discretization timestep and logarithmic reparametrization to the recurrent dynamics of two-state and complex-state leaky integrate-and-fire neurons.
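The idea of combining a learnable timestep with a logarithmic reparametrization can be sketched as a mapping from unconstrained parameters to a decay factor. The names `lambda_log` and `dt_log`, the choice to store the timestep in log space, and the exponential discretization are assumptions for illustration; the paper's exact equations may differ.

```python
import math


def decay_from_log_params(lambda_log, dt_log):
    """Map two unconstrained real parameters to a decay factor.

    lam   = exp(lambda_log) > 0   (logarithmic reparametrization)
    dt    = exp(dt_log)     > 0   (learnable timestep, assumed stored in log space)
    alpha = exp(-dt * lam)        (exact discretization of du/dt = -lam * u)
    """
    lam = math.exp(lambda_log)
    dt = math.exp(dt_log)
    return math.exp(-dt * lam)


# Any real-valued parameters give a decay strictly between 0 and 1
# (up to floating-point rounding for extreme values).
for lambda_log, dt_log in [(-3.0, 0.0), (0.0, 0.0), (3.0, -2.0)]:
    alpha = decay_from_log_params(lambda_log, dt_log)
    print(f"lambda_log={lambda_log:+.1f} dt_log={dt_log:+.1f} -> alpha={alpha:.4f}")
```

The design point is that a gradient step on `lambda_log` or `dt_log` can never push the decay outside the stable range, whereas a step on a directly parametrized decay can.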

If this is right

The models exhibit a favorable performance-efficiency trade-off relative to standard state space models.
With synaptic delays added, they surpass SSM accuracy at half the computational cost.
The parametrization supports more reliable scaling of spiking networks to longer audio and event sequences.
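The efficiency comparison above depends on how computation is counted. A minimal sketch, assuming one synaptic operation (SOP) per emitted spike per outgoing connection, is shown below; this counting rule, the function name, and all numbers are illustrative assumptions and are not figures from the paper.

```python
def synaptic_operations(spike_counts, fan_out):
    """Count SOPs as (spikes emitted by a layer) x (outgoing connections per neuron).

    spike_counts[i]: total spikes emitted by layer i over a sequence.
    fan_out[i]:      outgoing connections per neuron of layer i.
    """
    if len(spike_counts) != len(fan_out):
        raise ValueError("spike_counts and fan_out must have the same length")
    return sum(s * f for s, f in zip(spike_counts, fan_out))


# Hypothetical layer: 128 neurons, 100 timesteps, 5% firing rate -> 640 spikes.
n_spikes = 640
snn_sops = synaptic_operations([n_spikes], [128])

# A dense 128 x 128 layer evaluated at every one of the 100 timesteps.
dense_macs = 128 * 128 * 100

print(f"SNN SOPs: {snn_sops}, dense MACs: {dense_macs}, ratio: {snn_sops / dense_macs:.3f}")
```

Under this counting rule, cost scales with the firing rate, which is why sparse activations matter for the trade-off.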

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same discretization and reparametrization steps could be tested on other multi-state spiking neuron families to check for similar stability gains.
Hardware implementations might gain extra energy savings by combining the reported synaptic delays with the reduced state count.
Applying the oscillatory regime to visual or tactile event streams could reveal whether the benefits extend beyond audio tasks.

Load-bearing premise

The assumption that the SSM-inspired discretization, logarithmic reparametrization, and complex-state initialization will produce stable gradient flow through the spiking dynamics on the tested datasets without post-hoc tuning or dataset-specific adjustments.

What would settle it

Training the SiLIF models on the same speech recognition datasets and finding that gradients explode or performance falls below prior spiking baselines when the logarithmic reparametrization or complex initialization is removed would falsify the central claim.
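To make "gradients explode" concrete, the sketch below looks at the simplest possible case: a linear recurrence u_t = λ·u_{t-1}, whose sensitivity to the initial state after T steps is |λ|^T. This deliberately ignores the spike nonlinearity, resets, and surrogate gradients, so it is only a proxy for the behavior the test above would measure; the function name is hypothetical.

```python
def state_jacobian_norm(lam, steps):
    """Return |d u_T / d u_0| for the linear recurrence u_t = lam * u_{t-1}."""
    return abs(lam) ** steps


# Slightly inside, on, and slightly outside the unit circle over 1000 steps.
for lam in (0.99, 1.0, 1.01):
    print(f"|lambda| = {lam:.2f} -> sensitivity after 1000 steps: "
          f"{state_jacobian_norm(lam, 1000):.3e}")
```

An unconstrained parameter can drift past |λ| = 1 during training, at which point the sensitivity grows geometrically with sequence length; a parametrization that keeps |λ| below 1 by construction rules this case out.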

Figures

Figures reproduced from arXiv: 2506.06374 by Emre Neftci, Lyubov Dudchenko, Maxime Fabre, Younes Bouhadjar.

**Figure 3.** Test accuracy and synaptic operations (SOP) with standard deviation for different models

**Figure 4.** Neuronal dynamics of the (a) CadLIF and (b) SiLIF models pre-trained on the SSC task.

**Figure 5.** SHD test accuracy with standard deviation interval on 5 runs for models over depth with…

**Figure 6.** As described by Izhikevich [23], the RF neuron follows these neuronal dynamics: $u_t = u_{t-1} + \Delta t\,\big((\alpha_{\mathrm{real}} + i\,\alpha_{\mathrm{img}})\,u_{t-1} + I_t\big) - \theta\, s_{t-1}$, $s_t = (\mathrm{Re}(u_t) \geq \theta)$ (7), where $\alpha_{\mathrm{real}}$ and $\alpha_{\mathrm{img}}$ are the trainable parameters, obtained directly from the discrete form. As for the gap between the AdLIF and our SiLIF model, a first difference with the C-SiLIF is the parametrization of the model, as the C-SiLIF model focuses on trai…

**Figure 6.** Impact of incremental SSM-imported features on performance on the SHD dataset from…
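The resonate-and-fire update quoted in the excerpt above (Eq. 7) can be simulated directly. The sketch below follows that update term by term; the default parameter values and the input pulse are hypothetical choices made so that the example runs, not values from the paper.

```python
def simulate_rf(inputs, alpha_real=-0.1, alpha_img=1.0, dt=0.1, theta=1.0):
    """Simulate one resonate-and-fire neuron following Eq. (7).

    u_t = u_{t-1} + dt * ((alpha_real + i * alpha_img) * u_{t-1} + I_t) - theta * s_{t-1}
    s_t = 1 if Re(u_t) >= theta else 0
    """
    alpha = complex(alpha_real, alpha_img)
    u = 0j       # complex membrane state
    s_prev = 0   # spike emitted at the previous step
    spikes = []
    for i_t in inputs:
        u = u + dt * (alpha * u + i_t) - theta * s_prev
        s_prev = 1 if u.real >= theta else 0
        spikes.append(s_prev)
    return spikes


# Hypothetical input: a short current pulse followed by silence.
inputs = [1.5 if 10 <= t < 20 else 0.0 for t in range(200)]
spikes = simulate_rf(inputs)
print("spike count:", sum(spikes))
```

With a negative `alpha_real` the state decays, and `alpha_img` sets the rotation speed of the complex state, so the neuron responds most strongly to inputs near its resonant frequency.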

Original abstract

Multi-state spiking neurons combine sparse binary activations with rich second-order nonlinear recurrent dynamics, making them a promising alternative to standard deep learning models. However, gradient propagation through these dynamics often leads to instabilities that hinder scalability and performance. Inspired by the stable training and strong performance of state space models (SSMs) on long sequences, we introduce two SSM-inspired Leaky Integrate-and-Fire (SiLIF) neuron models. The first extends a two-state neuron with a learnable discretization timestep and logarithmic reparametrization, while the second additionally incorporates the initialization scheme and structure of complex-state SSMs, enabling oscillatory regimes. Our two SiLIF models achieve new state-of-the-art performance among spiking neuron models on both event-based and raw-audio speech recognition datasets. We further demonstrate a favorable performance-efficiency trade-off compared to SSMs, even surpassing them while using half the computational cost through the use of synaptic delays. Our code is available at https://github.com/Maxtimer97/SSM-inspired-LIF.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

SiLIF ports SSM-style learnable timestep and complex states to LIF neurons for claimed SOTA on speech tasks, but the gains need matched baseline tuning to hold up.

The main thing here is that the authors adapt stable discretization and complex-state tricks from SSMs to spiking LIF neurons. One variant adds a learnable timestep with log reparametrization; the second layers on complex initialization to support oscillatory regimes. The authors argue that this yields more stable gradient flow through the recurrent spiking dynamics than standard LIF setups. They report new SOTA numbers among spiking models on event-based and raw-audio speech datasets, plus an efficiency win over full SSMs, using synaptic delays to match or beat their accuracy at about half the compute. The public code is a practical plus for anyone who wants to test or extend it.

Referee Report

3 major / 2 minor

Summary. The paper introduces two SSM-inspired Leaky Integrate-and-Fire (SiLIF) neuron models. The first extends a two-state neuron with a learnable discretization timestep and logarithmic reparametrization; the second adds complex-state SSM initialization and structure to enable oscillatory regimes. The authors claim these models achieve new state-of-the-art performance among spiking neuron models on event-based and raw-audio speech recognition datasets, while providing a favorable performance-efficiency trade-off versus standard SSMs through the use of synaptic delays. Code is released.

Significance. If the performance gains hold under matched training conditions and the parametrizations demonstrably improve gradient stability without dataset-specific tuning, the work could usefully transfer SSM training techniques to spiking networks for long-sequence tasks. The explicit code release supports reproducibility.

major comments (3)

[§4, Tables 1-2] §4 (Experiments) and Tables 1-2: The SOTA claims among spiking neuron models on both datasets are load-bearing but rest on baseline comparisons whose details (hyperparameter search budget, network widths, optimizer settings, data splits, and whether baselines were re-tuned under the identical protocol) are not reported. Without this, it is impossible to attribute any delta to the SSM-inspired discretization and reparametrization rather than unmatched experimental conditions.
[§4.3] §4.3 and associated ablation text: No ablation isolates the individual contributions of the learnable timestep, logarithmic reparametrization, and complex-state initialization. The central claim that these SSM-inspired elements produce stable gradient flow and superior performance therefore lacks direct support; the extra degrees of freedom could explain the gains.
[Abstract, §3.2] Abstract and §3.2: The claim of stable training via the new parametrizations is not accompanied by any analysis or metrics of gradient norms, vanishing/exploding behavior, or training curves on the target datasets, leaving the weakest assumption untested.
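The ablation requested in the second major comment amounts to a full on/off grid over three components. A minimal sketch of enumerating that grid follows; the feature names are hypothetical labels, and some combinations may not correspond to a meaningful model in the authors' code.

```python
from itertools import product

# Hypothetical labels for the three SSM-inspired components under discussion.
FEATURES = ("learnable_timestep", "log_reparam", "complex_init")


def ablation_configs():
    """Return all 2^3 on/off combinations as a list of dicts."""
    return [
        dict(zip(FEATURES, flags))
        for flags in product((False, True), repeat=len(FEATURES))
    ]


for cfg in ablation_configs():
    enabled = [name for name, on in cfg.items() if on] or ["baseline"]
    print(" + ".join(enabled))
```

Running every configuration with the same seeds and training budget is what would let a performance delta be attributed to a single component rather than to added degrees of freedom.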

minor comments (2)

[Figure 2] Figure 2: Axis labels and legend entries for the complex-state variant are difficult to distinguish from the first SiLIF variant.
[§3.1] §3.1: The definition of the logarithmic reparametrization should explicitly state the range constraints applied to the learnable parameters to ensure positivity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and outline the revisions we will make to strengthen the experimental rigor and support for our claims.

Referee: [§4, Tables 1-2] §4 (Experiments) and Tables 1-2: The SOTA claims among spiking neuron models on both datasets are load-bearing but rest on baseline comparisons whose details (hyperparameter search budget, network widths, optimizer settings, data splits, and whether baselines were re-tuned under the identical protocol) are not reported. Without this, it is impossible to attribute any delta to the SSM-inspired discretization and reparametrization rather than unmatched experimental conditions.

Authors: We agree that insufficient detail on the baseline experimental conditions limits the strength of the SOTA claims. In the revised manuscript we will add a dedicated subsection in §4 that fully specifies the hyperparameter search budget (including ranges and number of trials), network widths, optimizer settings, data splits, and training protocol used for all models. We have re-trained the primary baselines (LIF, ALIF, and standard SSM variants) under this identical protocol using the released code, and the updated Tables 1-2 will report these matched results. This will allow readers to attribute performance differences to the proposed parametrizations. revision: yes
Referee: [§4.3] §4.3 and associated ablation text: No ablation isolates the individual contributions of the learnable timestep, logarithmic reparametrization, and complex-state initialization. The central claim that these SSM-inspired elements produce stable gradient flow and superior performance therefore lacks direct support; the extra degrees of freedom could explain the gains.

Authors: We acknowledge that the current ablations do not fully disentangle the three components. In the revision we will expand §4.3 with a systematic set of ablations that independently enable/disable the learnable timestep, the logarithmic reparametrization, and the complex-state initialization while holding all other factors fixed. Performance deltas on both datasets will be reported, directly addressing whether each SSM-inspired element contributes to the observed gains beyond the added degrees of freedom. revision: yes
Referee: [Abstract, §3.2] Abstract and §3.2: The claim of stable training via the new parametrizations is not accompanied by any analysis or metrics of gradient norms, vanishing/exploding behavior, or training curves on the target datasets, leaving the weakest assumption untested.

Authors: We agree that direct empirical evidence of gradient stability is needed to support the central motivation. We will add to §3.2 (and the supplementary material) plots of gradient norm statistics across training epochs for SiLIF versus standard LIF neurons on both the event-based and raw-audio datasets. Training curves for loss and accuracy will also be included to demonstrate convergence behavior. These additions will provide concrete metrics on vanishing/exploding gradients and stable training dynamics. revision: yes
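The gradient norm statistics promised in the last response could be collected with a helper along these lines. This is a framework-free sketch: gradients are represented as plain lists of floats, and the function names and sample values are hypothetical.

```python
import math


def global_grad_norm(grads):
    """L2 norm over all gradient entries, where grads is a list of flat lists."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def summarize(norms):
    """Min, max, and mean of the per-step gradient norms of one epoch."""
    return {"min": min(norms), "max": max(norms), "mean": sum(norms) / len(norms)}


# Hypothetical gradients for two parameter tensors at two training steps.
step_grads = [
    [[3.0], [4.0]],
    [[0.3, 0.4], [0.0]],
]
norms = [global_grad_norm(g) for g in step_grads]
print(summarize(norms))
```

Tracking the maximum as well as the mean matters here, since a few exploding steps can be hidden by an average over the epoch.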

Circularity Check

0 steps flagged

No circularity: empirical performance claims rest on new parametrizations and benchmarks, not self-referential fits or definitions.

The paper introduces two new neuron models (learnable timestep + log reparametrization; complex-state SSM initialization) and reports their empirical accuracy on speech datasets. No derivation chain reduces a claimed result to its own fitted inputs by construction, nor does any load-bearing premise collapse to a self-citation whose validity is presupposed. The SOTA claim is an experimental outcome rather than a mathematical identity or renamed known pattern; the models add explicit degrees of freedom whose effect is measured against baselines. This is the normal case of a self-contained empirical contribution.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard gradient-based training assumptions for recurrent spiking models plus the new learnable parameters; no new physical entities are postulated.

free parameters (2)

learnable discretization timestep
Introduced in the first SiLIF model to extend the two-state neuron.
logarithmic reparametrization
Used alongside the learnable timestep for stable training.

axioms (1)

domain assumption: Gradient propagation through spiking dynamics can be stabilized by SSM-style discretization and initialization without additional regularization.
Invoked to justify the choice of parametrization for avoiding instabilities.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · echoes
Passage: "we define it as a heterogeneous trainable parameter... logarithmic reparameterization... λα = exp(λα_log)... to enhance numerical stability"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · echoes
Passage: "eigenvalue distribution covering the whole unit circle... oscillatory dynamics"

echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

FiTS: Interpretable Spiking Neurons via Frequency Selectivity and Temporal Shaping
cs.NE 2026-05 unverdicted novelty 7.0

FiTS spiking neurons improve auditory task performance over LIF baselines by factorizing computation into frequency selectivity and group-delay-based temporal shaping, yielding interpretable per-neuron parameters.
Privacy-preserving fall detection at the edge using Sony IMX636 event-based vision sensor and Intel Loihi 2 neuromorphic processor
cs.NE 2025-11 unverdicted novelty 4.0

A neuromorphic edge system using event vision and sparse SNNs on Loihi 2 achieves up to 84% F1 score at 90 mW for privacy-preserving fall detection.

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · cited by 2 Pith papers · 3 internal anchors

[1] Jason Yik, Korneel Van den Berghe, Douwe den Blanken, Younes Bouhadjar, Maxime Fabre, Paul Hueber, Weijie Ke, Mina A Khoei, Denis Kleyko, Noah Pacik-Nelson, et al. The neurobench framework for benchmarking neuromorphic computing algorithms and systems. Nature Communications, 16(1):1545, 2025.
[2] Caterina Caccavella, Federico Paredes-Vallés, Marco Cannici, and Lyes Khacef. Low-power event-based face detection with asynchronous neuromorphic hardware. In 2024 International Joint Conference on Neural Networks (IJCNN), pages 1–10, 2024. doi: 10.1109/IJCNN60899.2024.10650843
[3] Iman Mirzadeh, Keivan Alizadeh, Sachin Mehta, Carlo C Del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, and Mehrdad Farajtabar. Relu strikes back: Exploiting activation sparsity in large language models, 2023. URL https://arxiv.org/abs/2310.04564
[4] Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, and Furu Wei. The era of 1-bit llms: All large language models are in 1.58 bits, 2024. URL https://arxiv.org/abs/2402.17764
[5] Jacques Kaiser, Hesham Mostafa, and Emre Neftci. Synaptic plasticity dynamics for deep continuous local learning (decolle). Frontiers in Neuroscience, 14:424, 2020. ISSN 1662453X. doi: 10.3389/fnins.2020.00424. URL https://www.frontiersin.org/article/10.3389/fnins.2020.00424
[6] Friedemann Zenke and Surya Ganguli. Superspike: Supervised learning in multilayer spiking neural networks. Neural computation, 30(6):1514–1541, 2018.
[7] Alexandre Bittar and Philip N. Garner. A surrogate gradient spiking baseline for speech command recognition. Frontiers in Neuroscience, 16, 2022. ISSN 1662453X. doi: 10.3389/fnins.2022.865897
[8] Luca Herranz-Celotti and Jean Rouat. Stabilizing spiking neuron training. arXiv preprint arXiv:2202.00282, 2022.
[9] Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces. In ICLR 2022 - 10th International Conference on Learning Representations, 2022.
[10] Ankit Gupta, Albert Gu, and Jonathan Berant. Diagonal state spaces are as effective as structured state spaces. In Advances in Neural Information Processing Systems, volume 35, 2022.
[11] Antonio Orvieto, Samuel L Smith, Albert Gu, Anushan Fernando, Caglar Gulcehre, Razvan Pascanu, and Soham De. Resurrecting Recurrent Neural Networks for Long Sequences. pages 1–30, 2023. URL http://arxiv.org/abs/2303.06349
[12] Nicolas Zucchet and Antonio Orvieto. Recurrent neural networks: vanishing and exploding gradients are not the end of the story, 2024. URL https://arxiv.org/abs/2405.21064
[13] Emre O. Neftci, Hesham Mostafa, and Friedemann Zenke. Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine, 36(6):51–63, 2019. doi: 10.1109/MSP.2019.2931595
[14] Jun Haeng Lee, Tobi Delbruck, and Michael Pfeiffer. Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience, 10, 2016. ISSN 1662453X. doi: 10.3389/fnins.2016.00508
[15] Tong Bu, Wei Fang, Jianhao Ding, Peng Lin Dai, Zhaofei Yu, and Tiejun Huang. Optimal ann-snn conversion for high-accuracy and ultra-low-latency spiking neural networks. In ICLR 2022 - 10th International Conference on Learning Representations, 2022.
[16] E.M. Izhikevich. Simple model of spiking neurons. IEEE Transactions on Neural Networks, 14(6):1569–1572, 2003.
[17] Romain Brette and Wulfram Gerstner. Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. Journal of Neurophysiology, 94, 2005. ISSN 00223077. doi: 10.1152/jn.00686.2005
[18] Wulfram Gerstner, Werner M Kistler, Richard Naud, and Liam Paninski. Neuronal dynamics: From single neurons to networks and models of cognition. Cambridge University Press, 2014.
[19] Darjan Salaj, Anand Subramoney, Ceca Kraisnikovic, Guillaume Bellec, Robert Legenstein, and Wolfgang Maass. Spike frequency adaptation supports network computations on temporally dispersed information. eLife, 10, 2021. ISSN 2050084X. doi: 10.7554/eLife.65459
[20] Bojian Yin, Federico Corradi, and Sander M. Bohté. Accurate and efficient time-domain classification with adaptive spiking recurrent neural networks. Nature Machine Intelligence, 3, 2021. ISSN 25225839. doi: 10.1038/s42256-021-00397-w
[21] Lucas Deckers, Laurens Van Damme, Ing Jyh Tsang, Werner Van Leekwijck, and Steven Latré. Co-learning synaptic delays, weights and adaptation in spiking neural networks, 2023. URL https://arxiv.org/abs/2311.16112
[22] Maximilian Baronig, Romain Ferrand, Silvester Sabathiel, and Robert Legenstein. Advancing spatiotemporal processing in spiking neural networks through adaptation, 2024. URL https://arxiv.org/abs/2408.07517
[23] Eugene M. Izhikevich. Resonate-and-fire neurons. Neural Networks, 14, 2001. ISSN 08936080. doi: 10.1016/S0893-6080(01)00078-8
[24] Badr AlKhamissi, Muhammad ElNokrashy, and David Bernal-Casas. Deep spiking neural networks with resonate-and-fire neurons, 2021. URL https://arxiv.org/abs/2109.08234
[25] E. Paxon Frady, Sophia Sanborn, Sumit Bam Shrestha, Daniel Ben Dayan Rubin, Garrick Orchard, Friedrich T. Sommer, and Mike Davies. Efficient neuromorphic signal processing with resonator neurons. Journal of Signal Processing Systems, 94, 2022. ISSN 19398115. doi: 10.1007/s11265-022-01772-5
[26] Saya Higuchi, Sebastian Kairat, Sander M. Bohte, and Sebastian Otte. Balanced resonate-and-fire neurons, 2024. URL https://arxiv.org/abs/2402.14603
[27] Thomas E Huber, Jules Lecomte, Borislav Polovnikov, and Axel von Arnim. Scaling up resonate-and-fire networks for fast deep learning. arXiv preprint arXiv:2504.00719, 2025.
[28] Sanja Karilanova, Maxime Fabre, Emre Neftci, and Ayça Özçelikkale. Zero-shot temporal resolution domain adaptation for spiking neural networks. arXiv preprint arXiv:2411.04760, 2024.
[29] Ilyass Hammouamri, Ismail Khalfaoui-Hassani, and Timothée Masquelier. Learning delays in spiking neural networks using dilated convolutions with learnable spacings, 2023. URL https://arxiv.org/abs/2306.17670
[30] Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Re. Hippo: Recurrent memory with optimal polynomial projections, 2020. URL https://arxiv.org/abs/2008.07669
[31] Jimmy T. H. Smith, Andrew Warrington, and Scott W. Linderman. Simplified state space layers for sequence modeling, 2023. URL https://arxiv.org/abs/2208.04933
[32] Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, and Donald Metzler. Long range arena: A benchmark for efficient transformers. arXiv preprint arXiv:2011.04006, 2020.
[33] Malyaban Bal and Abhronil Sengupta. P-spikessm: Harnessing probabilistic spiking state space models for long-range dependency tasks, 2024. URL https://arxiv.org/abs/2406.02923
[34] Yulong Huang, Zunchang Liu, Changchun Feng, Xiaopeng Lin, Hongwei Ren, Haotian Fu, Yue Zhou, Hong Xing, and Bojun Cheng. Prf: Parallel resonate and fire neuron for long sequence learning in spiking neural networks, 2024. URL https://arxiv.org/abs/2410.03530
[35] Mark Schöne, Neeraj Mohan Sushma, Jingyue Zhuge, Christian Mayr, Anand Subramoney, and David Kappel. Scalable event-by-event processing of neuromorphic sensory signals with deep state-space models, 2024. URL https://arxiv.org/abs/2404.18508
[36] Nikola Zubić, Mathias Gehrig, and Davide Scaramuzza. State space models for event cameras, 2024. URL https://arxiv.org/abs/2402.15584
[37] Yuval Ran-Milo, Eden Lumbroso, Edo Cohen-Karlik, Raja Giryes, Amir Globerson, and Nadav Cohen. Provable benefits of complex parameterizations for structured state space models. Advances in Neural Information Processing Systems, 37:115906–115939, 2024.
[38] Albert Gu, Ankit Gupta, Karan Goel, and Christopher Ré. On the parameterization and initialization of diagonal state space models. In Advances in Neural Information Processing Systems, volume 35, 2022.
[39] Benjamin Cramer, Yannik Stradmann, Johannes Schemmel, and Friedemann Zenke. The heidelberg spiking data sets for the systematic evaluation of spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020.
[40] Pete Warden. Speech commands: A dataset for limited-vocabulary speech recognition. arXiv preprint arXiv:1804.03209, 2018.
[41] M. Davies, N. Srinivasa, T. H. Lin, G. Chinya, P. Joshi, A. Lines, A. Wild, and H. Wang. Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro, PP(99):1–1, 2018. ISSN 02721732. doi: 10.1109/MM.2018.112130359

[1] [1]

The neu- robench framework for benchmarking neuromorphic computing algorithms and systems.Nature Communications, 16(1):1545, 2025

Jason Yik, Korneel Van den Berghe, Douwe den Blanken, Younes Bouhadjar, Maxime Fabre, Paul Hueber, Weijie Ke, Mina A Khoei, Denis Kleyko, Noah Pacik-Nelson, et al. The neu- robench framework for benchmarking neuromorphic computing algorithms and systems.Nature Communications, 16(1):1545, 2025

work page 2025

[2] [2]

Low-power event-based face detection with asynchronous neuromorphic hardware

Caterina Caccavella, Federico Paredes-Vallés, Marco Cannici, and Lyes Khacef. Low-power event-based face detection with asynchronous neuromorphic hardware. In 2024 International Joint Conference on Neural Networks (IJCNN) , pages 1–10, 2024. doi: 10.1109/IJCNN60899. 2024.10650843

work page doi:10.1109/ijcnn60899 2024

[3] [3]

Relu strikes back: Exploiting activation sparsity in large language models

Iman Mirzadeh, Keivan Alizadeh, Sachin Mehta, Carlo C Del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, and Mehrdad Farajtabar. Relu strikes back: Exploiting activation sparsity in large language models, 2023. URL https://arxiv.org/abs/2310.04564

work page arXiv 2023

[4] [4]

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, and Furu Wei. The era of 1-bit llms: All large language models are in 1.58 bits, 2024. URL https://arxiv.org/abs/2402.17764

work page internal anchor Pith review Pith/arXiv arXiv 2024

[5] [5]

Synaptic plasticity dynamics for deep con- tinuous local learning (decolle)

Jacques Kaiser, Hesham Mostafa, and Emre Neftci. Synaptic plasticity dynamics for deep con- tinuous local learning (decolle). Frontiers in Neuroscience, 14:424, 2020. ISSN 1662453X. doi: 10.3389/fnins.2020.00424. URL https://www.frontiersin.org/article/10.3389/ fnins.2020.00424

work page doi:10.3389/fnins.2020.00424 2020

[6] [6]

Superspike: Supervised learning in multilayer spiking neural networks

Friedemann Zenke and Surya Ganguli. Superspike: Supervised learning in multilayer spiking neural networks. Neural computation, 30(6):15141541, 2018

work page 2018

[7] [7]

Alexandre Bittar and Philip N. Garner. A surrogate gradient spiking baseline for speech command recognition. Frontiers in Neuroscience, 16, 2022. ISSN 1662453X. doi: 10.3389/ fnins.2022.865897. 10

work page arXiv 2022

[8] [8]

Stabilizing spiking neuron training

Luca Herranz-Celotti and Jean Rouat. Stabilizing spiking neuron training. arXiv preprint arXiv:2202.00282, 2022

work page arXiv 2022

[9] [9]

Efficiently modeling long sequences with structured state spaces

Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces. In ICLR 2022 - 10th International Conference on Learning Representations, 2022

work page 2022

[10] [10]

Diagonal state spaces are as effective as structured state spaces

Ankit Gupta, Albert Gu, and Jonathan Berant. Diagonal state spaces are as effective as structured state spaces. In Advances in Neural Information Processing Systems, volume 35, 2022

work page 2022

[11] [11]

Resurrecting recurrent neural networks for long sequences, 2023

Antonio Orvieto, Samuel L Smith, Albert Gu, Anushan Fernando, Caglar Gulcehre, Razvan Pascanu, and Soham De. Resurrecting Recurrent Neural Networks for Long Sequences. pages 1–30, 2023. URL http://arxiv.org/abs/2303.06349

work page arXiv 2023

[12] [12]

Recurrent neural networks: vanishing and exploding gradients are not the end of the story, 2024

Nicolas Zucchet and Antonio Orvieto. Recurrent neural networks: vanishing and exploding gradients are not the end of the story, 2024. URL https://arxiv.org/abs/2405.21064

work page arXiv 2024

[13] [13]

Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks

Emre O. Neftci, Hesham Mostafa, and Friedemann Zenke. Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine, 36(6):51–63, 2019. doi: 10.1109/MSP.2019.2931595

work page doi:10.1109/msp.2019.2931595 2019

[14] [14]

Training deep spiking neural networks using backpropagation

Jun Haeng Lee, Tobi Delbruck, and Michael Pfeiffer. Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience, 10, 2016. ISSN 1662453X. doi: 10.3389/fnins.2016.00508

work page arXiv 2016

[15] [15]

Optimal ann-snn conversion for high-accuracy and ultra-low-latency spiking neural networks

Tong Bu, Wei Fang, Jianhao Ding, Peng Lin Dai, Zhaofei Yu, and Tiejun Huang. Optimal ann-snn conversion for high-accuracy and ultra-low-latency spiking neural networks. In ICLR 2022 - 10th International Conference on Learning Representations, 2022

work page 2022

[16] [16]

Simple model of spiking neurons

E.M. Izhikevich. Simple model of spiking neurons. IEEE Transactions on Neural Networks, 14(6):1569–1572, 2003

work page 2003

[17] [17]

Adaptive exponential integrate-and-fire model as an effective description of neuronal activity

Romain Brette and Wulfram Gerstner. Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. Journal of Neurophysiology, 94, 2005. ISSN 00223077. doi: 10.1152/jn.00686.2005

work page doi:10.1152/jn.00686.2005 2005

[18] [18]

Neuronal dynamics: From single neurons to networks and models of cognition

Wulfram Gerstner, Werner M Kistler, Richard Naud, and Liam Paninski. Neuronal dynamics: From single neurons to networks and models of cognition. Cambridge University Press, 2014

work page 2014

[19] [19]

Spike frequency adaptation supports network computations on temporally dispersed information

Darjan Salaj, Anand Subramoney, Ceca Kraisnikovic, Guillaume Bellec, Robert Legenstein, and Wolfgang Maass. Spike frequency adaptation supports network computations on temporally dispersed information. eLife, 10, 2021. ISSN 2050084X. doi: 10.7554/eLife.65459

work page doi:10.7554/elife.65459 2021

[20] [20]

Bojian Yin, Federico Corradi, and Sander M. Bohté. Accurate and efficient time-domain classification with adaptive spiking recurrent neural networks. Nature Machine Intelligence, 3, 2021. ISSN 25225839. doi: 10.1038/s42256-021-00397-w

work page doi:10.1038/s42256-021-00397-w

[22] [22]

Co-learning synaptic delays, weights and adaptation in spiking neural networks, 2023

Lucas Deckers, Laurens Van Damme, Ing Jyh Tsang, Werner Van Leekwijck, and Steven Latré. Co-learning synaptic delays, weights and adaptation in spiking neural networks, 2023. URL https://arxiv.org/abs/2311.16112

work page arXiv 2023

[23] [23]

Advancing spatiotemporal processing in spiking neural networks through adaptation, 2024

Maximilian Baronig, Romain Ferrand, Silvester Sabathiel, and Robert Legenstein. Advancing spatiotemporal processing in spiking neural networks through adaptation, 2024. URL https://arxiv.org/abs/2408.07517

work page arXiv 2024

[24] [24]

Resonate-and-fire neurons

Eugene M. Izhikevich. Resonate-and-fire neurons. Neural Networks, 14, 2001. ISSN 08936080. doi: 10.1016/S0893-6080(01)00078-8

work page doi:10.1016/s0893-6080(01)00078-8 2001

[25] [25]

Deep spiking neural networks with resonate-and-fire neurons, 2021

Badr AlKhamissi, Muhammad ElNokrashy, and David Bernal-Casas. Deep spiking neural networks with resonate-and-fire neurons, 2021. URL https://arxiv.org/abs/2109.08234

work page arXiv 2021

[26] [26]

Efficient neuromorphic signal processing with resonator neurons

E. Paxon Frady, Sophia Sanborn, Sumit Bam Shrestha, Daniel Ben Dayan Rubin, Garrick Orchard, Friedrich T. Sommer, and Mike Davies. Efficient neuromorphic signal processing with resonator neurons. Journal of Signal Processing Systems, 94, 2022. ISSN 19398115. doi: 10.1007/s11265-022-01772-5

work page doi:10.1007/s11265-022-01772-5 2022

[27] [27]

Balanced resonate-and-fire neurons, 2024

Saya Higuchi, Sebastian Kairat, Sander M. Bohte, and Sebastian Otte. Balanced resonate-and-fire neurons, 2024. URL https://arxiv.org/abs/2402.14603

work page arXiv 2024

[28] [28]

Scaling up resonate-and-fire networks for fast deep learning

Thomas E Huber, Jules Lecomte, Borislav Polovnikov, and Axel von Arnim. Scaling up resonate-and-fire networks for fast deep learning. arXiv preprint arXiv:2504.00719, 2025

work page arXiv 2025

[29] [29]

Zero-shot temporal resolution domain adaptation for spiking neural networks

Sanja Karilanova, Maxime Fabre, Emre Neftci, and Ayça Özçelikkale. Zero-shot temporal resolution domain adaptation for spiking neural networks. arXiv preprint arXiv:2411.04760, 2024

work page arXiv 2024

[30] [30]

Learning delays in spiking neural networks using dilated convolutions with learnable spacings, 2023

Ilyass Hammouamri, Ismail Khalfaoui-Hassani, and Timothée Masquelier. Learning delays in spiking neural networks using dilated convolutions with learnable spacings, 2023. URL https://arxiv.org/abs/2306.17670

work page arXiv 2023

[31] [31]

Hippo: Recurrent memory with optimal polynomial projections, 2020

Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Re. Hippo: Recurrent memory with optimal polynomial projections, 2020. URL https://arxiv.org/abs/2008.07669

work page arXiv 2020

[32] [32]

Jimmy T. H. Smith, Andrew Warrington, and Scott W. Linderman. Simplified state space layers for sequence modeling, 2023. URL https://arxiv.org/abs/2208.04933

work page internal anchor Pith review Pith/arXiv arXiv 2023

[33] [33]

Long range arena: A benchmark for efficient transformers, 2021

Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, and Donald Metzler. Long range arena: A benchmark for efficient transformers. arXiv preprint arXiv:2011.04006, 2020

work page arXiv 2011

[34] [34]

P-spikessm: Harnessing probabilistic spiking state space models for long-range dependency tasks, 2024

Malyaban Bal and Abhronil Sengupta. P-spikessm: Harnessing probabilistic spiking state space models for long-range dependency tasks, 2024. URL https://arxiv.org/abs/2406.02923

work page 2024

[35] [35]

Prf: Parallel resonate and fire neuron for long sequence learning in spiking neural networks, 2024

Yulong Huang, Zunchang Liu, Changchun Feng, Xiaopeng Lin, Hongwei Ren, Haotian Fu, Yue Zhou, Hong Xing, and Bojun Cheng. Prf: Parallel resonate and fire neuron for long sequence learning in spiking neural networks, 2024. URL https://arxiv.org/abs/2410.03530

work page arXiv 2024

[36] [36]

Scalable event-by-event processing of neuromorphic sensory signals with deep state-space models, 2024

Mark Schöne, Neeraj Mohan Sushma, Jingyue Zhuge, Christian Mayr, Anand Subramoney, and David Kappel. Scalable event-by-event processing of neuromorphic sensory signals with deep state-space models, 2024. URL https://arxiv.org/abs/2404.18508

work page arXiv 2024

[37] [37]

State space models for event cameras

Nikola Zubić, Mathias Gehrig, and Davide Scaramuzza. State space models for event cameras, 2024. URL https://arxiv.org/abs/2402.15584

work page arXiv

[39] [39]

Provable benefits of complex parameterizations for structured state space models

Yuval Ran-Milo, Eden Lumbroso, Edo Cohen-Karlik, Raja Giryes, Amir Globerson, and Nadav Cohen. Provable benefits of complex parameterizations for structured state space models. Advances in Neural Information Processing Systems, 37:115906–115939, 2024

work page 2024

[40] [40]

On the parameterization and initialization of diagonal state space models

Albert Gu, Ankit Gupta, Karan Goel, and Christopher Ré. On the parameterization and initialization of diagonal state space models. In Advances in Neural Information Processing Systems, volume 35, 2022

work page 2022

[41] [41]

The heidelberg spiking data sets for the systematic evaluation of spiking neural networks

Benjamin Cramer, Yannik Stradmann, Johannes Schemmel, and Friedemann Zenke. The heidelberg spiking data sets for the systematic evaluation of spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020

work page 2020

[42] [42]

Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition

Pete Warden. Speech commands: A dataset for limited-vocabulary speech recognition. arXiv preprint arXiv:1804.03209, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[43] [43]

Loihi: A neuromorphic manycore processor with on-chip learning

M. Davies, N. Srinivasa, T. H. Lin, G. Chinya, P. Joshi, A. Lines, A. Wild, and H. Wang. Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro, PP(99):1–1, 2018. ISSN 02721732. doi: 10.1109/MM.2018.112130359

work page doi:10.1109/mm.2018.112130359 2018

A Technical Appendices and Supplementary Material

A.1 Link between eigenvalues and neuronal dynamics regimes

We give extra ins...