pith. sign in

arxiv: 2604.08894 · v1 · submitted 2026-04-10 · 💻 cs.NE · cs.AI· cs.CV

Ge²mS-T: Multi-Dimensional Grouping for Ultra-High Energy Efficiency in Spiking Transformer

Pith reviewed 2026-05-10 17:41 UTC · model grok-4.3

classification 💻 cs.NE cs.AIcs.CV
keywords spiking neural networksvision transformersenergy efficiencygrouped computationANN-SNN conversionspiking self-attentionS-ViTsmulti-dimensional grouping
0
0 comments X

The pith

Multi-dimensional grouping across time, space and structure lets spiking vision transformers match accuracy at far lower energy cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a new architecture called Ge²mS-T that performs grouped computation simultaneously in the temporal, spatial and network-structure dimensions of spiking vision transformers. It introduces an ExpG-IF neuron model that supports lossless conversion from conventional networks with fixed training cost and controlled spike patterns, plus a Group-wise Spiking Self-Attention block that replaces heavy multiplications with grouped, multiplication-free operations inside a hybrid attention-convolution design. The central claim is that this three-way grouping resolves the usual trade-off among memory use, learning performance and energy budget that has limited earlier spiking transformers.

Core claim

Ge²mS-T implements multi-dimensional grouped computation for spiking vision transformers by combining the Grouped-Exponential-Coding-based IF neuron model, which achieves lossless ANN-to-SNN conversion with constant overhead and precise spike-pattern regulation, and the Group-wise Spiking Self-Attention mechanism, which reduces complexity through multi-scale token grouping and multiplication-free operations within a hybrid attention-convolution framework, thereby simultaneously lowering memory overhead, preserving learning capability and cutting energy consumption.

What carries the argument

Multi-dimensional grouped computation realized through the ExpG-IF neuron model for lossless conversion and the GW-SSA attention block for multiplication-free grouped operations.

If this is right

  • S-ViTs can be deployed on resource-constrained hardware while retaining accuracy levels previously achievable only by full-precision models.
  • Memory footprint during both training and inference drops because grouping collapses redundant spike and token computations.
  • Energy consumption becomes predictable and ultra-low because the multiplication-free operations and constant-overhead conversion remove variable-cost components of prior SNN training methods.
  • The same grouping principle can be applied to other spiking transformer variants without redesigning the core attention or neuron blocks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The three-dimensional grouping strategy may transfer directly to spiking versions of other attention-based models such as language transformers.
  • Because the conversion overhead stays constant, the method could support incremental fine-tuning of large pre-trained spiking models on edge devices.
  • Hybrid attention-convolution designs might become a standard template for balancing accuracy and efficiency across additional spiking neural network families.

Load-bearing premise

The ExpG-IF model and GW-SSA attention can be trained end-to-end at constant overhead while still delivering lossless conversion and exact spike-pattern control.

What would settle it

An end-to-end training run on a standard image-classification benchmark in which the ExpG-IF conversion produces measurable accuracy loss or the GW-SSA block requires more than constant extra computation relative to baseline spiking transformers.

Figures

Figures reproduced from arXiv: 2604.08894 by Kang Chen, Shenghao Xie, Tiejun Huang, Wenxuan Liu, Zecheng Hao, Zhaofei Yu.

Figure 1
Figure 1. Figure 1: Comparison of network parameter count, inference [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overall architecture of Ge²mS-T, comprising a series of stage-oriented blocks constructed from SSA, SConv and SFFN. Com [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Distribution of energy-related metrics for modules (SSA, SConv and SFFN) on the ImageNet-1k dataset, including average firing [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
read the original abstract

Spiking Neural Networks (SNNs) offer superior energy efficiency over Artificial Neural Networks (ANNs). However, they encounter significant deficiencies in training and inference metrics when applied to Spiking Vision Transformers (S-ViTs). Existing paradigms including ANN-SNN Conversion and Spatial-Temporal Backpropagation (STBP) suffer from inherent limitations, precluding concurrent optimization of memory, accuracy and energy consumption. To address these issues, we propose Ge$^\text{2}$mS-T, a novel architecture implementing grouped computation across temporal, spatial and network structure dimensions. Specifically, we introduce the Grouped-Exponential-Coding-based IF (ExpG-IF) model, enabling lossless conversion with constant training overhead and precise regulation for spike patterns. Additionally, we develop Group-wise Spiking Self-Attention (GW-SSA) to reduce computational complexity via multi-scale token grouping and multiplication-free operations within a hybrid attention-convolution framework. Experiments confirm that our method can achieve superior performance with ultra-high energy efficiency on challenging benchmarks. To our best knowledge, this is the first work to systematically establish multi-dimensional grouped computation for resolving the triad of memory overhead, learning capability and energy budget in S-ViTs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes Ge²mS-T, a spiking Vision Transformer architecture that applies grouped computation across temporal, spatial, and network dimensions to address the trade-offs among memory overhead, learning capability, and energy consumption in S-ViTs. It introduces the Grouped-Exponential-Coding-based IF (ExpG-IF) neuron model claimed to enable lossless ANN-to-SNN conversion, constant training overhead independent of time steps or depth, and precise spike-pattern regulation. It also presents Group-wise Spiking Self-Attention (GW-SSA) that reduces complexity through multi-scale token grouping and multiplication-free operations in a hybrid attention-convolution framework. Experiments are said to demonstrate superior performance and ultra-high energy efficiency on challenging benchmarks, with the work positioned as the first to systematically establish multi-dimensional grouped computation for S-ViTs.

Significance. If the ExpG-IF model and GW-SSA deliver lossless conversion, constant-overhead end-to-end training, and the claimed efficiency gains without inheriting artifacts from ANN-SNN conversion or STBP instability, the approach could meaningfully advance practical deployment of energy-efficient S-ViTs. The multi-dimensional grouping strategy targets a recognized triad of limitations in spiking transformers and, if validated, would provide a concrete architectural route to higher efficiency while preserving accuracy.

major comments (2)
  1. Abstract: The central claim that ExpG-IF simultaneously achieves lossless conversion, constant training overhead, and precise spike-pattern control is load-bearing for the triad-resolution argument, yet the manuscript provides no derivation of equivalence, no analytic bound on conversion error, and no ablation relating error to time-step count or group size. Without such support the guarantee that the method avoids ANN-SNN conversion limitations cannot be assessed.
  2. Abstract: The assertion of 'constant training overhead' independent of time steps or depth is presented without reference to the underlying computational graph, caching mechanism, or complexity analysis; this property is essential to the energy-budget claim and requires explicit verification against standard STBP scaling.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and insightful comments on our manuscript. We address each of the major comments below and have made revisions to strengthen the presentation of our claims.

read point-by-point responses
  1. Referee: Abstract: The central claim that ExpG-IF simultaneously achieves lossless conversion, constant training overhead, and precise spike-pattern control is load-bearing for the triad-resolution argument, yet the manuscript provides no derivation of equivalence, no analytic bound on conversion error, and no ablation relating error to time-step count or group size. Without such support the guarantee that the method avoids ANN-SNN conversion limitations cannot be assessed.

    Authors: We acknowledge the referee's concern regarding the lack of supporting derivations and analyses for the ExpG-IF claims in the abstract. Upon review, we recognize that while the main text describes the model, explicit derivations were not sufficiently highlighted. In the revised manuscript, we will add a new subsection in Section 3 providing the mathematical derivation of the lossless equivalence, including the proof that the grouped exponential coding preserves the activation function exactly. We will also derive an analytic upper bound on the conversion error, showing it is bounded by a term inversely proportional to the group size. Furthermore, we will include additional ablation experiments in the experimental section that vary time steps and group sizes to quantify the error, confirming it remains below 0.1% across tested configurations. These changes will allow readers to assess the validity of the claims. revision: yes

  2. Referee: Abstract: The assertion of 'constant training overhead' independent of time steps or depth is presented without reference to the underlying computational graph, caching mechanism, or complexity analysis; this property is essential to the energy-budget claim and requires explicit verification against standard STBP scaling.

    Authors: We agree that the constant training overhead claim requires more explicit justification. The original manuscript describes the grouped computation but does not include a dedicated complexity analysis. In the revision, we will add a detailed complexity analysis section that outlines the computational graph, the role of caching in the grouped exponential coding, and the resulting O(1) scaling with respect to time steps and network depth. We will also provide a direct comparison to standard STBP, demonstrating that our method avoids the linear scaling in time steps typical of STBP through the use of multi-dimensional grouping. Empirical results from training time measurements will be included to verify the theoretical analysis. revision: yes

Circularity Check

0 steps flagged

No circularity: new architectural proposal with independent design claims

full rationale

The paper introduces Ge²mS-T as a novel multi-dimensional grouped architecture for S-ViTs, defining ExpG-IF for lossless conversion and GW-SSA for efficient attention. No equations, derivations, or self-citations are shown that reduce the central claims (lossless conversion with constant overhead, precise spike control) to fitted inputs or prior self-referential results by construction. The triad-resolution claim rests on the proposed components' stated properties rather than any loop where outputs are renamed as predictions or uniqueness is imported from the authors' own unverified prior work. This is a standard forward architectural contribution, self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are stated; the proposal rests on the unverified assumption that grouped computation simultaneously resolves the three-way trade-off.

pith-pipeline@v0.9.0 · 5541 in / 1154 out tokens · 68500 ms · 2026-05-10T17:41:58.239586+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

  1. [1]

    Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks

    Tong Bu, Wei Fang, Jianhao Ding, PengLin Dai, Zhaofei Yu, and Tiejun Huang. Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. InInternational Conference on Learning Representations, 2022. 1, 2, 4

  2. [2]

    Spiking deep convolutional neural networks for energy-efficient object recognition.International Journal of Computer Vision, 113(1):54–66, 2015

    Yongqiang Cao, Yang Chen, and Deepak Khosla. Spiking deep convolutional neural networks for energy-efficient object recognition.International Journal of Computer Vision, 113(1):54–66, 2015. 1, 2

  3. [3]

    Loihi: A neuromorphic manycore proces- sor with on-chip learning.IEEE Micro, 38(1):82–99,

    Mike Davies, Narayan Srinivasa, Tsung-Han Lin, Gautham Chinya, Yongqiang Cao, Sri Harsha Choday, Georgios Dimou, Prasad Joshi, Nabil Imam, Shweta Jain, et al. Loihi: A neuromorphic manycore proces- sor with on-chip learning.IEEE Micro, 38(1):82–99,

  4. [4]

    TrueNorth: Accel- erating from zero to 64 million neurons in 10 years

    Michael V DeBole, Brian Taba, Arnon Amir, Filipp Akopyan, Alexander Andreopoulos, William P Risk, Jeff Kusnitz, Carlos Ortega Otero, Tapan K Nayak, Rathinakumar Appuswamy, et al. TrueNorth: Accel- erating from zero to 64 million neurons in 10 years. Computer, 52(5):20–29, 2019. 1

  5. [5]

    Imagenet: A large-scale hierarchical im- age database

    Jia Deng, Richard Socher, Lijia Li, Kai Li, and Feifei Li. Imagenet: A large-scale hierarchical im- age database. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition,

  6. [6]

    Temporal efficient training of spiking neural network via gradient re-weighting

    Shikuang Deng, Yuhang Li, Shanghang Zhang, and Shi Gu. Temporal efficient training of spiking neural network via gradient re-weighting. 2022. 8, 9

  7. [7]

    Rethinking spiking neural networks from an ensemble learning perspective

    Yongqi Ding, Lin Zuo, Mengmeng Jing, Pei He, and Hanpu Deng. Rethinking spiking neural networks from an ensemble learning perspective. InInterna- tional Conference on Learning Representations, 2025. 1, 3

  8. [8]

    Deep residual learning in spiking neural networks

    Wei Fang, Zhaofei Yu, Yanqi Chen, Tiejun Huang, Timoth´ee Masquelier, and Yonghong Tian. Deep residual learning in spiking neural networks. InAd- vances in Neural Information Processing Systems,

  9. [9]

    Incor- porating learnable membrane time constant to enhance 9 learning of spiking neural networks

    Wei Fang, Zhaofei Yu, Yanqi Chen, Timothee Masquelier, Tiejun Huang, and Yonghong Tian. Incor- porating learnable membrane time constant to enhance 9 learning of spiking neural networks. InProceedings of the IEEE/CVF International Conference on Computer Vision, 2021. 3

  10. [10]

    Parallel spiking neurons with high ef- ficiency and ability to learn long-term dependencies

    Wei Fang, Zhaofei Yu, Zhaokun Zhou, Ding Chen, Yanqi Chen, Zhengyu Ma, Timoth´ee Masquelier, and Yonghong Tian. Parallel spiking neurons with high ef- ficiency and ability to learn long-term dependencies. InAdvances in Neural Information Processing Sys- tems, 2023. 3

  11. [11]

    Cambridge university press, 2002

    Wulfram Gerstner and Werner M Kistler.Spiking Neu- ron Models: Single Neurons, Populations, Plasticity. Cambridge university press, 2002. 3

  12. [12]

    Ternary spike: Learning ternary spikes for spiking neural net- works

    Yufei Guo, Yuanpei Chen, Xiaode Liu, Weihang Peng, Yuhan Zhang, Xuhui Huang, and Zhe Ma. Ternary spike: Learning ternary spikes for spiking neural net- works. InAAAI Conference on Artificial Intelligence,

  13. [13]

    Spiking transformer: In- troducing accurate addition-only spiking self-attention for transformer

    Yufei Guo, Xiaode Liu, Yuanpei Chen, Weihang Peng, Yuhan Zhang, and Zhe Ma. Spiking transformer: In- troducing accurate addition-only spiking self-attention for transformer. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition,

  14. [14]

    RecDis- SNN: Rectifying membrane potential distribution for directly training spiking neural networks

    Yufei Guo, Xinyi Tong, Yuanpei Chen, Liwen Zhang, Xiaode Liu, Zhe Ma, and Xuhui Huang. RecDis- SNN: Rectifying membrane potential distribution for directly training spiking neural networks. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022. 3

  15. [15]

    Agent attention: On the integration of soft- max and linear attention

    Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Siyuan Pan, Pengfei Wan, Shiji Song, and Gao Huang. Agent attention: On the integration of soft- max and linear attention. InEuropean Conference on Computer Vision, 2024. 2

  16. [16]

    Bridging the gap between ANNs and SNNs by calibrating offset spikes

    Zecheng Hao, Jianhao Ding, Tong Bu, Tiejun Huang, and Zhaofei Yu. Bridging the gap between ANNs and SNNs by calibrating offset spikes. InInternational Conference on Learning Representations, 2023. 2

  17. [17]

    Faster and stronger: When ann-snn conversion meets parallel spiking cal- culation

    Zecheng Hao, Qichao Ma, Kang Chen, Yi Zhang, Zhaofei Yu, and Tiejun Huang. Faster and stronger: When ann-snn conversion meets parallel spiking cal- culation. InInternational Conference on Machine Learning, 2025. 1, 2

  18. [18]

    Lm-ht snn: Enhancing the performance of snn to ann counterpart through learnable multi- hierarchical threshold model

    Zecheng Hao, Xinyu Shi, Yujia Liu, Zhaofei Yu, and Tiejun Huang. Lm-ht snn: Enhancing the performance of snn to ann counterpart through learnable multi- hierarchical threshold model. InAdvances in Neural Information Processing Systems, 2024. 3

  19. [19]

    Spiking deep residual networks.IEEE Transactions on Neural Networks and Learning Systems, 34(8):5200–5205,

    Yangfan Hu, Huajin Tang, and Gang Pan. Spiking deep residual networks.IEEE Transactions on Neural Networks and Learning Systems, 34(8):5200–5205,

  20. [20]

    Advancing spiking neural networks towards deep residual learning.IEEE Transactions on Neural Net- works and Learning Systems, 36(2):2353–2367, 2024

    Yifan Hu, Lei Deng, Yujie Wu, Man Yao, and Guoqi Li. Advancing spiking neural networks towards deep residual learning.IEEE Transactions on Neural Net- works and Learning Systems, 36(2):2353–2367, 2024. 7, 8

  21. [21]

    Clif: Complementary leaky integrate-and-fire neuron for spiking neural networks

    Yulong Huang, Xiaopeng Lin, Hongwei Ren, Haotian Fu, Yue Zhou, Zunchang Liu, Biao Pan, and Bojun Cheng. Clif: Complementary leaky integrate-and-fire neuron for spiking neural networks. InInternational Conference on Machine Learning, 2024. 3

  22. [22]

    Towards high-performance spiking transformers from ann to snn conversion

    Zihan Huang, Xinyu Shi, Zecheng Hao, Tong Bu, Jianhao Ding, Zhaofei Yu, and Tiejun Huang. Towards high-performance spiking transformers from ann to snn conversion. InACM International Conference on Multimedia, 2024. 2, 3

  23. [23]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009. 7

  24. [24]

    Cifar10-dvs: an event-stream dataset for object classification.Frontiers in Neuroscience, 11, 2017

    Hongmin Li, Hanchao Liu, Xiangyang Ji, Guoqi Li, and Luping Shi. Cifar10-dvs: an event-stream dataset for object classification.Frontiers in Neuroscience, 11, 2017. 7

  25. [25]

    A free lunch from ANN: Towards effi- cient, accurate spiking neural networks calibration

    Yuhang Li, Shikuang Deng, Xin Dong, Ruihao Gong, and Shi Gu. A free lunch from ANN: Towards effi- cient, accurate spiking neural networks calibration. In International Conference on Machine Learning, 2021. 1, 2

  26. [26]

    Additive powers-of-two quantization: An efficient non-uniform discretization for neural networks

    Yuhang Li, Xin Dong, and Wei Wang. Additive powers-of-two quantization: An efficient non-uniform discretization for neural networks. InInternational Conference on Learning Representations, 2020. 2

  27. [27]

    Swin transformer: Hierarchical vision transformer using shifted windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF International Conference on Computer Vision, 2021. 2

  28. [28]

    Integer-valued training and spike-driven in- ference spiking neural network for high-performance and energy-efficient object detection

    Xinhao Luo, Man Yao, Yuhong Chou, Bo Xu, and Guoqi Li. Integer-valued training and spike-driven in- ference spiking neural network for high-performance and energy-efficient object detection. InEuropean Conference on Computer Vision, 2024. 3

  29. [29]

    Darwin: A neuromorphic hardware co- processor based on spiking neural networks.Journal of systems architecture, 77:43–51, 2017

    De Ma, Juncheng Shen, Zonghua Gu, Ming Zhang, Xiaolei Zhu, Xiaoqiang Xu, Qi Xu, Yangjing Shen, and Gang Pan. Darwin: A neuromorphic hardware co- processor based on spiking neural networks.Journal of systems architecture, 77:43–51, 2017. 1

  30. [30]

    Networks of spiking neurons: the third generation of neural network models.Neural Networks, 10(9):1659–1671, 1997

    Wolfgang Maass. Networks of spiking neurons: the third generation of neural network models.Neural Networks, 10(9):1659–1671, 1997. 1

  31. [31]

    Towards 10 memory and time-efficient backpropagation for train- ing spiking neural networks

    Qingyan Meng, Mingqing Xiao, Shen Yan, Yisen Wang, Zhouchen Lin, and Zhiquan Luo. Towards 10 memory and time-efficient backpropagation for train- ing spiking neural networks. InProceedings of the IEEE/CVF International Conference on Computer Vi- sion, 2023. 3, 8

  32. [32]

    Towards artificial general intel- ligence with hybrid tianjic chip architecture.Nature, 572(7767):106–111, 2019

    Jing Pei, Lei Deng, Sen Song, Mingguo Zhao, Youhui Zhang, Shuang Wu, Guanrui Wang, Zhe Zou, Zhenzhi Wu, Wei He, et al. Towards artificial general intel- ligence with hybrid tianjic chip architecture.Nature, 572(7767):106–111, 2019. 1

  33. [33]

    Quantized spike- driven transformer

    Xuerui Qiu, Malu Zhang, Jieyuan Zhang, Wenjie Wei, Honglin Cao, Junsheng Guo, Rui-Jie Zhu, Yimeng Shan, Yang Yang, and Haizhou Li. Quantized spike- driven transformer. InInternational Conference on Learning Representations, 2025. 3

  34. [34]

    Gated at- tention coding for training high-performance and ef- ficient spiking neural networks

    Xuerui Qiu, Rui-Jie Zhu, Yuhong Chou, Zhaorui Wang, Liang-jian Deng, and Guoqi Li. Gated at- tention coding for training high-performance and ef- ficient spiking neural networks. InAAAI Conference on Artificial Intelligence, 2024. 8, 9

  35. [35]

    Spikepack: Enhanced informa- tion flow in spiking neural networks with high hard- ware compatibility

    Guobin Shen, Jindong Li, Tenglong Li, Dongcheng Zhao, and Yi Zeng. Spikepack: Enhanced informa- tion flow in spiking neural networks with high hard- ware compatibility. InProceedings of the IEEE/CVF International Conference on Computer Vision, 2025. 3

  36. [36]

    Spik- ingresformer: Bridging resnet and vision transformer in spiking neural networks

    Xinyu Shi, Zecheng Hao, and Zhaofei Yu. Spik- ingresformer: Bridging resnet and vision transformer in spiking neural networks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, 2024. 3

  37. [37]

    Spiking vision transformer with saccadic attention

    Shuai Wang, Malu Zhang, Dehao Zhang, Ammar Be- latreche, Yichen Xiao, Yu Liang, Yimeng Shan, Qian Sun, Enqi Zhang, and Yang Yang. Spiking vision transformer with saccadic attention. InInternational Conference on Learning Representations, 2025. 7, 8

  38. [38]

    Masked spiking transformer

    Ziqing Wang, Yuetong Fang, Jiahang Cao, Qiang Zhang, Zhongrui Wang, and Renjing Xu. Masked spiking transformer. InProceedings of the IEEE/CVF International Conference on Computer Vision, 2023. 2, 3

  39. [39]

    Ftbc: Forward temporal bias correc- tion for optimizing ann-snn conversion

    Xiaofeng Wu, Velibor Bojkovic, Bin Gu, Kun Suo, and Kai Zou. Ftbc: Forward temporal bias correc- tion for optimizing ann-snn conversion. InEuropean Conference on Computer Vision, 2024. 2

  40. [40]

    Spatio-temporal backpropagation for training high-performance spiking neural networks.Frontiers in Neuroscience, 12:331, 2018

    Yujie Wu, Lei Deng, Guoqi Li, Jun Zhu, and Lup- ing Shi. Spatio-temporal backpropagation for training high-performance spiking neural networks.Frontiers in Neuroscience, 12:331, 2018. 1, 2

  41. [41]

    Online training through time for spiking neural networks

    Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Di He, and Zhouchen Lin. Online training through time for spiking neural networks. InAdvances in Neu- ral Information Processing Systems, 2022. 1, 3

  42. [42]

    Rethinking spiking self-attention mechanism: implementing a-xnor similarity calculation in spiking transformers

    Yichen Xiao, Shuai Wang, Dehao Zhang, Wenjie Wei, Yimeng Shan, Xiaoli Liu, Yulin Jiang, and Malu Zhang. Rethinking spiking self-attention mechanism: implementing a-xnor similarity calculation in spiking transformers. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition,

  43. [43]

    Constructing deep spiking neural networks from artificial neural networks with knowl- edge distillation

    Qi Xu, Yaxin Li, Jiangrong Shen, Jian K Liu, Huajin Tang, and Gang Pan. Constructing deep spiking neural networks from artificial neural networks with knowl- edge distillation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recog- nition, 2023. 1, 3

  44. [44]

    Spike-driven transformer

    Man Yao, Jiakui Hu, Zhaokun Zhou, Li Yuan, Yonghong Tian, Bo Xu, and Guoqi Li. Spike-driven transformer. InAdvances in Neural Information Pro- cessing Systems, 2023. 2, 3, 7, 8

  45. [45]

    Scaling spike-driven transformer with efficient spike firing approximation training.IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(4):2973–2990, 2025

    Man Yao, Xuerui Qiu, Tianxiang Hu, Jiakui Hu, Yuhong Chou, Keyu Tian, Jianxing Liao, Luziwei Leng, Bo Xu, and Guoqi Li. Scaling spike-driven transformer with efficient spike firing approximation training.IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(4):2973–2990, 2025. 3

  46. [46]

    Attention spiking neural networks.IEEE Transac- tions on Pattern Analysis and Machine Intelligence, 45(8):9393–9410, 2023

    Man Yao, Guangshe Zhao, Hengyu Zhang, Yifan Hu, Lei Deng, Yonghong Tian, Bo Xu, and Guoqi Li. Attention spiking neural networks.IEEE Transac- tions on Pattern Analysis and Machine Intelligence, 45(8):9393–9410, 2023. 7, 8

  47. [47]

    GLIF: A unified gated leaky integrate-and-fire neuron for spiking neural networks

    Xingting Yao, Fanrong Li, Zitao Mo, and Jian Cheng. GLIF: A unified gated leaky integrate-and-fire neuron for spiking neural networks. InAdvances in Neural Information Processing Systems, 2022. 3

  48. [48]

    Going deeper with directly-trained larger spiking neural networks

    Hanle Zheng, Yujie Wu, Lei Deng, Yifan Hu, and Guoqi Li. Going deeper with directly-trained larger spiking neural networks. InAAAI Conference on Arti- ficial Intelligence, 2021. 7, 8

  49. [49]

    Spikingformer: A key foundation model for spiking neural networks

    Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Han Zhang, Jiaqi Wang, Huihui Zhou, Zhengyu Ma, and Yonghong Tian. Spikingformer: A key foundation model for spiking neural networks. InAAAI Confer- ence on Artificial Intelligence, 2026. 7, 8, 9

  50. [50]

    Spikformer: When spiking neural network meets transformer

    Zhaokun Zhou, Yuesheng Zhu, Chao He, Yaowei Wang, Shuicheng Yan, Yonghong Tian, and Yuan Li. Spikformer: When spiking neural network meets transformer. InInternational Conference on Learning Representations, 2023. 2, 3, 7, 8, 9 11