Spikingformer: Spike-driven residual learning for transformer-based spiking neural network
5 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (5)
Verdicts: UNVERDICTED (5)
citing papers explorer
- Elastic Spiking Transformers for Efficient Gesture Understanding
  A single Elastic Spiking Transformer model dynamically slices network width and attention heads at runtime via granularity-aware weight sharing, matching or exceeding fixed baselines on CIFAR and gesture datasets while reducing spike operations.
- Winner-Take-All Spiking Transformer for Language Modeling
  Winner-take-all spiking self-attention replaces softmax in spiking transformers to support language modeling on 16 datasets with spike-driven, energy-efficient architectures.
- Vision SmolMamba: Spike-Guided Token Pruning for Energy-Efficient Spiking State-Space Vision Models
  Vision SmolMamba adds spike-guided spatio-temporal token pruning to a bidirectional spiking state-space backbone, cutting estimated energy by at least 1.5x versus prior spiking Transformers and Spiking Mamba variants on ImageNet-1K and event-based datasets while keeping competitive accuracy.
- BiSpikCLM: A Spiking Language Model integrating Softmax-Free Spiking Attention and Spike-Aware Alignment Distillation
  BiSpikCLM is the first fully binary spiking MatMul-free causal language model that matches ANN performance on generation tasks using only 4-6 percent of the compute via softmax-free spiking attention and spike-aware distillation.
- Breaking Global Self-Attention Bottlenecks in Transformer-based Spiking Neural Networks with Local Structure-Aware Self-Attention
  LSFormer uses local structure-aware spiking self-attention and spiking response pooling to cut global attention bottlenecks, delivering 4.3% and 8.6% accuracy gains on Tiny-ImageNet and N-CALTECH101 over prior transformer-based SNNs.