Advancing residual learning towards powerful deep spiking neural networks
3 Pith papers cite this work.
fields
cs.NE 3
representative citing papers
BiSpikCLM is the first fully binary spiking MatMul-free causal language model that matches ANN performance on generation tasks using only 4-6 percent of the compute via softmax-free spiking attention and spike-aware distillation.
ASN uses trainable parameters for adaptive membrane dynamics and firing in SNNs, with NASN adding normalization, and reports effectiveness across 19 vision and language datasets.
citing papers explorer
- QKFormer: Hierarchical Spiking Transformer using Q-K Attention
  A hierarchical spiking transformer using Q-K attention achieves 85.65% top-1 accuracy on ImageNet-1K, the first direct-trained SNN to exceed 85%.
- BiSpikCLM: A Spiking Language Model integrating Softmax-Free Spiking Attention and Spike-Aware Alignment Distillation
  BiSpikCLM is the first fully binary spiking MatMul-free causal language model that matches ANN performance on generation tasks using only 4-6 percent of the compute via softmax-free spiking attention and spike-aware distillation.
- Adaptive Spiking Neurons for Vision and Language Modeling
  ASN uses trainable parameters for adaptive membrane dynamics and firing in SNNs, with NASN adding normalization, and reports effectiveness across 19 vision and language datasets.