Temporal-Aware Spiking Transformer Hashing Based on 3D-DWT
Pith reviewed 2026-05-23 05:15 UTC · model grok-4.3
The pith
Spikinghash uses a 3D wavelet mixer and membrane-potential loss in spiking transformers to produce efficient hash codes for dynamic vision sensor data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Spikinghash is a hierarchical supervised hashing architecture for spiking neural networks (SNNs). In shallow layers, the Spiking WaveMixer applies a multilevel 3D-DWT to separate spatiotemporal features into low- and high-frequency components, then performs spectral fusion to capture temporal dependencies and local spatial structure. Deeper layers employ Spiking Self-Attention to extract global spatiotemporal information. The final hash layer integrates membrane activity across multiple time steps to produce binary hash codes. A dynamic soft similarity loss constructs a learnable similarity matrix from membrane potentials to serve as soft labels, thereby compensating for information loss in SNNs and improving retrieval performance.
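The frequency decoupling described above can be illustrated with a single-level 3D Haar transform. This is a minimal sketch under stated assumptions, not the paper's implementation: the Haar basis, the single decomposition level, the (T, H, W) axis order, and the function name haar_dwt3d are all choices made here for illustration. The summary does not specify the wavelet family or the number of levels.

```python
import math
from itertools import product


def haar_dwt3d(x):
    """Single-level orthonormal 3D Haar DWT of a nested list x[t][h][w].

    Assumption: all three dimensions are even. Returns a dict of eight
    subbands keyed by three letters (T, H, W order), where "L" is the
    low-pass (sum) filter and "H" is the high-pass (difference) filter.
    """
    T, H, W = len(x), len(x[0]), len(x[0][0])
    if T % 2 or H % 2 or W % 2:
        raise ValueError("all dimensions must be even")
    # Each axis contributes a factor 1/sqrt(2), so three axes give 1/(2*sqrt(2)).
    scale = 1.0 / (2.0 * math.sqrt(2.0))
    bands = {}
    for band in product("LH", repeat=3):
        out = [[[0.0] * (W // 2) for _ in range(H // 2)] for _ in range(T // 2)]
        for t, h, w in product(range(T // 2), range(H // 2), range(W // 2)):
            acc = 0.0
            # Combine the 2x2x2 block of neighbours that feeds this coefficient.
            for dt, dh, dw in product((0, 1), repeat=3):
                sign = 1
                for f, d in zip(band, (dt, dh, dw)):
                    # High-pass subtracts the odd-indexed sample along its axis.
                    if f == "H" and d == 1:
                        sign = -sign
                acc += sign * x[2 * t + dt][2 * h + dh][2 * w + dw]
            out[t][h][w] = acc * scale
        bands["".join(band)] = out
    return bands
```

In this sketch a volume that is constant in space but flips sign between two time steps puts all of its energy into the "HLL" band, which is the kind of temporal change a high-frequency temporal subband isolates. A multilevel transform would apply the same function again to the "LLL" band. How the paper fuses the resulting subbands is not described in this summary and is not sketched here.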
What carries the argument
Spiking WaveMixer (SWM) that performs multilevel 3D-DWT feature decoupling and spectral fusion, placed hierarchically with Spiking Self-Attention and a membrane-potential-based dynamic soft similarity loss.
If this is right
- Hash codes preserve the distance relationships present in the original DVS data.
- Energy consumption and parameter count remain lower than conventional deep-learning hashing methods.
- State-of-the-art retrieval accuracy is reached on multiple DVS datasets.
- The binary nature of spikes directly supplies the final hash codes without additional binarization steps.
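Retrieval with binary hash codes typically ranks database items by Hamming distance to the query code. The sketch below shows only that ranking step in plain Python. The function names and the toy codes are hypothetical, and nothing here is taken from the paper's code.

```python
def hamming(a, b):
    """Number of positions at which two equal-length 0/1 codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query, database):
    """Return database indices ordered by increasing Hamming distance.

    sorted() is stable, so items at the same distance keep their
    original database order.
    """
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Toy 4-bit codes (illustrative values only).
query = [1, 0, 1, 1]
database = [[0, 1, 0, 0], [1, 0, 1, 0], [1, 0, 1, 1]]
ranking = rank_by_hamming(query, database)
```

If the hash codes preserve distance relationships as claimed, items from the query's class should appear near the front of such a ranking.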
Where Pith is reading between the lines
- The 3D-DWT decoupling step could be tested in other spiking architectures that process event streams.
- The membrane-potential loss formulation might transfer to spiking models for tasks beyond hashing, such as classification or clustering.
- Lower energy and parameter counts suggest possible deployment on resource-constrained edge hardware that receives DVS input directly.
Load-bearing premise
The specific combination of 3D-DWT decoupling, spectral fusion, and membrane-potential soft similarity loss will reliably offset information loss inside spiking networks and produce higher-quality hash codes.
What would settle it
An ablation that disables the 3D-DWT component or the membrane-potential loss and measures whether mean average precision on the evaluated DVS datasets falls below the best non-spiking hashing baseline.
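The comparison proposed above depends on how mean average precision is computed. The following is a minimal sketch of one common definition, assuming each query yields a full ranked list of binary relevance flags and that average precision is normalized by the number of relevant items in that list. The mAP variant used in the paper (for example a top-K cutoff) is not stated in this summary, and the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance is a sequence of truthy/falsy flags, ordered from
    the best-ranked retrieved item to the worst.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item occurs.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_relevance):
    """Mean of per-query average precision; 0.0 for an empty query set."""
    if not all_ranked_relevance:
        return 0.0
    total = sum(average_precision(r) for r in all_ranked_relevance)
    return total / len(all_ranked_relevance)
```

An ablation would compute this value for the full model, for each variant with a component disabled, and for the non-spiking baselines under the same ranking protocol.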
Original abstract
With the rapid growth of dynamic vision sensor (DVS) data, constructing a low-energy, efficient data retrieval system has become an urgent task. Hash learning is one of the most important retrieval technologies which can keep the distance between hash codes consistent with the distance between DVS data. As spiking neural networks (SNNs) can encode information through spikes, they demonstrate great potential in promoting energy efficiency. Based on the binary characteristics of SNNs, we first propose a novel supervised hashing method named Spikinghash with a hierarchical lightweight structure. Spiking WaveMixer (SWM) is deployed in shallow layers, utilizing a multilevel 3D discrete wavelet transform (3D-DWT) to decouple spatiotemporal features into various low-frequency and high frequency components, and then employing efficient spectral feature fusion. SWM can effectively capture the temporal dependencies and local spatial features. Spiking Self-Attention (SSA) is deployed in deeper layers to further extract global spatiotemporal information. We also design a hash layer utilizing binary characteristic of SNNs, which integrates information over multiple time steps to generate final hash codes. Furthermore, we propose a new dynamic soft similarity loss for SNNs, which utilizes membrane potentials to construct a learnable similarity matrix as soft labels to fully capture the similarity differences between classes and compensate information loss in SNNs, thereby improving retrieval performance. Experiments on multiple datasets demonstrate that Spikinghash can achieve state-of-the-art results with low energy consumption and fewer parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Spikinghash, a supervised hashing method for dynamic vision sensor (DVS) data based on spiking neural networks. It employs a hierarchical architecture with Spiking WaveMixer (SWM) modules in shallow layers that apply multilevel 3D discrete wavelet transform (3D-DWT) for spatiotemporal feature decoupling into low- and high-frequency components followed by spectral fusion, Spiking Self-Attention (SSA) in deeper layers for global information, a membrane-potential hash layer that integrates spikes over time steps to produce binary codes, and a dynamic soft similarity loss that constructs learnable similarity matrices from membrane potentials to serve as soft labels. The central claim is that this combination achieves state-of-the-art retrieval performance on multiple datasets while maintaining low energy consumption and fewer parameters compared to existing methods.
Significance. If the experimental results hold, the work offers a meaningful contribution to energy-efficient content-based retrieval for event-based vision by integrating wavelet-based feature processing, spiking attention, and a membrane-potential loss within an SNN framework. The introduction of SWM and the dynamic soft similarity loss represent concrete attempts to address information loss typical in spiking networks for hashing tasks. The focus on practical metrics such as energy and parameter count aligns with deployment needs for DVS data. The manuscript supplies sufficient architectural and implementation detail to support reproducibility in principle.
minor comments (3)
- [Abstract] Abstract: The claim of state-of-the-art results would be strengthened by briefly naming the datasets and reporting the magnitude of improvements (e.g., mAP gains) rather than leaving the assertion unsupported in the abstract alone.
- [§4] §4 (Experiments): Confirm that all reported results include standard deviations across multiple runs and explicit baseline implementations with matching training protocols to ensure fair comparison.
- [§3.4] Notation: Define the precise mathematical form of the dynamic soft similarity loss (including how membrane potentials are mapped to the similarity matrix) in a dedicated equation to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation of our manuscript and the recommendation for minor revision. The recognition of the contributions of Spikinghash, including the Spiking WaveMixer, Spiking Self-Attention, and dynamic soft similarity loss for energy-efficient DVS retrieval, is appreciated. No specific major comments were listed in the report.
Circularity Check
No significant circularity; method is empirical construction without self-referential derivations
The paper presents an empirical architecture (SWM with 3D-DWT, SSA, membrane-potential hash layer, dynamic soft similarity loss) and reports experimental SOTA results on retrieval metrics, energy, and parameters. No equations, derivations, or first-principles claims appear in the provided text that reduce performance to quantities defined by the method's own fitted parameters or self-citations. The central claim rests on reproducible implementation details and external dataset benchmarks rather than any internal reduction by construction. No self-definitional, fitted-input, or uniqueness-imported steps are identifiable.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Spiking neural networks encode information through spikes and possess binary characteristics suitable for generating hash codes.
invented entities (2)
- Spiking WaveMixer (SWM): no independent evidence
- dynamic soft similarity loss: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/DimensionForcing.lean: reality_from_one_distinction (8-tick period). Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We uniformly use the LIF model... time step of the spiking neuron is 16... timestep is set to 4... time steps to 8
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (D=3). Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Spiking WaveMixer (SWM) ... multilevel 3D discrete wavelet transform (3D-DWT) to decouple spatiotemporal features
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (J-cost). Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: dynamic soft similarity loss ... membrane potentials to construct a learnable similarity matrix
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.