Neural Architecture Search of Time-to-First-Spike-Coded Spiking Neural Networks for Efficient Eye-based Emotion Recognition
Pith reviewed 2026-05-17 03:09 UTC · model grok-4.3
The pith
A neural architecture search framework discovers time-to-first-spike spiking neural networks that recognize eye-based emotions accurately while using far less energy on neuromorphic hardware.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper presents TNAS-ER as the first neural architecture search framework for TTFS-coded SNNs in eye-based emotion recognition. It uses an ANN-assisted strategy with an evolutionary algorithm and recall-based fitness to discover architectures that deliver high recognition performance with significantly improved efficiency, and an evaluation on neuromorphic hardware is reported to confirm superior energy efficiency.
What carries the argument
TNAS-ER, which employs an evolutionary algorithm guided by a ReLU-based ANN counterpart to optimize TTFS SNN architectures, with weighted and unweighted average recall as fitness objectives.
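The evolutionary search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the search space (a list of channel widths), the mutation rule, the truncation selection, and `toy_fitness` are all hypothetical. In TNAS-ER the fitness would instead be derived from the weighted and unweighted average recall of a trained candidate.

```python
import random

# Hypothetical search space: an architecture is a list of channel widths.
CHOICES = [8, 16, 32, 64]

def random_arch(depth=4, rng=random):
    """Sample a random architecture from the assumed search space."""
    return [rng.choice(CHOICES) for _ in range(depth)]

def mutate(arch, rng=random, rate=0.25):
    """Resample each layer width independently with probability `rate`."""
    return [rng.choice(CHOICES) if rng.random() < rate else w for w in arch]

def evolve(fitness, generations=10, pop_size=8, n_parents=4, seed=0):
    """Truncation-selection evolutionary loop with elitism.

    `fitness` maps an architecture to a scalar (higher is better).
    Parents survive unchanged, so the best fitness never decreases.
    """
    rng = random.Random(seed)
    population = [random_arch(rng=rng) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = [mutate(rng.choice(parents), rng=rng)
                    for _ in range(pop_size - n_parents)]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(arch):
    """Placeholder objective (assumption): prefers widths close to 32.

    A real run would train the candidate and return a recall-based score.
    """
    return -sum(abs(w - 32) for w in arch)

best = evolve(toy_fitness)
```

Keeping the parents in the next generation is a design choice made here for simplicity; it guarantees that the best candidate found so far is never lost.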
If this is right
- TNAS-ER networks achieve high recognition performance for eye-based emotions.
- The searched architectures have significantly improved efficiency over previous approaches.
- Deployment on neuromorphic hardware confirms superior energy efficiency.
- The ANN-assisted search stabilizes the optimization of the spiking networks.
Where Pith is reading between the lines
- This search strategy might apply to other tasks requiring precise spike timing, such as audio or motion recognition on edge devices.
- Future work could test if these architectures work across different hardware platforms beyond the one evaluated.
- Combining this with other training methods for SNNs could further boost performance without manual tuning.
- Scaling the evolutionary search to larger datasets may reveal more general design principles for efficient SNNs.
Load-bearing premise
The approach assumes that a conventional ReLU neural network can reliably guide the search for optimal spiking network structures without overlooking timing details specific to eye-based emotion data.
What would settle it
If experiments without the ANN guidance produce networks with similar or better performance and efficiency, or if the found networks show no energy advantage on neuromorphic hardware, the value of the assisted search would be called into question.
Original abstract
Eye-based emotion recognition enables eyewear devices to perceive users' emotional states and support emotion-aware interaction. However, deploying such functionality on their resource-limited embedded hardware remains challenging. Time-to-first-spike (TTFS)-coded spiking neural networks (SNNs) offer a promising solution due to their extremely sparse and energy-efficient computation, where each neuron emits at most one binary spike. While prior works have primarily focused on improving TTFS SNN training algorithms, the role of network architecture has been largely overlooked. This is particularly critical, as spike timing in TTFS SNNs is tightly coupled with architectural design, and eye-based emotion recognition requires compact yet highly efficient networks. In this paper, we propose TNAS-ER, the first neural architecture search (NAS) framework tailored to TTFS SNNs for eye-based emotion recognition. TNAS-ER presents a novel ANN-assisted search strategy that leverages a ReLU-based ANN counterpart to guide architecture optimization and stabilize training of the TTFS SNN. TNAS-ER employs an evolutionary algorithm, with weighted and unweighted average recall jointly defined as fitness objectives for emotion recognition. Extensive experiments demonstrate that TNAS-ER achieves high recognition performance with significantly improved efficiency. Furthermore, we evaluate TNAS-ER on neuromorphic hardware, confirming its superior energy efficiency and strong potential for real-world applications.
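To make the "at most one binary spike per neuron" property concrete, here is a small sketch of a TTFS-style encoder. It assumes a simple linear latency code in which stronger inputs fire earlier and zero inputs never fire; the function names, the `t_max` parameter, and the linear mapping are illustrative assumptions and may differ from the coding actually used in the paper.

```python
import math

def ttfs_encode(values, t_max=1.0):
    """Map normalized intensities in [0, 1] to first-spike times.

    Assumed linear latency code: a larger value spikes earlier,
    and a zero value never spikes (represented by math.inf).
    """
    times = []
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError("expected values normalized to [0, 1]")
        times.append(t_max * (1.0 - v) if v > 0.0 else math.inf)
    return times

def spike_count(times):
    """Count emitted spikes; each neuron contributes at most one."""
    return sum(1 for t in times if math.isfinite(t))

times = ttfs_encode([1.0, 0.5, 0.0])
print(times, spike_count(times))
```

Because each neuron contributes at most one spike, the spike count is bounded by the number of neurons, which is the source of the sparsity the abstract refers to.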
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes TNAS-ER, the first neural architecture search framework for Time-to-First-Spike (TTFS) coded Spiking Neural Networks (SNNs) applied to eye-based emotion recognition. It introduces an ANN-assisted evolutionary search strategy that uses a ReLU-based ANN counterpart to guide architecture optimization and stabilize TTFS SNN training, with fitness defined via weighted and unweighted average recall. Experiments are reported to show high recognition performance, improved efficiency, and superior energy efficiency when evaluated on neuromorphic hardware.
Significance. If the central experimental claims hold, the work would be significant for enabling compact, low-power emotion recognition on embedded eyewear devices by addressing the overlooked role of architecture in TTFS SNNs. Strengths include the explicit tailoring of NAS to the TTFS regime, the multi-objective fitness formulation, and direct neuromorphic hardware validation, which together provide a concrete path toward real-world deployment of sparse spiking models.
major comments (2)
- [Methods / ANN-assisted search strategy] The ANN-assisted search strategy (described in the methods) relies on a ReLU ANN surrogate lacking any temporal dimension to guide TTFS SNN optimization. This raises a correctness risk for the central claim: because eye-based emotion recognition depends on precise first-spike timing, it is unclear whether the discovered architectures genuinely exploit TTFS dynamics or merely perform well after conversion; an ablation comparing native TTFS performance against post-conversion accuracy would be needed to confirm the search preserves timing-sensitive motifs.
- [Experiments / Results] The experimental section reports high recognition performance and efficiency gains but provides limited quantitative detail on baselines, absolute metrics (e.g., accuracy, energy per inference), and statistical significance of improvements over prior TTFS or ANN approaches. Without these, the claim of “significantly improved efficiency” and “superior energy efficiency” cannot be fully assessed as load-bearing support for the framework’s advantage.
minor comments (2)
- [Problem formulation] Notation for the fitness objectives (weighted vs. unweighted average recall) should be defined explicitly with equations to avoid ambiguity when readers compare against standard multi-class metrics.
- [Figures] Figure captions and axis labels on neuromorphic hardware results could be expanded to include exact energy figures and comparison models for immediate readability.
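The notation request for the fitness objectives can be illustrated with the definitions commonly used in emotion recognition, which are assumed here because the paper's exact formulation is not reproduced on this page. Weighted average recall (WAR) weights each class recall by its share of samples, which equals overall accuracy. Unweighted average recall (UAR) is the plain mean of per-class recalls. The function name `war_uar` is hypothetical.

```python
from collections import Counter

def war_uar(y_true, y_pred):
    """Return (WAR, UAR) for paired label sequences.

    WAR: per-class recall weighted by class support (overall accuracy).
    UAR: unweighted mean of per-class recalls.
    Classes are taken from y_true, so every class has support >= 1.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = {c: correct[c] / n for c, n in support.items()}
    war = sum(recalls[c] * support[c] for c in support) / len(y_true)
    uar = sum(recalls.values()) / len(recalls)
    return war, uar

# Imbalanced toy labels (made up for illustration only).
war, uar = war_uar([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0])
print(round(war, 4), round(uar, 4))
```

On imbalanced data the two values diverge, which is why reporting both is more informative than reporting accuracy alone.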
Simulated Author's Rebuttal
We thank the referee for their constructive and insightful comments on our manuscript. We have carefully addressed each major concern below and will incorporate the suggested improvements in the revised version to strengthen the presentation and validation of our claims.
Point-by-point responses
- Referee: [Methods / ANN-assisted search strategy] The ANN-assisted search strategy (described in the methods) relies on a ReLU ANN surrogate lacking any temporal dimension to guide TTFS SNN optimization. This raises a correctness risk for the central claim: because eye-based emotion recognition depends on precise first-spike timing, it is unclear whether the discovered architectures genuinely exploit TTFS dynamics or merely perform well after conversion; an ablation comparing native TTFS performance against post-conversion accuracy would be needed to confirm the search preserves timing-sensitive motifs.
  Authors: We acknowledge the referee's concern about the non-temporal nature of the ReLU ANN surrogate. In TNAS-ER, the ANN serves as an efficient proxy to stabilize fitness estimation and guide the evolutionary search across the architecture space, while all final architectures are trained and evaluated natively using the TTFS SNN training procedure that explicitly models first-spike timing. This design choice enables practical search scalability without sacrificing the timing-sensitive evaluation at the end of the pipeline. To directly address the correctness risk and demonstrate that the search preserves TTFS-specific motifs, we will add a dedicated ablation study in the revised manuscript comparing native TTFS performance of the discovered architectures against their post-conversion accuracy from the ANN counterparts.
  Revision: yes
- Referee: [Experiments / Results] The experimental section reports high recognition performance and efficiency gains but provides limited quantitative detail on baselines, absolute metrics (e.g., accuracy, energy per inference), and statistical significance of improvements over prior TTFS or ANN approaches. Without these, the claim of “significantly improved efficiency” and “superior energy efficiency” cannot be fully assessed as load-bearing support for the framework’s advantage.
  Authors: We agree that additional quantitative details and rigorous comparisons are necessary to fully substantiate the efficiency claims. In the revised manuscript, we will expand the experimental results section with comprehensive tables that report absolute metrics including recognition accuracy, energy per inference on neuromorphic hardware, and direct comparisons against relevant prior TTFS SNN and ANN baselines. We will also include results aggregated over multiple independent runs with means and standard deviations to establish statistical significance of the reported improvements.
  Revision: yes
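The aggregation over independent runs mentioned in the response can be sketched as follows. This is only an illustration using Python's standard `statistics` module; the helper name `summarize_runs` is hypothetical, and the scores in the example are placeholders rather than results from the paper.

```python
import statistics

def summarize_runs(scores):
    """Return (mean, sample standard deviation) over per-run scores.

    At least two runs are required for a sample standard deviation.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs")
    return statistics.mean(scores), statistics.stdev(scores)

# Placeholder per-run scores (not taken from the paper).
mean, std = summarize_runs([0.80, 0.82, 0.84])
print(f"{mean:.3f} +/- {std:.3f}")
```

Means and standard deviations describe run-to-run variability; establishing statistical significance would additionally require a test comparing the two sets of runs.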
Circularity Check
No circularity: empirical NAS framework with experimental validation
Full rationale
The paper introduces TNAS-ER as a practical evolutionary NAS method that uses a ReLU ANN surrogate to guide TTFS SNN architecture search for eye-based emotion recognition. All central claims rest on reported experimental results (recognition performance, efficiency metrics, neuromorphic hardware measurements) rather than any mathematical derivation, equation, or first-principles result. No self-definitional loops, fitted inputs presented as predictions, or load-bearing self-citations appear in the provided text. The approach is self-contained through external benchmarks and hardware evaluation, consistent with the reader's assessment of minimal circularity risk.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: TNAS-ER presents a novel ANN-assisted search strategy that leverages a ReLU-based ANN counterpart sharing an identity mapping with the TTFS SNN to guide architecture optimization and stabilize training
- IndisputableMonolith/Foundation/DimensionForcing.lean: reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: TTFS SNNs... each neuron emits at most one binary spike... evolutionary algorithm, with weighted and unweighted average recall jointly defined as fitness objectives
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.