Dive into Waves: Morlet Spectral Transformer for Cross-Subject Emotion Decoding from EEG

Jiaxin Qing; Lexin Li

arxiv: 2606.00884 · v1 · pith:EX7NOHSVnew · submitted 2026-05-30 · 💻 cs.LG · cs.AI

Pith reviewed 2026-06-28 18:54 UTC · model grok-4.3

classification: cs.LG · cs.AI

keywords: cross-subject EEG · emotion recognition · Morlet wavelet · spectral transformer · brain-computer interface · time-frequency representation · baseline removal

The pith

The Morlet Spectral Transformer outperforms large pretrained EEG models in cross-subject emotion recognition without pretraining.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports that a Transformer built around three spectral-specific components decodes emotions from EEG more accurately across subjects than either massive pretrained foundation models or conventional frequency encoders. Emotion signals appear mainly as weak, noisy changes in spectral power that shift between people, so standard waveform or generic tokenization approaches fail to generalize. By tokenizing with Morlet wavelets to align with multi-scale brain rhythms, stripping subject drift via long-context baseline removal, and applying separate spatial projections per frequency band, the model reports higher cross-subject accuracy than both families of baselines on SEED-family datasets. This matters because it points to a lighter, more interpretable route for brain-computer interfaces that avoids the data and compute demands of large-scale pretraining.

Core claim

The Morlet Spectral Transformer integrates Morlet wavelet tokenization that extends differential entropy into a time-frequency form suitable for Transformers, long-context baseline removal that normalizes out subject-specific drift and redundancy, and frequency-specific spatial projection that learns independent channel mixers per band. These elements are placed inside a spatiotemporal Transformer backbone. The resulting model consistently exceeds both large pretrained EEG foundation models and prior frequency-based methods on every SEED-family dataset for cross-subject emotion recognition, even when trained from scratch.
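To make the first component concrete, here is a minimal NumPy sketch of what Morlet wavelet tokenization might look like. It is not the authors' implementation: the function names, the 5-cycle kernel width, the 1-second window, and the choice of log-amplitude as the token value are all assumptions made for illustration (the Figure 2 caption mentions log-amplitude, but the exact token definition is not given on this page).

```python
import numpy as np

def morlet_kernel(freq_hz, fs, n_cycles=5.0):
    """Complex Morlet kernel: a complex sinusoid under a Gaussian envelope.
    n_cycles is an assumed width parameter, not a value from the paper."""
    sigma_t = n_cycles / (2.0 * np.pi * freq_hz)   # envelope std in seconds
    half = int(np.ceil(3.0 * sigma_t * fs))        # truncate at +/- 3 sigma
    t = np.arange(-half, half + 1) / fs
    kernel = np.exp(2j * np.pi * freq_hz * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
    return kernel / np.sum(np.abs(kernel))         # unit L1 norm across frequencies

def morlet_tokens(eeg, fs, freqs_hz, window_s=1.0, eps=1e-8):
    """eeg: (channels, samples) -> tokens: (windows, freqs, channels).
    Each token is the log of the mean wavelet amplitude inside one window."""
    n_ch, n_samp = eeg.shape
    win = int(window_s * fs)
    n_win = n_samp // win
    tokens = np.empty((n_win, len(freqs_hz), n_ch))
    for fi, f in enumerate(freqs_hz):
        k = morlet_kernel(f, fs)
        for c in range(n_ch):
            # Magnitude of the complex response is the band-limited amplitude.
            amp = np.abs(np.convolve(eeg[c], k, mode="same"))
            per_win = amp[: n_win * win].reshape(n_win, win).mean(axis=1)
            tokens[:, fi, c] = np.log(per_win + eps)
    return tokens
```

Because the kernel width scales inversely with frequency, low-frequency kernels are long and high-frequency kernels are short, which is the multi-scale property the summary attributes to the tokenizer.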

What carries the argument

Morlet wavelet tokenization paired with long-context baseline removal and frequency-specific spatial projection inside a Transformer backbone.
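The frequency-specific spatial projection can be pictured as one channel-mixing matrix per band instead of a single matrix shared by all bands. The sketch below is an illustration under assumed shapes (tokens indexed as windows × bands × channels); the function names are hypothetical, and in the actual model the weights would be learned rather than passed in.

```python
import numpy as np

def per_band_projection(tokens, weights):
    """tokens: (windows, bands, channels); weights: (bands, channels, components).
    Band b is mixed only by its own matrix weights[b]."""
    return np.einsum("wbc,bck->wbk", tokens, weights)

def shared_projection(tokens, weight):
    """Ablation counterpart: a single (channels, components) mixer for every band."""
    return np.einsum("wbc,ck->wbk", tokens, weight)
```

The shared variant is the natural control for the per-band design: the two coincide exactly when every band is given the same matrix.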

If this is right

Careful spectral representation design can serve as an accurate, cost-effective alternative to large-scale pretraining for EEG tasks.
The frequency-specific spatial projection yields interpretable band-specific patterns that reduce unwanted cross-channel mixing.
Long-context baseline removal provides a simple normalization that improves generalization by removing subject drift across windows.
The overall approach is particularly suited to signals whose information lies in spectral power rather than clear waveform signatures.
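One simple reading of long-context baseline removal is a subtraction of the average token over many windows. The sketch below assumes log-amplitude tokens of shape windows × bands × channels and a hypothetical `context` parameter for a trailing average; the paper's exact formulation is not given on this page, so treat this as an illustration of the idea only.

```python
import numpy as np

def remove_baseline(tokens, context=None):
    """tokens: (windows, bands, channels) of log-amplitude values.
    context=None: subtract the mean over all windows in the segment.
    context=L:    subtract a trailing mean over the last L windows (assumed variant)."""
    if context is None:
        return tokens - tokens.mean(axis=0, keepdims=True)
    out = np.empty_like(tokens)
    for w in range(tokens.shape[0]):
        lo = max(0, w - context + 1)
        out[w] = tokens[w] - tokens[lo : w + 1].mean(axis=0)
    return out
```

Since the tokens are in the log domain, a subject- or electrode-specific gain becomes an additive offset per band and channel, and any such constant offset cancels in the subtraction.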

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same tokenization and normalization steps could be tested on other EEG tasks that rely on spectral rather than temporal signatures.
The emphasis on matching multi-scale rhythms suggests similar wavelet-based tokenization might help in other noisy, multi-scale biosignal problems.
If the baseline removal step generalizes, it could become a standard preprocessing choice for handling inter-subject variability in neural recordings.

Load-bearing premise

The reported gains come from the three proposed components rather than the shared Transformer backbone or characteristics of the SEED datasets.

What would settle it

An ablation study that removes each of the three components one at a time and checks whether accuracy falls to or below the level of the pretrained baselines or standard frequency methods.
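A one-at-a-time ablation of this kind can be written down as a small configuration grid. The `MSTConfig` class and its field names below are hypothetical and exist only to show the structure of the comparison; they are not taken from the paper or any released code.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MSTConfig:
    """Hypothetical switches for the three proposed components."""
    tokenizer: str = "morlet"         # ablate to "de" (or "fft")
    baseline_removal: bool = True     # ablate to False
    spatial_mixer: str = "per_band"   # ablate to "shared"

def one_at_a_time_ablations(full=MSTConfig()):
    """Return the full model plus variants that each change exactly one component."""
    return {
        "full": full,
        "tokenizer_de": replace(full, tokenizer="de"),
        "no_baseline_removal": replace(full, baseline_removal=False),
        "shared_mixer": replace(full, spatial_mixer="shared"),
    }
```

Each variant keeps the Transformer backbone fixed, so an accuracy drop can be attributed to the single component that changed.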

Figures

Figures reproduced from arXiv: 2606.00884 by Jiaxin Qing, Lexin Li.

**Figure 2.** Temporal redundancy. The left panel shows pairwise cosine similarity of raw log-amplitude ...

**Figure 3.** Learned frequency-specific spatial projections. Each panel shows one spatial component ...

**Figure 4.** Effects of four data augmentation strategies on a representative sample.

**Figure 5.** Per-subject cross-subject LOSO accuracy on SEED for MST. Red bars indicate subjects whose ...

**Figure 6.** Morlet wavelet tokenization. (a) The complex Morlet kernels at different center frequencies, with ...

Original abstract

We study cross-subject emotion recognition from EEG, a practically important yet challenging problem in brain-computer interfaces. Unlike tasks with clear waveform signatures, emotion-related EEG signals are primarily encoded in spectral power and are weak, noisy, and highly variable across subjects. Existing approaches rely either on large pretrained EEG foundation models, which require massive data yet still struggle with cross-subject variability, or frequency-domain encoders, which better reflect spectral structure but suffer from mismatched representations, drift-dominated tokenization, and lack of band-specific spatial modeling. In this article, we propose the Morlet Spectral Transformer (MST), built around three key components and integrated with a spatiotemporal Transformer backbone. First, Morlet wavelet tokenization provides a time-frequency representation that matches the multi-scale structure of brain rhythms, and extends classical differential entropy features to a form suitable for Transformers. Second, long-context baseline removal acts as a simple temporal normalization that removes subject-specific drift and redundancy across nearby windows. Third, frequency-specific spatial projection learns a separate channel mixer for each frequency band, capturing interpretable band-specific patterns and reducing cross-channel mixing. We show that, even without pretraining, MST consistently outperforms both large pretrained EEG foundation models and frequency-based methods across all SEED-family datasets. These results suggest that careful representation design can yield an accurate, cost-effective, and interpretable alternative to large-scale pretraining.
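For readers unfamiliar with differential entropy (DE) features, the standard closed form for a Gaussian variable shows why DE and log band amplitude are closely related. The Gaussian modeling of a band-passed EEG segment is the usual assumption behind DE features; how exactly the paper extends this to a time-frequency form is not stated in the abstract, so the link drawn here is an interpretation.

```latex
h(X) \;=\; -\int p(x)\,\log p(x)\,dx
      \;=\; \tfrac{1}{2}\log\!\left(2\pi e\,\sigma^{2}\right)
      \;=\; \log\sigma \;+\; \tfrac{1}{2}\log(2\pi e),
\qquad X \sim \mathcal{N}(\mu,\sigma^{2})
```

Under that assumption, the DE of a band equals the log of its RMS amplitude plus a constant, so a log-amplitude computed from a wavelet response can be read as a time-resolved analogue of the classical per-band DE feature.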

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

MST proposes Morlet tokenization, long-context baseline removal, and per-band spatial projections for EEG Transformers, but the abstract supplies no ablations or numbers to show those pieces actually drive the reported gains over foundation models on SEED data.

The paper's central move is to replace standard frequency features in an EEG Transformer with Morlet wavelet tokens, add a simple long-context baseline subtraction to cut subject drift, and use separate spatial mixers per frequency band. It claims this combination beats both large pretrained EEG models and prior frequency methods on cross-subject emotion tasks across the SEED family, all without any pretraining.

The three components address real pain points: brain rhythms are multi-scale, subject drift swamps short windows, and different bands carry distinct spatial structure. Treating the tokenization as an extension of differential entropy that fits Transformers is a clean step. The per-band projection is a straightforward way to avoid mixing everything together.

The abstract states consistent outperformance but gives no tables, error bars, dataset splits, or ablation results. The load-bearing premise above is the weak point: without controlled replacements (Morlet vs. DE/FFT, with vs. without baseline removal, shared vs. per-band mixer) it is impossible to attribute gains to the proposed pieces rather than the shared Transformer backbone or dataset quirks. If the full paper contains those ablations plus statistical tests, the argument strengthens; otherwise the attribution stays weak.

This work targets researchers building practical BCI systems who want lower-cost alternatives to foundation-model pretraining. Readers already working on frequency-domain EEG encoders will see the most direct value. The ideas are coherent enough and the problem is important enough that a serious editor should send it to review, with the expectation that referees will press hard on the missing controls and reproducibility details.

Referee Report

2 major / 2 minor

Summary. The paper proposes the Morlet Spectral Transformer (MST) for cross-subject EEG-based emotion recognition. It integrates a spatiotemporal Transformer backbone with three components: Morlet wavelet tokenization to produce time-frequency representations matching brain rhythms (extending differential entropy), long-context baseline removal as temporal normalization to reduce subject-specific drift, and frequency-specific spatial projection to learn band-wise channel mixers. The central empirical claim is that MST, even without pretraining, consistently outperforms both large pretrained EEG foundation models and existing frequency-based methods across SEED-family datasets.

Significance. If the performance gains can be rigorously attributed to the proposed components rather than the Transformer backbone or dataset artifacts, the work would support the value of domain-specific representation design over scale in EEG decoding tasks, providing a lower-cost, more interpretable alternative to foundation-model pretraining for cross-subject generalization.

major comments (2)

[Results / Experiments (section containing the main tables and comparisons)] The central claim that the three components (Morlet tokenization, long-context baseline removal, frequency-specific spatial projection) drive the reported gains requires controlled ablations; no such experiments (e.g., replacing Morlet with standard DE/FFT features, removing baseline correction, or using a shared spatial mixer) are described that would show corresponding performance drops when these elements are ablated.
[Abstract and Results section] The abstract and results presentation supply no statistical tests, error bars, dataset splits, or subject-wise variance measures to support the claim of 'consistent outperformance' across SEED-family datasets; without these, the strength of the empirical comparison cannot be evaluated.
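On the question of dataset splits, the Figure 5 caption refers to leave-one-subject-out (LOSO) evaluation. A minimal sketch of such a split follows; the function name and the representation of samples as a flat list of subject IDs are assumptions for illustration, not the paper's protocol.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.
    subject_ids[i] is the subject that sample i was recorded from."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx
```

Reporting which subjects fall in each fold, and whether any validation subjects are carved out of the training side, is what would make the comparison reproducible.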

minor comments (2)

[Method section] Notation for the Morlet wavelet parameters and the exact formulation of the long-context baseline removal should be made explicit with equations to allow reproduction.
[Figures] Figure captions for any architecture or tokenization diagrams should explicitly label the three proposed components and their integration with the Transformer backbone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects for strengthening the empirical claims. We agree that both controlled ablations and statistical reporting are needed to rigorously support the contribution of the proposed components and will add these elements to the revised manuscript.

Referee: [Results / Experiments (section containing the main tables and comparisons)] The central claim that the three components (Morlet tokenization, long-context baseline removal, frequency-specific spatial projection) drive the reported gains requires controlled ablations; no such experiments (e.g., replacing Morlet with standard DE/FFT features, removing baseline correction, or using a shared spatial mixer) are described that would show corresponding performance drops when these elements are ablated.

Authors: We agree that the absence of component-specific ablations limits the ability to attribute gains specifically to Morlet tokenization, long-context baseline removal, and frequency-specific spatial projection rather than the Transformer backbone alone. In the revised manuscript we will add a dedicated ablation study section that systematically replaces each component (Morlet with standard DE/FFT, removal of baseline correction, and replacement of per-band spatial mixers with a shared mixer) and reports the resulting performance drops on the SEED-family datasets. These experiments will be presented alongside the main results tables. revision: yes

Referee: [Abstract and Results section] The abstract and results presentation supply no statistical tests, error bars, dataset splits, or subject-wise variance measures to support the claim of 'consistent outperformance' across SEED-family datasets; without these, the strength of the empirical comparison cannot be evaluated.

Authors: We acknowledge that the current presentation lacks statistical tests, error bars, explicit dataset splits, and subject-wise variance, which weakens the evaluation of the 'consistent outperformance' claim. In the revision we will (i) report mean and standard deviation across multiple random seeds and cross-subject folds, (ii) include paired statistical tests (e.g., Wilcoxon signed-rank) with p-values against baselines, (iii) specify the exact train/validation/test subject splits used, and (iv) add subject-wise performance variance measures. These will appear in both the abstract (concise summary) and the results section tables/figures. revision: yes
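The paired test mentioned in the response can be illustrated with a small, dependency-free exact Wilcoxon signed-rank test over per-subject accuracies. This is a sketch for small subject counts only (it enumerates all sign patterns); the function name is hypothetical, and in practice a library routine such as `scipy.stats.wilcoxon` would normally be used instead.

```python
from itertools import product

def wilcoxon_signed_rank_exact(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples a[i], b[i].
    Zero differences are dropped; tied magnitudes receive average ranks.
    Returns (W_plus, p_value). Cost is 2**n, so keep n small (e.g. n <= 20)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0          # average of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - total / 2.0)
    # Under the null, every assignment of signs to the ranks is equally likely.
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2.0) >= observed - 1e-12:
            hits += 1
    return w_plus, hits / 2 ** n
```

Here `a` and `b` would hold the held-out accuracy of two models on the same subjects, so the test respects the pairing that a cross-subject protocol provides.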

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external benchmarks

The paper introduces the MST architecture with three described components (Morlet wavelet tokenization, long-context baseline removal, frequency-specific spatial projection) integrated into a Transformer backbone, then reports empirical outperformance on SEED-family datasets without pretraining. No equations, derivations, or self-citations appear that reduce any claimed result to its inputs by construction, nor any fitted parameters renamed as predictions, ansatzes smuggled via citation, or uniqueness theorems. Performance attribution is presented via direct experimental comparison to baselines, making the chain self-contained against external data rather than internally tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the design relies on standard assumptions of wavelet transforms and Transformer architectures from prior literature.

pith-pipeline@v0.9.1-grok · 5771 in / 998 out tokens · 18128 ms · 2026-06-28T18:54:29.417027+00:00 · methodology

Reference graph

Works this paper leans on

30 extracted references · 5 canonical work pages · 1 internal anchor

[1] Kai Keng Ang, Zheng Yang Chin, Haihong Zhang, and Cuntai Guan. Filter bank common spatial pattern (FBCSP) in brain-computer interface. In 2008 IEEE International Joint Conference on Neural Networks, pages 2390–2397. IEEE, 2008.
[2] Mike X Cohen. Analyzing Neural Time Series Data: Theory and Practice. MIT Press, 2014.
[3] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley-Interscience, 2nd edition, 2006.
[4] Ruo-Nan Duan, Jia-Yi Zhu, and Bao-Liang Lu. Differential entropy feature for EEG-based emotion recognition. In 6th International IEEE/EMBS Conference on Neural Engineering (NER), pages 81–84. IEEE, 2013.
[5] Evgenia Gkintoni, Anthimos Aroutzidis, Hera Antonopoulou, and Constantinos Halkiopoulos. From neural networks to emotional networks: A systematic review of EEG-based emotion recognition in cognitive neuroscience and real-world applications. Brain Sciences, 15(3):220, 2025. doi: 10.3390/brainsci15030220.
[6] Wei-Bang Jiang, Xuan-Hao Liu, Wei-Long Zheng, and Bao-Liang Lu. SEED-VII: A multimodal dataset of six basic emotions with continuous labels for emotion recognition. IEEE Transactions on Affective Computing, 2024.
[7] Wei-Bang Jiang, Li-Ming Zhao, and Bao-Liang Lu. Large brain model for learning generic representations with tremendous EEG data in BCI. In International Conference on Learning Representations, 2024.
[8] Wolfgang Klimesch. EEG alpha and theta oscillations reflect cognitive and memory performance: a review and analysis. Brain Research Reviews, 29(2-3):169–195, 1999.
[9] Sander Koelstra, Christian Mühl, Mohammad Soleymani, Jong-Seok Lee, Ashkan Yazdani, Touradj Ebrahimi, Thierry Pun, Anton Nijholt, and Ioannis Patras. DEAP: A database for emotion analysis using physiological signals. IEEE Transactions on Affective Computing, 3(1):18–31, 2012.
[10] Vernon J. Lawhern, Amelia J. Solon, Nicholas R. Waytowich, Stephen M. Gordon, Chou P. Hung, and Brent J. Lance. EEGNet: A compact convolutional neural network for EEG-based brain-computer interfaces. Journal of Neural Engineering, 15(5):056013, 2018. doi: 10.1088/1741-2552/aace8c.
[11] Jinpeng Li, Shuang Qiu, Changde Du, Yixin Wang, and Huiguang He. Multisource transfer learning for cross-subject EEG emotion recognition. IEEE Transactions on Cybernetics, 50(7):3281–3293, 2020.
[12] Jinpeng Li, Shuang Qiu, Yuan-Yuan Shen, Cheng-Lin Liu, and Huiguang He. Domain adaptation for EEG emotion recognition based on latent representation similarity. IEEE Transactions on Cognitive and Developmental Systems, 12(2):344–353, 2020.
[13] Dingkun Liu, Yuheng Chen, Zhu Chen, Zhenyao Cui, Yaozhi Wen, Jiayu An, Jingwei Luo, and Dongrui Wu. EEG foundation models: Progresses, benchmarking, and open problems. arXiv preprint arXiv:2601.17883, 2025.
[14] Jiarui Liu et al. Tokenizing single-channel EEG with time-frequency motif learning. In International Conference on Learning Representations (ICLR), 2025.
[15] Wei Liu, Jie-Lin Qiu, Wei-Long Zheng, and Bao-Liang Lu. Comparing recognition performance and robustness of multimodal deep learning models for multimodal emotion recognition. IEEE Transactions on Cognitive and Developmental Systems, 14(2):715–729, 2022.
[16] Ravikiran Mane, Effie Chew, Karen Chua, Kai Keng Ang, Neethu Robinson, A. P. Vinod, Seong-Whan Lee, and Cuntai Guan. FBCNet: A multi-view convolutional neural network for brain-computer interface. arXiv preprint arXiv:2104.01233, 2021.
[17] Athanasios Papoulis and S. Unnikrishna Pillai. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, 4th edition, 2002.
[18] Stephen O. Rice. Mathematical analysis of random noise. The Bell System Technical Journal, 23(3):282–332, 1944.
[19] Yonghao Song, Qingqing Zheng, Bingchuan Liu, and Xiaorong Gao. EEG conformer: Convolutional transformer for EEG decoding and visualization. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31:710–719, 2023.
[20] Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. RoFormer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.
[21] Catherine Tallon-Baudry, Olivier Bertrand, Claude Delpuech, and Jacques Pernier. Oscillatory gamma-band (30–70 Hz) activity induced by a visual search task in humans. Journal of Neuroscience, 17(2):722–734, 1997.
[22] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, volume 30, 2017.
[23] Christopher Wang, Vighnesh Subramaniam, Adam Uri Yaari, Gabriel Kreiman, Boris Katz, Ignacio Cases, and Andrei Barbu. BrainBERT: Self-supervised representation learning for intracranial recordings. In International Conference on Learning Representations (ICLR), 2023.
[24] Guangyu Wang, Wenchao Liu, Yuhong He, Cong Xu, Lin Ma, and Haifeng Li. EEGPT: Pretrained transformer for universal and reliable representation of EEG signals. In Advances in Neural Information Processing Systems, volume 37, 2024.
[25] Jiquan Wang, Sha Zhao, Zhiling Luo, Yangxuan Zhou, Haiteng Jiang, Shijian Li, Tao Li, and Gang Pan. CBraMod: A criss-cross brain foundation model for EEG decoding. In International Conference on Learning Representations, 2025.
[26] Chaoqi Yang, M. Brandon Westover, and Jimeng Sun. BIOT: Biosignal transformer for cross-data learning in the wild. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
[27] Jiahe Zhang et al. CodeBrain: Bridging decoupled tokenizer and multi-scale architecture for EEG foundation model. arXiv preprint arXiv:2506.09110, 2025.
[28] Wei-Long Zheng and Bao-Liang Lu. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development, 7(3):162–175, 2015.
[29] Wei-Long Zheng, Wei Liu, Yifei Lu, Bao-Liang Lu, and Andrzej Cichocki. EmotionMeter: A multimodal framework for recognizing human emotions. IEEE Transactions on Cybernetics, 49(3):1110–1122, 2018.
[30] Yuchen Zhou, Jiamin Wu, Zichen Ren, Zhouheng Yao, Weiheng Lu, Kunyu Peng, Qihao Zheng, Chunfeng Song, Wanli Ouyang, and Chao Gou. CSBrain: A cross-scale spatiotemporal brain foundation model for EEG decoding. In Advances in Neural Information Processing Systems, volume 38, 2025.
