arxiv: 2605.18298 · v1 · pith:W55QAK3Wnew · submitted 2026-05-18 · 💻 cs.AI · cs.HC · cs.LG

DARE-EEG: A Foundation Model for Mining Dual-Aligned Representation of EEG

Pith reviewed 2026-05-20 10:00 UTC · model grok-4.3

classification 💻 cs.AI · cs.HC · cs.LG
keywords EEG · foundation model · self-supervised learning · contrastive learning · mask alignment · transfer learning · brain-computer interface

The pith

DARE-EEG pre-trains EEG encoders to enforce mask-invariance by aligning multiple masked views of the same signal into a consistent latent subspace.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a self-supervised foundation model that learns generalizable EEG representations by addressing the failure of existing masked reconstruction methods when different masked views of a signal share little overlap. It does so through dual-aligned representation learning that combines contrastive alignment across masked views with momentum-based anchoring to complete features. A parameter-efficient adaptation step then handles varying electrode layouts and sampling rates. A sympathetic reader would care because this targets the core obstacle to reusing pre-trained EEG models across heterogeneous brain-computer interface datasets and hardware.
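
The low-overlap problem described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming each masked view independently keeps a uniformly random subset of patches; the masking scheme, `num_patches`, and `mask_ratio` are illustrative assumptions, not values taken from the paper. Under that assumption, the expected fraction of patches visible in both views is (1 - r)^2 for mask ratio r, which the simulation checks.

```python
import random

def visible_patches(num_patches, mask_ratio, rng):
    """Return the set of patch indices left visible after random masking."""
    num_visible = round(num_patches * (1.0 - mask_ratio))
    return set(rng.sample(range(num_patches), num_visible))

def mean_shared_fraction(num_patches, mask_ratio, trials=2000, seed=0):
    """Monte Carlo estimate of the fraction of all patches visible in BOTH views."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        view_a = visible_patches(num_patches, mask_ratio, rng)
        view_b = visible_patches(num_patches, mask_ratio, rng)
        total += len(view_a & view_b) / num_patches
    return total / trials

# Illustrative values only; not taken from the paper.
estimate = mean_shared_fraction(num_patches=64, mask_ratio=0.75)
expected = (1.0 - 0.75) ** 2  # independent masks: (1 - r)^2
print(f"simulated={estimate:.4f} analytic={expected:.4f}")
```

With a high mask ratio, two views of the same signal share only a small fraction of their visible input, which is the regime in which the review says reconstruction alone leaves the views unconstrained relative to each other.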

Core claim

DARE-EEG is a foundation model that explicitly enforces mask-invariance during pre-training through two alignment objectives. Mask alignment uses contrastive learning to constrain the representations of multiple masked views of the same EEG sample. Anchor alignment ties those masked representations to momentum-updated features of the complete signal for semantic stability. A third component, conv-linear-probing, adapts the pre-trained representations to heterogeneous electrode configurations and sampling rates through decoupled spectro-spatial projections.

What carries the argument

Dual-aligned representation learning, which uses contrastive learning to align multiple masked views of one EEG sample and momentum alignment to tie those views to stable complete-signal features.
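
To illustrate how the two alignment terms could fit together, here is a minimal sketch in plain Python. It is not the authors' implementation: this page does not give the paper's loss formulas, so the InfoNCE form of the contrastive term, the (1 - cosine) form of the anchor term, the `temperature`, the `momentum`, and the `lambda_anchor` weight are all assumptions chosen for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

def mask_alignment_loss(views_a, views_b, temperature=0.1):
    """InfoNCE (assumed form): view A of sample i should match view B of sample i,
    with view B of the other samples in the batch acting as negatives."""
    loss = 0.0
    for i, za in enumerate(views_a):
        logits = [cosine(za, zb) / temperature for zb in views_b]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(views_a)

def anchor_alignment_loss(masked, anchors):
    """Mean (1 - cosine) between masked-view features and complete-signal targets."""
    return sum(1.0 - cosine(z, t) for z, t in zip(masked, anchors)) / len(masked)

def ema_update(target_params, online_params, momentum=0.99):
    """Momentum (exponential moving average) update of the target encoder weights."""
    return [momentum * t + (1.0 - momentum) * o
            for t, o in zip(target_params, online_params)]

# Toy batch of two samples with 3-dimensional features (hand-made numbers).
views_a = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
views_b = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0]]
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # stand-in for momentum-encoder outputs

lambda_anchor = 1.0  # hypothetical weighting between the two terms
total = (mask_alignment_loss(views_a, views_b)
         + lambda_anchor * 0.5 * (anchor_alignment_loss(views_a, anchors)
                                  + anchor_alignment_loss(views_b, anchors)))
print(f"total loss = {total:.4f}")
```

In this sketch the contrastive term pulls the two masked views of a sample together relative to other samples, while the anchor term keeps both views close to a slowly moving target computed from the unmasked signal.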

If this is right

  • State-of-the-art accuracy on diverse EEG benchmarks while keeping parameter counts relatively low.
  • Superior portability when the same pre-trained model is applied to new datasets with different electrode configurations or sampling rates.
  • More effective discovery of rich latent representations within EEG signals for downstream brain-computer interface tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same dual-alignment pattern could be tested on other biosignals such as ECG to check whether mask-invariance helps transfer across recording hardware.
  • Real-time closed-loop BCI experiments would show whether the learned invariance reduces retraining needs when electrode placement varies session to session.

Load-bearing premise

That forcing representations from different masked views of the same EEG signal into one consistent latent subspace will automatically produce better transfer across datasets that differ in electrodes and sampling rates.

What would settle it

A controlled ablation that trains identical models with and without the dual-alignment losses and measures accuracy drop on a cross-dataset transfer task where masked views share minimal temporal overlap.
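
One way to set up such a controlled comparison is sketched below. Every key and value in `BASE_CONFIG` is hypothetical, and the grid goes slightly beyond the with/without comparison by also toggling each loss on its own; the point is only that all other settings stay fixed across variants.

```python
from itertools import product

# Hypothetical base configuration; every key and value here is illustrative.
BASE_CONFIG = {"seed": 0, "mask_ratio": 0.75, "epochs": 100, "probe": "conv-linear"}

def ablation_grid(base):
    """Yield one config per on/off combination of the two alignment losses,
    leaving every other setting identical so differences are attributable."""
    for use_mask, use_anchor in product([True, False], repeat=2):
        config = dict(base)  # copy so the base config is never mutated
        config["mask_alignment_weight"] = 1.0 if use_mask else 0.0
        config["anchor_alignment_weight"] = 1.0 if use_anchor else 0.0
        yield config

variants = list(ablation_grid(BASE_CONFIG))
for variant in variants:
    print(variant)
```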

Figures

Figures reproduced from arXiv: 2605.18298 by Daoqiang Zhang, Peiliang Gong, Qun Dai, Yang Shao.

Figure 1: Illustration of the advantage of the dual alignment
Figure 2: The structure of DARE-EEG used to discover dual-aligned representations of EEG signals. It mainly includes two steps:
Figure 3: Conv-Linear-Probing learns an effective mapping method for different downstream tasks by decoupling channel
Figure 4: Overview of the DARE-EEG downstream training
Figure 5: Results of the module ablation experiment.
Figure 6: Scaling Laws with DARE-EEG parameters. Saturated function fitting was used. The parameter axis is on a logarithmic
Figure 7: The changes in loss and the number of model parameters after removing different modules.
Figure 8: Analysis of the CLP during downstream training. The figure illustrates the learned channel projection weights and the
Figure 9: Small data learning performance on the MMWM dataset with subject-dependent evaluation. The line plots report
Figure 10: The t-SNE results for subject 1 on the BCIC-2A dataset are shown. The left image shows the results using MA, while
Figure 11: The t-SNE results for subject 7 on the BCIC-2B dataset are shown. The left image shows the results using MA, while
Figure 12: PSD topographic maps of Subjects 1 and 3 from the BCIC-2A dataset. PSD values are shown on a log10 scale over the
Figure 13: PSD topographic maps of Subjects 1 and 2 from the SEEDIV dataset. The gamma frequency band is shown, with
Original abstract

Foundation models pre-trained through masked reconstruction on large-scale EEG data have emerged as a promising paradigm for learning generalizable neural representations across diverse brain-computer interface applications. However, a critical yet overlooked challenge is that EEG encoders must learn representations invariant to incomplete observations. When different masked views of the same signal have minimal overlap, existing methods fail to constrain them to a consistent latent subspace, leading to degraded transferability. To address this, we propose DARE-EEG, a self-supervised foundation model that explicitly enforces the mask-invariance property through dual-aligned representation learning during pre-training. Specifically, we introduce mask alignment that constrains representations from multiple masked views of the same EEG sample via contrastive learning, complementing anchor alignment that aligns masked representations to momentum-updated complete features for semantic stability. Additionally, we propose conv-linear-probing, a parameter-efficient strategy that adapts pre-trained representations to heterogeneous electrode configurations and sampling rates through decoupled spectro-spatial projections. Extensive experiments across diverse EEG benchmarks demonstrate that DARE-EEG consistently achieves state-of-the-art accuracy while maintaining relatively low parameter complexity and superior cross-dataset portability compared to existing methods. Furthermore, DARE-EEG contributes to effectively discovering and utilizing the rich potential representations in EEG.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes DARE-EEG, a self-supervised EEG foundation model that enforces mask-invariance during pre-training via dual-aligned representation learning: mask alignment (contrastive learning on multiple masked views of the same sample) combined with anchor alignment (momentum-updated complete features for semantic stability). It introduces conv-linear-probing as a parameter-efficient adaptation mechanism for heterogeneous electrode configurations and sampling rates. Experiments across EEG benchmarks claim state-of-the-art accuracy, low parameter complexity, and superior cross-dataset portability relative to prior masked-reconstruction baselines.

Significance. If the results hold after isolating the contribution of the alignment objectives, the work would offer a concrete mechanism for improving representation consistency under incomplete observations, which is a practical bottleneck in EEG transfer learning. The conv-linear-probing strategy is a clear engineering contribution for deployment across variable hardware setups.
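
As a rough illustration of what a decoupled adaptation head might look like, the sketch below applies a linear projection across electrodes followed by a strided temporal convolution. This is an assumption-laden reading of "decoupled spectro-spatial projections": the paper's actual layer definitions are not given on this page, and the function names, weights, kernel, and stride here are invented for the example.

```python
def project_channels(signal, weights):
    """Spatial step: map C_in electrodes onto C_out virtual channels.
    signal: C_in lists of T samples; weights: C_out rows of C_in coefficients."""
    num_samples = len(signal[0])
    return [[sum(w * signal[c][t] for c, w in enumerate(row))
             for t in range(num_samples)]
            for row in weights]

def temporal_conv(signal, kernel, stride):
    """Temporal step: filter each channel with a shared 1-D kernel (valid padding);
    a stride > 1 also lowers the effective sampling rate."""
    width = len(kernel)
    return [[sum(k * channel[start + j] for j, k in enumerate(kernel))
             for start in range(0, len(channel) - width + 1, stride)]
            for channel in signal]

# Toy input: 3 electrodes, 8 samples each (made-up numbers).
eeg = [[float(t + c) for t in range(8)] for c in range(3)]
channel_weights = [[0.5, 0.5, 0.0],   # virtual channel 0 averages electrodes 0 and 1
                   [0.0, 0.0, 1.0]]   # virtual channel 1 copies electrode 2
smoothing_kernel = [0.5, 0.5]          # moving average, stands in for a learned filter

adapted = temporal_conv(project_channels(eeg, channel_weights), smoothing_kernel, stride=2)
print(len(adapted), "channels x", len(adapted[0]), "samples")
```

Keeping the two steps separate means a new montage only changes the shape of `channel_weights` and a new sampling rate only changes the kernel and stride, so a frozen encoder can receive inputs of a fixed shape.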

major comments (2)
  1. §3.2 (Dual-Aligned Pre-training): the central claim that dual alignment produces superior cross-dataset portability rests on the untested assumption that the contrastive mask-alignment and momentum-anchor terms measurably tighten the latent subspace beyond what standard masked reconstruction already achieves. No invariance metric (e.g., average cosine similarity or mutual information between differently masked views of the same sample) is reported.
  2. §4.3 and Table 4 (Cross-dataset Portability): the reported gains are not accompanied by ablations that remove only the two alignment losses while keeping the conv-linear-probing head and all other hyperparameters fixed. Without this control, the portability advantage cannot be confidently attributed to the proposed dual-alignment rather than to the probing strategy or dataset-specific factors.
minor comments (2)
  1. Figure 2 (architecture diagram): the flow from masked views through the dual-alignment heads to the momentum encoder is difficult to follow; adding explicit arrows or a legend for the contrastive and anchor losses would improve clarity.
  2. §4.1 (Experimental Setup): the description of how electrode configurations and sampling rates are normalized across source and target datasets is brief; a short table listing the exact channel counts and rates for each benchmark would aid reproducibility.
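
The invariance metric requested in the first major comment could be computed along the following lines. This is a minimal sketch with made-up embeddings; the function name `mask_invariance_score` and the data layout are assumptions, and the code expects non-zero embedding vectors and at least two views per sample.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mask_invariance_score(views_per_sample):
    """Average pairwise cosine similarity between views of the SAME sample,
    averaged over samples. views_per_sample[i] holds the embeddings of sample i."""
    per_sample = []
    for views in views_per_sample:
        pairs = list(combinations(views, 2))
        per_sample.append(sum(cosine(u, v) for u, v in pairs) / len(pairs))
    return sum(per_sample) / len(per_sample)

# Made-up embeddings: the "aligned" encoder maps views of a sample close together.
aligned = [[[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]],
           [[0.0, 1.0], [0.1, 0.9], [0.0, 1.0]]]
scattered = [[[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]],
             [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]]

print(f"aligned:   {mask_invariance_score(aligned):.3f}")
print(f"scattered: {mask_invariance_score(scattered):.3f}")
```

Reporting this score for encoders trained with and without the alignment losses would test directly whether the objectives tighten the latent subspace.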

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the contribution of the dual-alignment objectives. We respond to each major comment below and have revised the manuscript accordingly to provide additional evidence.

Point-by-point responses
  1. Referee: §3.2 (Dual-Aligned Pre-training): the central claim that dual alignment produces superior cross-dataset portability rests on the untested assumption that the contrastive mask-alignment and momentum-anchor terms measurably tighten the latent subspace beyond what standard masked reconstruction already achieves. No invariance metric (e.g., average cosine similarity or mutual information between differently masked views of the same sample) is reported.

    Authors: We agree that an explicit invariance metric would strengthen the presentation of the mask-invariance property. In the revised manuscript we have added a quantitative analysis in §3.2 (and supplementary material) that reports average cosine similarity between representations of differently masked views of the same sample. The results show that the combination of mask alignment and anchor alignment yields measurably higher similarity scores than the masked-reconstruction baseline alone, supporting the claim that dual alignment tightens the latent subspace.
    Revision: yes

  2. Referee: §4.3 and Table 4 (Cross-dataset Portability): the reported gains are not accompanied by ablations that remove only the two alignment losses while keeping the conv-linear-probing head and all other hyperparameters fixed. Without this control, the portability advantage cannot be confidently attributed to the proposed dual-alignment rather than to the probing strategy or dataset-specific factors.

    Authors: We acknowledge the value of isolating the alignment losses. The revised manuscript now includes an ablation in §4.3 that removes only the mask-alignment and anchor-alignment terms while retaining the conv-linear-probing head and all other hyperparameters. The updated Table 4 and accompanying text show that cross-dataset portability degrades when the alignment objectives are ablated, indicating that the dual-alignment mechanism contributes to the observed gains beyond the probing strategy.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation chain is self-contained and empirically grounded

The paper introduces dual-aligned contrastive and momentum alignment as explicit new objectives to enforce mask-invariance during self-supervised pre-training on EEG data, then validates resulting portability gains through experiments on heterogeneous datasets using conv-linear-probing adaptation. No load-bearing step reduces by construction to its own inputs: the alignment losses are not fitted parameters renamed as predictions, no self-citation chain justifies a uniqueness claim, and no ansatz or renaming of known results is presented as a derivation. The central claims rest on the proposed architectural additions and benchmark results rather than tautological equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The abstract relies on the standard assumption that masked reconstruction pre-training yields generalizable EEG representations. It introduces dual alignment without specifying free parameters or new entities, and it does not give enough detail for a full ledger.

axioms (1)
  • Domain assumption: Masked reconstruction on large-scale EEG data learns generalizable neural representations across BCI applications.
    Stated as an emerging, promising paradigm in the opening of the abstract.

pith-pipeline@v0.9.0 · 5755 in / 1137 out tokens · 54359 ms · 2026-05-20T10:00:29.447192+00:00 · methodology
