CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model
Pith reviewed 2026-05-19 10:10 UTC · model grok-4.3
The pith
Decoupling EEG signals into temporal and frequency tokens plus multi-scale attention lets a foundation model generalize across brain tasks and datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CodeBrain is a two-stage EFM. In the first stage, the TFDual-Tokenizer decouples heterogeneous temporal and frequency EEG signals into discrete tokens, quadratically expanding the representation space to enhance discriminative power and offering domain-specific representation-level interpretability by suggesting potential links to neural events and spectral rhythms. In the second stage, the multi-scale EEGSSM architecture combines structured global convolution with sliding window attention to efficiently capture both sparse long-range and local dependencies, reflecting the brain's small-world topology. Pretrained on the largest public EEG corpus, CodeBrain achieves strong generalization on a
What carries the argument
The TFDual-Tokenizer that separates temporal and frequency EEG components into discrete tokens, paired with the multi-scale EEGSSM that mixes global convolution and sliding-window attention to capture both long-range and local brain dependencies.
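To make the second half of that pairing concrete, here is a minimal pure-Python sketch of a block that runs a sliding-window attention branch and a global convolution branch side by side. It is an illustration under stated assumptions, not the paper's EEGSSM: the function names, the unprojected attention, the exponentially decaying placeholder kernel, and the choice to sum the two branches are all hypothetical, and the kernel must be at least as long as the sequence.

```python
import math

def sliding_window_attention(x, window):
    """Local branch: position i attends only to positions j with |i - j| <= window.

    x is a list of T feature vectors; queries, keys and values are all x itself
    (no learned projections, an assumption made to keep the sketch short).
    """
    d = len(x[0])
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - window), min(len(x), i + window + 1)
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
                  for j in range(lo, hi)]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * x[lo + k][c] for k, w in enumerate(weights))
                    for c in range(d)])
    return out

def global_convolution(x, kernel):
    """Global branch: causal convolution whose kernel spans the whole sequence."""
    d = len(x[0])
    return [[sum(kernel[k] * x[t - k][c] for k in range(t + 1)) for c in range(d)]
            for t in range(len(x))]

def multi_scale_block(x, kernel, window):
    """Sum the two branches (the fusion rule is an assumption of this sketch)."""
    local = sliding_window_attention(x, window)
    glob = global_convolution(x, kernel)
    return [[a + b for a, b in zip(lrow, grow)] for lrow, grow in zip(local, glob)]

# Toy input: 6 time steps, 2 channels; a decaying kernel stands in for a learned one.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [1.0, -1.0]]
kernel = [0.5 ** k for k in range(len(x))]
y = multi_scale_block(x, kernel, window=1)
```

The local branch costs on the order of T times the window size, while the global branch sees every earlier step through one long kernel. This sketch evaluates the long convolution directly, which is quadratic in T; practical long-convolution layers are usually computed with an FFT instead.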
If this is right
- The model generalizes across eight downstream tasks on ten datasets even when data distributions shift.
- Ablation studies, scaling-law analysis, and interpretability checks support the design choices.
- The architecture mirrors the brain's small-world topology by handling both sparse long-range and local patterns.
- Representation-level interpretability arises from linking tokens to neural events and spectral rhythms.
Where Pith is reading between the lines
- The token-based approach could make it easier to combine EEG data with recordings from other sensors without retraining from scratch.
- If the scaling laws hold, larger versions of the model may continue to improve performance on rare or noisy brain signals.
- Clinicians might one day inspect which tokens activate for specific symptoms, turning the model into a diagnostic aid rather than a black box.
Load-bearing premise
That separating temporal and frequency parts of EEG signals into tokens will both enlarge the space of possible representations and create meaningful links to actual brain events for interpretability.
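The "enlarged space" half of this premise can be illustrated with a small sketch of dual quantization: each of two feature vectors is snapped to its nearest entry in its own codebook, and the pair of indices acts as one composite token. The codebook contents, feature values, and the helper names `nearest_code` and `dual_tokenize` are hypothetical, and whether CodeBrain ever materializes a joint index like this is not stated on this page.

```python
def nearest_code(vec, codebook):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])))

def dual_tokenize(temporal_feat, frequency_feat, temporal_codebook, frequency_codebook):
    """Quantize each view separately, then combine the two indices into one id."""
    t = nearest_code(temporal_feat, temporal_codebook)
    f = nearest_code(frequency_feat, frequency_codebook)
    joint = t * len(frequency_codebook) + f  # unique id for the pair (t, f)
    return t, f, joint

# Hypothetical toy codebooks: 3 temporal codes and 2 frequency codes.
temporal_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
frequency_codebook = [[0.0], [1.0]]

t, f, joint = dual_tokenize([0.9, 1.2], [0.8], temporal_codebook, frequency_codebook)
n_composite = len(temporal_codebook) * len(frequency_codebook)  # 3 * 2 pairs
```

Five stored vectors index six composite tokens here. The gap widens as the codebooks grow, but the count alone says nothing about whether the extra pairs are used or carry neural meaning, which is the other half of the premise.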
What would settle it
Finding that CodeBrain shows no measurable gain in accuracy or robustness over prior EEG models when evaluated on the same eight tasks and ten datasets under distribution shifts.
Original abstract
Electroencephalography (EEG) provides real-time insights into brain activity and supports diverse applications in neuroscience. While EEG foundation models (EFMs) have emerged to address the scalability issues of task-specific models, current approaches still yield clinically uninterpretable and weakly discriminative representations, inefficiently capturing global dependencies and neglecting important local neural events. We present CodeBrain, a two-stage EFM designed to fill this gap. In the first stage, we introduce the TFDual-Tokenizer, which decouples heterogeneous temporal and frequency EEG signals into discrete tokens, quadratically expanding the representation space to enhance discriminative power and offering domain-specific representation-level interpretability by suggesting potential links to neural events and spectral rhythms. In the second stage, we propose the multi-scale EEGSSM architecture, which combines structured global convolution with sliding window attention to efficiently capture both sparse long-range and local dependencies, reflecting the brain's small-world topology. Pretrained on the largest public EEG corpus, CodeBrain achieves strong generalization across eight downstream tasks and ten datasets under distribution shifts, supported by comprehensive ablations, scaling-law analyzes, and interpretability evaluations. The code and the pretrained weights are available at https://github.com/jingyingma01/CodeBrain.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. CodeBrain is a two-stage EEG foundation model. Stage one introduces the TFDual-Tokenizer that decouples temporal and frequency EEG signals into discrete tokens, claimed to quadratically expand the representation space and supply domain-specific interpretability via links to neural events and spectral rhythms. Stage two deploys the multi-scale EEGSSM architecture that combines structured global convolution with sliding-window attention to capture sparse long-range and local dependencies consistent with brain small-world topology. The model is pretrained on the largest public EEG corpus and evaluated for generalization across eight downstream tasks on ten datasets under distribution shifts, accompanied by ablations, scaling-law analyses, and interpretability studies. Code and pretrained weights are released.
Significance. If the performance and attribution claims hold, the work would advance EEG foundation models by improving both discriminative power and neurophysiological interpretability while respecting brain topology. The public release of code and weights, together with scaling-law analyses and comprehensive ablations, constitutes a clear strength that supports reproducibility and further research.
major comments (3)
- [§3.1] §3.1 (TFDual-Tokenizer description): The assertion that decoupling into two discrete codebooks 'quadratically expands the representation space' is not accompanied by a derivation or explicit comparison. It remains unclear whether the model uses the Cartesian product of the two codebooks (size |C_t| × |C_f|) or a simple concatenation; without this formalization or an ablation against a single unified tokenizer, the claimed enhancement in discriminative power cannot be rigorously attributed to the decoupling step.
- [§4.3] §4.3 and interpretability subsection: The claim of 'domain-specific representation-level interpretability' via potential links to neural events and spectral rhythms is presented without quantitative validation, such as alignment metrics, statistical tests, or controls that compare learned token assignments against established neurophysiological markers. This leaves the interpretability benefit motivational rather than demonstrated, weakening the link to downstream gains.
- [Table 2] Table 2 (main results under distribution shifts): The reported generalization across ten datasets lacks error bars, confidence intervals, or statistical significance tests relative to baselines. Given that the central claim rests on 'strong generalization,' the absence of these elements makes it difficult to assess whether observed improvements are robust or attributable to the proposed architecture.
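One form the quantitative validation requested in the §4.3 comment could take is sketched below: mutual information between token assignments and event annotations, compared against a label-shuffling baseline. This is a hedged illustration, not the authors' protocol. The token ids, the event labels `"spike"` and `"background"`, and both function names are hypothetical.

```python
import math
import random
from collections import Counter

def mutual_information(tokens, labels):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(tokens)
    joint = Counter(zip(tokens, labels))
    p_tok, p_lab = Counter(tokens), Counter(labels)
    return sum(c / n * math.log(c * n / (p_tok[t] * p_lab[l]))
               for (t, l), c in joint.items())

def permutation_p_value(tokens, labels, n_perm=1000, seed=0):
    """Share of label shuffles whose mutual information reaches the observed value."""
    rng = random.Random(seed)
    observed = mutual_information(tokens, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(tokens, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical data: token 0 always co-occurs with "spike", token 1 with "background".
tokens = [0, 0, 1, 1] * 10
labels = ["spike", "spike", "background", "background"] * 10
mi, p = permutation_p_value(tokens, labels)
```

A small p-value here only shows that tokens and annotations are statistically dependent. Tying specific tokens to established markers would still need expert-labeled events and a correction for the number of tokens tested.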
minor comments (2)
- [Abstract] Abstract: 'scaling-law analyzes' should read 'scaling-law analyses'.
- [Figure 2] Figure 2 (architecture diagram): The caption and legend could more explicitly distinguish the flow from TFDual-Tokenizer outputs to the EEGSSM blocks to aid reader comprehension.
Simulated Author's Rebuttal
We thank the referee for the insightful comments on our manuscript. We have carefully considered each point and outline our responses below, along with the revisions we plan to implement.
Point-by-point responses
- Referee: §3.1 (TFDual-Tokenizer description): The assertion that decoupling into two discrete codebooks 'quadratically expands the representation space' is not accompanied by a derivation or explicit comparison. It remains unclear whether the model uses the Cartesian product of the two codebooks (size |C_t| × |C_f|) or a simple concatenation; without this formalization or an ablation against a single unified tokenizer, the claimed enhancement in discriminative power cannot be rigorously attributed to the decoupling step.
  Authors: We appreciate this observation. Upon review, the TFDual-Tokenizer indeed employs separate discrete codebooks for temporal and frequency signals, with the effective representation space being the Cartesian product |C_t| × |C_f|. In the revised version, we will include a mathematical derivation of this quadratic expansion relative to a unified tokenizer, clarify the usage of the product, and add an ablation study comparing it to a single codebook approach to rigorously demonstrate the improvement in discriminative power.
  Revision: yes
- Referee: §4.3 and interpretability subsection: The claim of 'domain-specific representation-level interpretability' via potential links to neural events and spectral rhythms is presented without quantitative validation, such as alignment metrics, statistical tests, or controls that compare learned token assignments against established neurophysiological markers. This leaves the interpretability benefit motivational rather than demonstrated, weakening the link to downstream gains.
  Authors: We agree that stronger quantitative evidence would enhance this section. We will revise the interpretability subsection to include quantitative validation, such as alignment metrics between token assignments and known neural events (e.g., P300, mu rhythm) and statistical tests against random baselines or controls, to better demonstrate the domain-specific interpretability and its connection to performance gains.
  Revision: yes
- Referee: Table 2 (main results under distribution shifts): The reported generalization across ten datasets lacks error bars, confidence intervals, or statistical significance tests relative to baselines. Given that the central claim rests on 'strong generalization,' the absence of these elements makes it difficult to assess whether observed improvements are robust or attributable to the proposed architecture.
  Authors: We thank the referee for highlighting this. In the updated manuscript, we will augment Table 2 with error bars (standard deviations across runs or datasets), confidence intervals, and statistical significance tests (such as t-tests with p-values) against the baseline methods to provide a more robust assessment of the generalization performance under distribution shifts.
  Revision: yes
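The reporting promised in the last response could be computed along the following lines. This is a minimal standard-library sketch with made-up placeholder scores, not numbers from the paper; the function names are hypothetical, and it assumes the model and the baseline were evaluated on the same runs or splits so that a paired test applies.

```python
import math
import statistics

def summarize(scores):
    """Mean and sample standard deviation, e.g. for a 'mean ± sd' table cell."""
    return statistics.mean(scores), statistics.stdev(scores)

def paired_t_statistic(model_scores, baseline_scores):
    """Paired t statistic and degrees of freedom for matched runs."""
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Placeholder scores for five matched runs (illustrative only, not reported results).
model_scores = [0.80, 0.82, 0.81, 0.83, 0.79]
baseline_scores = [0.78, 0.79, 0.80, 0.80, 0.78]

mean, sd = summarize(model_scores)
t_stat, dof = paired_t_statistic(model_scores, baseline_scores)
```

Turning the statistic into a p-value requires the t distribution with `dof` degrees of freedom; a library routine such as `scipy.stats.ttest_rel` performs the paired test directly. With ten datasets and several baselines, a multiple-comparison correction would also be worth stating.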
Circularity Check
TFDual-Tokenizer quadratic expansion presented as derived benefit but follows directly from dual codebook definition
specific steps
- self-definitional [Abstract, first-stage description]
  "we introduce the TFDual-Tokenizer, which decouples heterogeneous temporal and frequency EEG signals into discrete tokens, quadratically expanding the representation space to enhance discriminative power and offering domain-specific representation-level interpretability by suggesting potential links to neural events and spectral rhythms."
  The quadratic expansion is claimed as a benefit that enhances discriminative power, yet it is the immediate result of defining the tokenizer via two separate codebooks whose combined space size is their product; this holds by construction for any dual discrete tokenizer and does not require EEG-specific properties or additional derivation.
full rationale
The paper's central architectural claim in the first stage asserts that decoupling temporal and frequency signals into discrete tokens quadratically expands the representation space and supplies domain-specific interpretability. This expansion is a direct arithmetic consequence of combining two independent codebooks (product of their sizes) rather than an independent derivation from EEG signal properties or empirical validation. No equations are shown demonstrating quadratic growth beyond the definitional product, and the interpretability is framed as 'suggesting potential links' without quantitative mapping to neural events. The downstream generalization claims rest on this premise, but the expansion itself reduces to the tokenizer's construction. The multi-scale architecture and pretraining claims do not exhibit similar reductions and appear independent of fitted inputs or self-citations.
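The counting argument behind this check fits in two lines. The symbols follow the referee report's C_t and C_f; the common codebook size K is a hypothetical simplification introduced here for illustration.

```latex
\[
  (z_t, z_f) \in C_t \times C_f,
  \qquad
  |C_t \times C_f| = |C_t|\,|C_f| .
\]
\[
  |C_t| = |C_f| = K
  \;\Longrightarrow\;
  |C_t \times C_f| = K^2,
  \qquad
  \log_2 K^2 = 2 \log_2 K \ \text{bits per token pair}.
\]
```

A single codebook holding the same 2K vectors would index only 2K codes, so the product does enlarge the index space. The identity holds for any pair of finite codebooks, which is why it cannot by itself serve as evidence about EEG structure or about how many of the K^2 pairs are actually used.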
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: EEG signals contain heterogeneous temporal and frequency components that can be decoupled into discrete tokens for improved representation
- domain assumption: The brain's small-world topology is effectively modeled by combining structured global convolution with sliding window attention
invented entities (2)
- TFDual-Tokenizer: no independent evidence
- EEGSSM: no independent evidence
Forward citations
Cited by 2 Pith papers
- Foundation Model Guided Dual-Branch Co-Adaptation for Source-Free EEG Decoding
  FUSED integrates EEG foundation models into source-free domain adaptation via dual-branch co-adaptation, consensus filtering, and two-stage pseudo-label refinement to achieve state-of-the-art cross-subject EEG decoding.
- PRiSE-EEG: A Prior-Guided Foundation Model with Depth-Stratified Experts for Cross-Paradigm EEG Representation Learning
  PRiSE-EEG is a prior-guided EEG foundation model that allocates shared and specialized experts across depth using CKA-derived sigmoid mappings and reports strong cross-paradigm results on 12 benchmarks.