CaMBRAIN: Real-time, Continuous EEG Inference with Causal State Space Models
Pith reviewed 2026-06-29 11:36 UTC · model grok-4.3
The pith
CaMBRAIN is a causal Mamba state space model for real-time continuous inference on variable-length EEG signals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CaMBRAIN is the first Causal, Mamba-based state space model capable of real-time inference of EEG signals. It introduces a multi-stage self-supervised training pipeline specifically tailored to encourage long-range memory retention and strong performance on EEG signals, while preserving the linear-time complexity of state space models. CaMBRAIN achieves state-of-the-art results across 3 different EEG datasets with more than 10 times higher throughput than existing models, enabling the first model capable of long-range, continuous inference of variable-length EEG signals.
What carries the argument
The multi-stage self-supervised training pipeline for causal Mamba-based state space models that trains the hidden state to retain salient long-range context needed for streaming inference.
If this is right
- Enables processing of entire variable-length EEG recordings without sliding windows or quadratic costs.
- Supports continuous real-time inference at linear complexity.
- Delivers state-of-the-art accuracy on three distinct EEG datasets.
- Provides more than 10 times higher throughput than existing attention-based models.
Where Pith is reading between the lines
- The same causal training approach could be tested on other long sparse biomedical time series such as ECG recordings.
- Efficiency gains may allow continuous monitoring on portable or edge hardware where prior models cannot run.
- The pipeline's focus on hidden-state retention suggests extensions to other causal signal domains with brief critical events separated by long gaps.
Load-bearing premise
The multi-stage self-supervised training pipeline will successfully train the hidden state of a causal SSM to retain salient long-range context for streaming EEG inference.
What would settle it
A direct comparison on the three EEG datasets showing that CaMBRAIN does not reach state-of-the-art accuracy or fails to deliver at least 10 times the throughput of prior models on long sequences.
Figures
read the original abstract
Electroencephalography (EEG) is a critical, non-invasive method to monitor electrical brain activity. EEGs can span anywhere from a couple seconds to multiple hours, posing a major hurdle for existing deep learning methods due to two major factors: (1) existing EEG models are predominantly built upon the attention mechanism, incurring quadratic scaling as the sequence length increases, and (2) raw EEG signals must be processed in a sliding-window fashion due to fixed-length input requirements, preventing global understanding of the entire signal. To this extent, we propose CaMBRAIN - the first Causal, Mamba-based state space model (SSM) capable of real-time inference of EEG signals, arguing that bidirectional approaches are needlessly expensive given the causal, unidirectional nature of EEG. However, training such a model is non-trivial, as crucial EEG events can be extremely brief - within fractions of a second - yet separated by long intervals spanning minutes. Current EEG methods use self-supervised objectives that optimize for signal reconstruction, but these are not well suited for streaming SSMs; they fail to explicitly train the hidden state to retain the salient long-range context needed for streaming inference. We therefore introduce a multi-stage self-supervised training pipeline specifically tailored to encourage long-range memory retention and strong performance on EEG signals, while preserving the linear-time complexity of state space models. CaMBRAIN achieves state-of-the-art (SOTA) results across 3 different EEG datasets with >10x higher throughput than existing models, enabling the first model capable of long-range, continuous inference of variable-length EEG signals.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CaMBRAIN, a causal Mamba-based state space model for real-time, continuous EEG inference. It highlights limitations of attention-based models for long EEG sequences and introduces a multi-stage self-supervised training pipeline to enable long-range memory retention in streaming models, claiming state-of-the-art performance on three EEG datasets with over 10 times higher throughput.
Significance. If the empirical claims hold, this work could advance real-time EEG monitoring by enabling linear-time inference on variable-length signals without the quadratic costs of attention mechanisms. The tailored training pipeline for causal SSMs addresses a noted challenge in applying these models to EEG, where brief events are separated by long intervals.
major comments (2)
- [Abstract] Abstract: the claim of SOTA results across 3 datasets and >10x higher throughput supplies no metrics, baselines, dataset details, ablation studies, or error bars, making it impossible to evaluate the central empirical claims.
- [Abstract] Abstract: the multi-stage self-supervised training pipeline is described only at the level of motivation, with no details on the stages, loss functions, or how it trains the hidden state to retain salient long-range context for streaming inference.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and for highlighting areas where the abstract could be strengthened. We address each major comment below and indicate where revisions will be made to the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the claim of SOTA results across 3 datasets and >10x higher throughput supplies no metrics, baselines, dataset details, ablation studies, or error bars, making it impossible to evaluate the central empirical claims.
Authors: We agree that the abstract, as currently written, presents the SOTA and throughput claims at a summary level without supporting numbers. The full manuscript provides these details in the Experiments section (including per-dataset accuracy/F1 scores against listed baselines, dataset descriptions, ablation tables, and standard error bars across runs). To improve evaluability from the abstract alone, we will revise it to include the key quantitative results (e.g., specific accuracy gains and throughput multiplier on each dataset). revision: yes
-
Referee: [Abstract] Abstract: the multi-stage self-supervised training pipeline is described only at the level of motivation, with no details on the stages, loss functions, or how it trains the hidden state to retain salient long-range context for streaming inference.
Authors: The abstract is necessarily concise and therefore focuses on motivation. The manuscript details the three stages, the specific loss functions (including the long-range retention objective), and the mechanism by which the hidden state is encouraged to preserve salient context in Section 3.2 and Algorithm 1. We will add one additional sentence to the abstract that names the stages and the core retention loss, while keeping within length constraints. revision: yes
Circularity Check
No significant circularity detected
full rationale
The provided abstract and description contain no equations, parameter-fitting steps, predictions, or derivation chains that could reduce to inputs by construction. The paper proposes a model architecture and training pipeline, then reports empirical SOTA results; these are external benchmarks rather than self-referential reductions. No self-citations, ansatzes, or uniqueness theorems are invoked in a load-bearing way within the visible text. The derivation chain is therefore self-contained against external evaluation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Causal state space models can retain long-range context when trained with a multi-stage self-supervised objective rather than reconstruction
Reference graph
Works this paper leans on
-
[1]
Electroencephalography
Andrea Biasiucci, Benedetta Franceschiello, and Micah M Murray. Electroencephalography. Current Biology, 29(3):R80–R85, 2019
2019
-
[2]
Application of machine learning in epileptic seizure detection.Diagnostics, 12(11):2879, 2022
Ly V Tran, Hieu M Tran, Tuan M Le, Tri TM Huynh, Hung T Tran, and Son VT Dao. Application of machine learning in epileptic seizure detection.Diagnostics, 12(11):2879, 2022
2022
-
[3]
Epileptic seizure detec- tion in eegs using time–frequency analysis.IEEE transactions on information technology in biomedicine, 13(5):703–710, 2009
Alexandros T Tzallas, Markos G Tsipouras, and Dimitrios I Fotiadis. Epileptic seizure detec- tion in eegs using time–frequency analysis.IEEE transactions on information technology in biomedicine, 13(5):703–710, 2009
2009
-
[4]
Deap: A database for emotion analysis; using physiological signals.IEEE transactions on affective computing, 3(1): 18–31, 2011
Sander Koelstra, Christian Muhl, Mohammad Soleymani, Jong-Seok Lee, Ashkan Yazdani, Touradj Ebrahimi, Thierry Pun, Anton Nijholt, and Ioannis Patras. Deap: A database for emotion analysis; using physiological signals.IEEE transactions on affective computing, 3(1): 18–31, 2011
2011
-
[5]
A review on evaluating mental stress by deep learning using eeg signals.Neural Computing and Applications, 36(21):12629–12654, 2024
Yara Badr, Usman Tariq, Fares Al-Shargie, Fabio Babiloni, Fadwa Al Mughairbi, and Hasan Al-Nashash. A review on evaluating mental stress by deep learning using eeg signals.Neural Computing and Applications, 36(21):12629–12654, 2024
2024
-
[6]
Transformers in time series: a survey
Qingsong Wen, Tian Zhou, Chaoli Zhang, Weiqi Chen, Ziqing Ma, Junchi Yan, and Liang Sun. Transformers in time series: a survey. InProceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pages 6778–6786, 2023
2023
-
[7]
Reve: A foundation model for eeg-adapting to any setup with large-scale pretraining on 25,000 subjects
Yassine El Ouahidi, Jonathan Lys, Philipp Thölke, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon, Karim Jerbi, and Giulia Lioi. Reve: A foundation model for eeg-adapting to any setup with large-scale pretraining on 25,000 subjects. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025
2025
-
[8]
Femba: Efficient and scalable eeg analysis with a bidirectional mamba foundation model
Anna Tegon, Thorir Mar Ingolfsson, Xiaying Wang, Luca Benini, and Yawei Li. Femba: Efficient and scalable eeg analysis with a bidirectional mamba foundation model. In2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 1–7. IEEE, 2025
2025
-
[9]
Gmaeeg: A self-supervised graph masked autoencoder for eeg representation learning.IEEE Journal of Biomedical and Health Informatics, 28(11):6486–6497, 2024
Zanhao Fu, Huaiyu Zhu, Yisheng Zhao, Ruohong Huan, Yi Zhang, Shuohui Chen, and Yun Pan. Gmaeeg: A self-supervised graph masked autoencoder for eeg representation learning.IEEE Journal of Biomedical and Health Informatics, 28(11):6486–6497, 2024. 10
2024
-
[10]
Uncovering the structure of clinical eeg signals with self-supervised learning.Journal of Neural Engineering, 18(4):046020, 2021
Hubert Banville, Omar Chehab, Aapo Hyvärinen, Denis-Alexander Engemann, and Alexandre Gramfort. Uncovering the structure of clinical eeg signals with self-supervised learning.Journal of Neural Engineering, 18(4):046020, 2021
2021
-
[11]
Eeg2rep: enhancing self-supervised eeg representation through informative masked inputs
Navid Mohammadi Foumani, Geoffrey Mackellar, Soheila Ghane, Saad Irtza, Nam Nguyen, and Mahsa Salehi. Eeg2rep: enhancing self-supervised eeg representation through informative masked inputs. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 5544–5555, 2024
2024
-
[12]
Deep learning for electroen- cephalogram (eeg) classification tasks: a review.Journal of neural engineering, 16(3):031001, 2019
Alexander Craik, Yongtian He, and Jose L Contreras-Vidal. Deep learning for electroen- cephalogram (eeg) classification tasks: a review.Journal of neural engineering, 16(3):031001, 2019
2019
-
[13]
Bendr: Using transformers and a contrastive self-supervised learning task to learn from massive amounts of eeg data
Demetres Kostas, Stephane Aroca-Ouellette, and Frank Rudzicz. Bendr: Using transformers and a contrastive self-supervised learning task to learn from massive amounts of eeg data. Frontiers in Human Neuroscience, 15:653659, 2021
2021
-
[14]
Biot: Biosignal transformer for cross-data learning in the wild.Advances in Neural Information Processing Systems, 36:78240–78260, 2023
Chaoqi Yang, M Westover, and Jimeng Sun. Biot: Biosignal transformer for cross-data learning in the wild.Advances in Neural Information Processing Systems, 36:78240–78260, 2023
2023
-
[15]
Wei-Bang Jiang, Li-Ming Zhao, and Bao-Liang Lu. Large brain model for learning generic representations with tremendous eeg data in bci.arXiv preprint arXiv:2405.18765, 2024
-
[16]
Cbramod: A criss-cross brain foundation model for eeg decoding.arXiv preprint arXiv:2412.07236, 2024
Jiquan Wang, Sha Zhao, Zhiling Luo, Yangxuan Zhou, Haiteng Jiang, Shijian Li, Tao Li, and Gang Pan. Cbramod: A criss-cross brain foundation model for eeg decoding.arXiv preprint arXiv:2412.07236, 2024
-
[17]
Luna: Efficient and topology- agnostic foundation model for eeg signal analysis
Berkay Döner, Thorir Mar Ingolfsson, Luca Benini, and Yawei Li. Luna: Efficient and topology- agnostic foundation model for eeg signal analysis. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025
2025
-
[18]
The temple university hospital eeg data corpus.Frontiers in neuroscience, 10:196, 2016
Iyad Obeid and Joseph Picone. The temple university hospital eeg data corpus.Frontiers in neuroscience, 10:196, 2016
2016
-
[19]
Inter-database validation of a deep learning approach for automatic sleep scoring.PloS one, 16(8):e0256111, 2021
Diego Alvarez-Estevez and Roselyne M Rijsman. Inter-database validation of a deep learning approach for automatic sleep scoring.PloS one, 16(8):e0256111, 2021
2021
-
[20]
Electroencephalograms during mental arithmetic task performance.Data, 4 (1):14, 2019
Igor Zyma, Sergii Tukaev, Ivan Seleznov, Ken Kiyono, Anton Popov, Mariia Chernykh, and Oleksii Shpenkov. Electroencephalograms during mental arithmetic task performance.Data, 4 (1):14, 2019
2019
-
[21]
Electroencephalogram (eeg)-based computer-aided technique to diagnose major depressive disorder (mdd).Biomedical Signal Processing and Control, 31: 108–115, 2017
Wajid Mumtaz, Likun Xia, Syed Saad Azhar Ali, Mohd Azhar Mohd Yasin, Muhammad Hussain, and Aamir Saeed Malik. Electroencephalogram (eeg)-based computer-aided technique to diagnose major depressive disorder (mdd).Biomedical Signal Processing and Control, 31: 108–115, 2017
2017
-
[22]
PhD thesis, Massachusetts Institute of Technology, 2009
Ali Hossam Shoeb.Application of machine learning to epileptic seizure onset detection and treatment. PhD thesis, Massachusetts Institute of Technology, 2009
2009
-
[23]
Efficiently modeling long sequences with structured state spaces
Albert Gu, Karan Goel, and Christopher Re. Efficiently modeling long sequences with structured state spaces. InInternational Conference on Learning Representations, 2022
2022
-
[24]
Simplified state space layers for sequence modeling
Jimmy TH Smith, Andrew Warrington, and Scott Linderman. Simplified state space layers for sequence modeling. InThe Eleventh International Conference on Learning Representations, 2023
2023
-
[25]
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[26]
Eegmamba: An eeg foundation model with mamba.Neural Networks, page 107816, 2025
Jiquan Wang, Sha Zhao, Zhiling Luo, Yangxuan Zhou, Shijian Li, and Gang Pan. Eegmamba: An eeg foundation model with mamba.Neural Networks, page 107816, 2025
2025
-
[27]
Mamba-3: Improved sequence modeling using state space principles
Aakash Lahoti, Kevin Li, Berlin Chen, Caitlin Wang, Aviv Bick, J Zico Kolter, Tri Dao, and Albert Gu. Mamba-3: Improved sequence modeling using state space principles. InThe Fourteenth International Conference on Learning Representations, 2026. 11
2026
-
[28]
Eeg conformer: Convolu- tional transformer for eeg decoding and visualization.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31:710–719, 2022
Yonghao Song, Qingqing Zheng, Bingchuan Liu, and Xiaorong Gao. Eeg conformer: Convolu- tional transformer for eeg decoding and visualization.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31:710–719, 2022
2022
-
[29]
Self-supervised learning from images with a joint- embedding predictive architecture
Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, and Nicolas Ballas. Self-supervised learning from images with a joint- embedding predictive architecture. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 15619–15629, 2023
2023
-
[30]
Eegnet: a compact convolutional neural network for eeg-based brain–computer interfaces.Journal of neural engineering, 15(5):056013, 2018
Vernon J Lawhern, Amelia J Solon, Nicholas R Waytowich, Stephen M Gordon, Chou P Hung, and Brent J Lance. Eegnet: a compact convolutional neural network for eeg-based brain–computer interfaces.Journal of neural engineering, 15(5):056013, 2018
2018
-
[31]
Eeg-gnn: Graph neural networks for classification of electroencephalogram (eeg) signals
Andac Demir, Toshiaki Koike-Akino, Ye Wang, Masaki Haruna, and Deniz Erdogmus. Eeg-gnn: Graph neural networks for classification of electroencephalogram (eeg) signals. In2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pages 1061–1067. IEEE, 2021
2021
-
[32]
Modeling multivariate biosignals with graph neural networks and structured state space
Siyi Tang, Jared Dunnmon, Liangqiong Qu, Khaled Kamal Saab, Tina Baykaner, Christopher Lee-Messer, and Daniel Rubin. Modeling multivariate biosignals with graph neural networks and structured state space. InICLR 2023 Workshop on Time Series Representation Learning for Health, 2023
2023
-
[33]
Brainbert: Self-supervised representation learning for intracranial recordings
Christopher Wang, Vighnesh Subramaniam, Adam Uri Yaari, Gabriel Kreiman, Boris Katz, Ignacio Cases, and Andrei Barbu. Brainbert: Self-supervised representation learning for intracranial recordings. InThe Eleventh International Conference on Learning Representations, 2023
2023
-
[34]
Eegformer: Towards transferable and interpretable large-scale eeg foundation model
Yuqi Chen, Kan Ren, Kaitao Song, Yansen Wang, Yifan Wang, Dongsheng Li, and Lili Qiu. Eegformer: Towards transferable and interpretable large-scale eeg foundation model. InAAAI 2024 Spring Symposium on Clinical Foundation Models, 2024
2024
-
[35]
Yonghao Song, Xueyu Jia, Lie Yang, and Longhan Xie. Transformer-based spatial-temporal feature learning for eeg decoding.arXiv preprint arXiv:2106.11170, 2021
-
[36]
Development of expert-level classification of seizures and rhythmic and periodic patterns during eeg interpretation.Neurology, 100(17):e1750–e1762, 2023
Jin Jing, Wendong Ge, Shenda Hong, Marta Bento Fernandes, Zhen Lin, Chaoqi Yang, Sungtae An, Aaron F Struck, Aline Herlopian, Ioannis Karakis, et al. Development of expert-level classification of seizures and rhythmic and periodic patterns during eeg interpretation.Neurology, 100(17):e1750–e1762, 2023
2023
-
[37]
Self-supervised electroen- cephalogram representation learning for automatic sleep staging: model development and evaluation study.JMIR AI, 2(1):e46769, 2023
Chaoqi Yang, Cao Xiao, M Brandon Westover, and Jimeng Sun. Self-supervised electroen- cephalogram representation learning for automatic sleep staging: model development and evaluation study.JMIR AI, 2(1):e46769, 2023
2023
-
[38]
Transformer convolutional neural networks for automated artifact detection in scalp eeg
Wei Yan Peh, Yuanyuan Yao, and Justin Dauwels. Transformer convolutional neural networks for automated artifact detection in scalp eeg. In2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pages 3599–3602. IEEE, 2022
2022
-
[39]
Alexandru Dimofte, Glenn Anta Bucagu, Thorir Mar Ingolfsson, Xiaying Wang, Andrea Cossettini, Luca Benini, and Yawei Li. Cerebro: Compact encoder for representations of brain oscillations using efficient alternating attention.arXiv preprint arXiv:2501.10885, 2025
-
[40]
Motor imagery eeg classification algorithm based on cnn-lstm feature fusion network.Biomedical signal processing and control, 72:103342, 2022
Hongli Li, Man Ding, Ronghua Zhang, and Chunbo Xiu. Motor imagery eeg classification algorithm based on cnn-lstm feature fusion network.Biomedical signal processing and control, 72:103342, 2022
2022
-
[41]
Tri Dao and Albert Gu. Transformers are ssms: Generalized models and efficient algorithms through structured state space duality.arXiv preprint arXiv:2405.21060, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[42]
How jepa avoids noisy features: The implicit bias of deep linear self distillation networks
Etai Littwin, Omid Saremi, Madhu Advani, Vimal Thilak, Preetum Nakkiran, Chen Huang, and Joshua Susskind. How jepa avoids noisy features: The implicit bias of deep linear self distillation networks. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, 12 J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Systems, vol- ume 3...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.