arxiv: 2605.28792 · v1 · pith:WOEGDRLFnew · submitted 2026-05-27 · 💻 cs.AI · cs.HC · cs.LG

CaMBRAIN: Real-time, Continuous EEG Inference with Causal State Space Models

Pith reviewed 2026-06-29 11:36 UTC · model grok-4.3

classification 💻 cs.AI · cs.HC · cs.LG
keywords EEG · state space models · Mamba · causal models · self-supervised learning · real-time inference · brain activity · time series

The pith

CaMBRAIN is a causal Mamba state space model for real-time continuous inference on variable-length EEG signals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to establish that attention-based EEG models face quadratic scaling and require sliding windows on fixed-length inputs, which prevents global understanding of long recordings that can span hours. CaMBRAIN counters this with a causal unidirectional state space model trained via a multi-stage self-supervised pipeline that explicitly builds long-range memory into the hidden state. This design preserves linear-time complexity while supporting streaming inference. A sympathetic reader would care because it opens the door to efficient, delay-free monitoring of brain activity over extended periods.
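
To make the scaling contrast concrete, the sketch below counts rough costs for three hypothetical setups: full self-attention over a whole recording, sliding-window attention, and a causal recurrence that touches each token once. The function names, the window and hop sizes, and the unit-cost model are illustrative assumptions, not figures or code from the paper.

```python
def full_attention_cost(n_tokens: int) -> int:
    """Rough count of pairwise token interactions for attention over the whole sequence."""
    return n_tokens ** 2


def windowed_attention_cost(n_tokens: int, window: int, hop: int) -> int:
    """Rough count of pairwise interactions for sliding-window attention.

    Assumes each window costs window**2 and windows advance by `hop` tokens.
    """
    if n_tokens < window:
        return n_tokens ** 2  # a single short window
    n_windows = (n_tokens - window) // hop + 1
    return n_windows * window ** 2


def recurrent_cost(n_tokens: int) -> int:
    """Rough count of state updates for a causal recurrence: one per token."""
    return n_tokens


if __name__ == "__main__":
    # Hypothetical recording: one hour at an assumed 16 tokens per second.
    n = 16 * 3600
    print("full attention:    ", full_attention_cost(n))
    print("windowed attention:", windowed_attention_cost(n, window=160, hop=16))
    print("recurrence:        ", recurrent_cost(n))
```

Under this toy cost model, overlapping windows keep the cost linear in recording length but recompute shared context many times and never see beyond one window, which is the trade-off the paper argues against.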

Core claim

CaMBRAIN is the first Causal, Mamba-based state space model capable of real-time inference of EEG signals. It introduces a multi-stage self-supervised training pipeline specifically tailored to encourage long-range memory retention and strong performance on EEG signals, while preserving the linear-time complexity of state space models. CaMBRAIN achieves state-of-the-art results across 3 different EEG datasets with more than 10 times higher throughput than existing models, making it the first model capable of long-range, continuous inference of variable-length EEG signals.

What carries the argument

The multi-stage self-supervised training pipeline for causal Mamba-based state space models that trains the hidden state to retain salient long-range context needed for streaming inference.

If this is right

  • Enables processing of entire variable-length EEG recordings without sliding windows or quadratic costs.
  • Supports continuous real-time inference at linear complexity.
  • Delivers state-of-the-art accuracy on three distinct EEG datasets.
  • Provides more than 10 times higher throughput than existing attention-based models.
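
A minimal sketch of what "no sliding windows" means for a recurrent model, using a toy diagonal linear state space recurrence in NumPy. This is not CaMBRAIN or Mamba (Mamba's parameters are input-dependent, and the paper's architecture is not reproduced here); the names `ssm_scan` and `stream` and all parameter values are assumptions chosen for illustration. It shows that chunked streaming with a carried hidden state reproduces the full-sequence output, while resetting the state at each chunk does not.

```python
import numpy as np


def ssm_scan(x, a, b, c, h0=None):
    """Run a toy diagonal linear SSM: h[t] = a * h[t-1] + b * x[t], y[t] = c . h[t].

    x: (T,) input signal; a, b, c: (N,) per-state parameters.
    Returns the outputs y (T,) and the final hidden state (N,).
    """
    h = np.zeros_like(a) if h0 is None else h0.copy()
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = a * h + b * x_t  # causal update: uses only past and present input
        y[t] = c @ h
    return y, h


def stream(x, a, b, c, chunk, persistent=True):
    """Process x chunk by chunk, optionally carrying the hidden state across chunks."""
    h, outputs = None, []
    for start in range(0, len(x), chunk):
        y, h_next = ssm_scan(x[start:start + chunk], a, b, c, h0=h)
        outputs.append(y)
        h = h_next if persistent else None  # a reset mimics independent windows
    return np.concatenate(outputs)


rng = np.random.default_rng(0)
x = rng.standard_normal(200)
a = np.array([0.99, 0.9, 0.5])  # decay rates; values near 1 remember longer
b = np.ones(3)
c = np.array([0.5, 0.3, 0.2])

full, _ = ssm_scan(x, a, b, c)
carried = stream(x, a, b, c, chunk=16, persistent=True)
reset = stream(x, a, b, c, chunk=16, persistent=False)

# Carrying the state matches the full pass; resetting it at each chunk does not.
print(np.allclose(full, carried), np.allclose(full, reset))
```

The per-step cost is constant, so total cost grows linearly with recording length regardless of how the input is chunked.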

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same causal training approach could be tested on other long sparse biomedical time series such as ECG recordings.
  • Efficiency gains may allow continuous monitoring on portable or edge hardware where prior models cannot run.
  • The pipeline's focus on hidden-state retention suggests extensions to other causal signal domains with brief critical events separated by long gaps.

Load-bearing premise

The multi-stage self-supervised training pipeline will successfully train the hidden state of a causal SSM to retain salient long-range context for streaming EEG inference.

What would settle it

A direct comparison on the three EEG datasets showing that CaMBRAIN does not reach state-of-the-art accuracy or fails to deliver at least 10 times the throughput of prior models on long sequences.
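
The throughput half of that test could be checked with a harness along these lines. It is a sketch under stated assumptions: `throughput` and `meets_speedup` are hypothetical helper names, the callables passed in stand for whatever inference functions are being compared, and the paper's own benchmarking protocol is not described in the abstract.

```python
import time


def throughput(fn, n_samples: int, repeats: int = 3) -> float:
    """Return samples processed per second, using the best of `repeats` timed runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a very fast call.
    return n_samples / max(best, 1e-12)


def meets_speedup(candidate_tps: float, baseline_tps: float, factor: float = 10.0) -> bool:
    """Check whether the candidate is at least `factor` times faster than the baseline."""
    return candidate_tps >= factor * baseline_tps


if __name__ == "__main__":
    # Stand-in workload; replace with calls into the models under comparison.
    tps = throughput(lambda: sum(range(100_000)), n_samples=100_000)
    print(f"{tps:.0f} samples/s")
```

A fair comparison would also fix the hardware, batch size, and sequence length for both models, since the claim concerns long sequences specifically.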

Figures

Figures reproduced from arXiv: 2605.28792 by Abhilash Durgam, Elakkat D. Gireesh, Jeffrey A. Chan-Santiago, Mubarak Shah, Nyle Siddiqui, Qiushi Fu.

Figure 1: Existing methods (top) operate on overlapping sliding windows, repeatedly recomputing …
Figure 2: Overview of CaMBRAIN. CaMBRAIN is a causal state space model that processes EEG in 62.5 ms patches using a unidirectional state space model with a persistent hidden state. We train it with a three-stage pipeline: (1) causal predictive pretraining with autoregressive and masked reconstruction objectives to learn local temporal structure, (2) decoder-free latent JEPA training using a student–teacher framewor…
Figure 3: Persistent vs. windowed streaming on chb22_20.
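
The Figure 2 caption mentions 62.5 ms patches. As a worked example, the sketch below converts that duration into samples and splits a recording into patches. The 256 Hz sampling rate, the non-overlapping layout, and the function name `patch_eeg` are assumptions made for illustration; the caption does not state them.

```python
import numpy as np


def patch_eeg(signal: np.ndarray, sample_rate_hz: float, patch_ms: float = 62.5) -> np.ndarray:
    """Split a (channels, time) EEG array into non-overlapping patches.

    Returns an array of shape (n_patches, channels, samples_per_patch);
    trailing samples that do not fill a whole patch are dropped.
    """
    samples_per_patch = int(round(sample_rate_hz * patch_ms / 1000.0))
    n_channels, n_samples = signal.shape
    n_patches = n_samples // samples_per_patch
    trimmed = signal[:, : n_patches * samples_per_patch]
    patches = trimmed.reshape(n_channels, n_patches, samples_per_patch)
    return patches.transpose(1, 0, 2)


# Hypothetical recording: 4 channels, 2 seconds at an assumed 256 Hz.
eeg = np.arange(4 * 512, dtype=float).reshape(4, 512)
patches = patch_eeg(eeg, sample_rate_hz=256)
print(patches.shape)  # (32, 4, 16): 256 Hz * 0.0625 s = 16 samples per patch
```

At that assumed rate a model would receive 16 patches per second, so one hour of recording corresponds to 57,600 sequential steps.
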

Original abstract

Electroencephalography (EEG) is a critical, non-invasive method to monitor electrical brain activity. EEGs can span anywhere from a couple of seconds to multiple hours, posing a major hurdle for existing deep learning methods due to two major factors: (1) existing EEG models are predominantly built upon the attention mechanism, incurring quadratic scaling as the sequence length increases, and (2) raw EEG signals must be processed in a sliding-window fashion due to fixed-length input requirements, preventing global understanding of the entire signal. To this end, we propose CaMBRAIN, the first Causal, Mamba-based state space model (SSM) capable of real-time inference of EEG signals, arguing that bidirectional approaches are needlessly expensive given the causal, unidirectional nature of EEG. However, training such a model is non-trivial, as crucial EEG events can be extremely brief (within fractions of a second) yet separated by long intervals spanning minutes. Current EEG methods use self-supervised objectives that optimize for signal reconstruction, but these are not well suited for streaming SSMs; they fail to explicitly train the hidden state to retain the salient long-range context needed for streaming inference. We therefore introduce a multi-stage self-supervised training pipeline specifically tailored to encourage long-range memory retention and strong performance on EEG signals, while preserving the linear-time complexity of state space models. CaMBRAIN achieves state-of-the-art (SOTA) results across 3 different EEG datasets with >10x higher throughput than existing models, enabling the first model capable of long-range, continuous inference of variable-length EEG signals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes CaMBRAIN, a causal Mamba-based state space model for real-time, continuous EEG inference. It highlights limitations of attention-based models for long EEG sequences and introduces a multi-stage self-supervised training pipeline to enable long-range memory retention in streaming models, claiming state-of-the-art performance on three EEG datasets with over 10 times higher throughput.

Significance. If the empirical claims hold, this work could advance real-time EEG monitoring by enabling linear-time inference on variable-length signals without the quadratic costs of attention mechanisms. The tailored training pipeline for causal SSMs addresses a noted challenge in applying these models to EEG, where brief events are separated by long intervals.

major comments (2)
  1. [Abstract] The claim of SOTA results across 3 datasets and >10x higher throughput is stated without metrics, baselines, dataset details, ablation studies, or error bars, making it impossible to evaluate the central empirical claims.
  2. [Abstract] The multi-stage self-supervised training pipeline is described only at the level of motivation, with no details on the stages, loss functions, or how it trains the hidden state to retain salient long-range context for streaming inference.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and for highlighting areas where the abstract could be strengthened. We address each major comment below and indicate where revisions will be made to the manuscript.

  1. Referee: [Abstract] The claim of SOTA results across 3 datasets and >10x higher throughput is stated without metrics, baselines, dataset details, ablation studies, or error bars, making it impossible to evaluate the central empirical claims.

    Authors: We agree that the abstract, as currently written, presents the SOTA and throughput claims at a summary level without supporting numbers. The full manuscript provides these details in the Experiments section (including per-dataset accuracy/F1 scores against listed baselines, dataset descriptions, ablation tables, and standard error bars across runs). To improve evaluability from the abstract alone, we will revise it to include the key quantitative results (e.g., specific accuracy gains and throughput multiplier on each dataset).

    revision: yes

  2. Referee: [Abstract] The multi-stage self-supervised training pipeline is described only at the level of motivation, with no details on the stages, loss functions, or how it trains the hidden state to retain salient long-range context for streaming inference.

    Authors: The abstract is necessarily concise and therefore focuses on motivation. The manuscript details the three stages, the specific loss functions (including the long-range retention objective), and the mechanism by which the hidden state is encouraged to preserve salient context in Section 3.2 and Algorithm 1. We will add one sentence to the abstract that names the stages and the core retention loss, while staying within length constraints.

    revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The provided abstract and description contain no equations, parameter-fitting steps, predictions, or derivation chains that could reduce to inputs by construction. The paper proposes a model architecture and training pipeline, then reports empirical SOTA results; these are external benchmarks rather than self-referential reductions. No self-citations, ansatzes, or uniqueness theorems are invoked in a load-bearing way within the visible text. The derivation chain is therefore self-contained against external evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review supplies no explicit free parameters, new entities, or ad-hoc axioms beyond the domain assumption that causal SSMs are appropriate for unidirectional EEG signals.

axioms (1)
  • Domain assumption: causal state space models can retain long-range context when trained with a multi-stage self-supervised objective rather than a reconstruction objective.
    Invoked to justify the new training pipeline over standard objectives.
