STAMP: Spatial-Temporal Adapter with Multi-Head Pooling
Pith reviewed 2026-05-17 21:41 UTC · model grok-4.3
The pith
A lightweight adapter lets general time series foundation models match specialized EEG models on clinical classification tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce a novel Spatial-Temporal Adapter with Multi-Head Pooling (STAMP), which leverages univariate embeddings produced by a general TSFM, implicitly models spatial-temporal characteristics of EEG data, and achieves performance comparable to state-of-the-art EEGFMs on classification tasks.
What carries the argument
Multi-head pooling applied to univariate embeddings from a general time series foundation model to implicitly capture spatial relationships across EEG channels.
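As a rough illustration of the mechanism named above, the following is a minimal sketch of one common form of multi-head attention pooling over per-channel embeddings. It is not taken from the paper: the function name `multi_head_pool`, the per-head query vectors, the scaled dot-product scoring, and the toy sizes (19 channels, embedding width 8, 2 heads) are all assumptions made for illustration, and random numbers stand in for TSFM embeddings.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_pool(embeddings, queries):
    """Pool C per-channel embeddings (each of length D) into one vector of length D.

    `queries` holds one vector of length D // H per head. In a trained adapter
    these would be learned parameters; here they are hypothetical stand-ins.
    """
    num_heads = len(queries)
    dim = len(embeddings[0])
    assert dim % num_heads == 0, "embedding width must split evenly across heads"
    head_dim = dim // num_heads
    pooled = []
    for h, q in enumerate(queries):
        lo, hi = h * head_dim, (h + 1) * head_dim
        slices = [e[lo:hi] for e in embeddings]  # this head's view of each channel
        scores = [sum(a * b for a, b in zip(s, q)) / math.sqrt(head_dim) for s in slices]
        weights = softmax(scores)  # one attention weight per channel
        for j in range(head_dim):
            pooled.append(sum(w * s[j] for w, s in zip(weights, slices)))
    return pooled


rng = random.Random(0)
channels, dim, heads = 19, 8, 2  # toy sizes, not values from the paper
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(channels)]
queries = [[rng.gauss(0, 1) for _ in range(dim // heads)] for _ in range(heads)]

pooled = multi_head_pool(embeddings, queries)
print(len(pooled))  # same width as a single channel embedding
```

Because the output width depends only on the embedding width and not on the number of channels, a pooling step of this kind can accept recordings with different channel counts, which is consistent with the flexibility the paper claims.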
If this is right
- General time series foundation models can be reused for EEG tasks with only lightweight adaptation.
- Domain-specific pretraining may not be required to reach competitive results on EEG benchmarks.
- The adapter supports different numbers of channels and input configurations for multivariate signals.
- Computational cost for EEG modeling decreases because most parameters come from an already-trained general model.
Where Pith is reading between the lines
- Rich temporal embeddings may contain enough information for pooling to recover channel interactions that would otherwise require explicit spatial layers.
- The same adapter pattern could apply to other multivariate time series where channels have latent spatial or structural meaning.
- Testing on datasets with higher channel counts or different electrode montages would show how far the implicit modeling extends.
Load-bearing premise
That univariate embeddings from a general time series foundation model combined with multi-head pooling can sufficiently capture spatial relationships across EEG channels without explicit spatial modeling or EEG-specific pretraining.
What would settle it
If, on a new clinical EEG classification dataset, the STAMP-adapted general model scored well below a dedicated EEG foundation model, the claim of comparability would be challenged.
Original abstract
Time series foundation models (TSFMs) pretrained on data from multiple domains have shown strong performance on diverse modeling tasks. Various efforts have been made to develop foundation models specific to electroencephalography (EEG) data, which records brain electrical activity as time series. However, no comparative analysis of EEG-specific foundation models (EEGFMs) versus general TSFMs has been performed on EEG-specific tasks. We introduce a novel Spatial-Temporal Adapter with Multi-Head Pooling (STAMP), which leverages univariate embeddings produced by a general TSFM, implicitly models spatial-temporal characteristics of EEG data, and achieves performance comparable to state-of-the-art EEGFMs. A comprehensive analysis is performed on 8 benchmark datasets of clinical tasks using EEG for classification, along with ablation studies. Our proposed adapter is lightweight in trainable parameters and flexible in the inputs it can accommodate, supporting easy modeling of EEG data using TSFMs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces STAMP, a lightweight Spatial-Temporal Adapter with Multi-Head Pooling that processes univariate embeddings produced by feeding each EEG channel independently through a general time series foundation model (TSFM). It claims this adapter implicitly captures spatial-temporal characteristics of EEG signals and achieves performance comparable to state-of-the-art EEG-specific foundation models (EEGFMs) across 8 clinical classification benchmark datasets, supported by ablation studies. The approach is presented as flexible and parameter-efficient for adapting general TSFMs to EEG tasks without domain-specific pretraining.
Significance. If the central empirical claims hold under rigorous validation, the work would be significant for demonstrating that general TSFMs can be adapted to EEG without explicit spatial operators or EEG-specific pretraining, potentially simplifying model development in neuroscience applications. The reported ablation studies and multi-benchmark evaluation provide a foundation for assessing flexibility and efficiency, though the contribution would benefit from more direct evidence of implicit spatial modeling.
major comments (2)
- [Abstract] Abstract and method description: The central claim that STAMP 'implicitly models spatial-temporal characteristics of EEG data' relies on multi-head pooling over per-channel univariate TSFM embeddings recovering spatial channel correlations. However, the TSFM is pretrained on scalar time series (no topographic bias) and standard multi-head pooling is permutation-invariant, so it does not inherently encode electrode geometry, adjacency, or relative positions. Ablation studies isolating this component (e.g., comparing against random channel permutations or explicit spatial baselines) are needed to substantiate implicit modeling rather than statistical co-occurrence.
- [Experimental section] Experimental evaluation: The comparability to SOTA EEGFMs is reported on 8 benchmarks with ablations, but without visible full details on error bars, exact baseline re-implementations, or statistical significance tests, the support for performance parity cannot be fully assessed. This affects the strength of the claim that the adapter matches specialized EEGFMs.
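The permutation-invariance concern in the first major comment can be checked with a small experiment. The sketch below is illustrative only and is not the paper's code: it uses a single-head softmax pooling for brevity, random numbers in place of TSFM embeddings, and a hypothetical `add_positions` helper that adds a per-slot channel embedding before pooling. Reordering the channels leaves the plain pooled vector unchanged up to floating-point error, while the position-aware variant changes.

```python
import math
import random


def attention_pool(embeddings, query):
    """Softmax-weighted average of channel embeddings (single head for brevity)."""
    scores = [sum(a * b for a, b in zip(e, query)) for e in embeddings]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [sum(x / total * e[j] for x, e in zip(exps, embeddings))
            for j in range(len(query))]


def add_positions(embeddings, positions):
    """Hypothetical variant: add a per-slot channel embedding before pooling."""
    return [[a + b for a, b in zip(e, p)] for e, p in zip(embeddings, positions)]


def max_gap(u, v):
    """Largest absolute elementwise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


rng = random.Random(1)
channels, dim = 6, 4  # toy sizes chosen for illustration
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(channels)]
query = [rng.gauss(0, 1) for _ in range(dim)]
positions = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(channels)]

# Fixed, non-trivial reordering of the channels (reverse order).
shuffled = list(reversed(embeddings))

plain_gap = max_gap(attention_pool(embeddings, query),
                    attention_pool(shuffled, query))
pos_gap = max_gap(attention_pool(add_positions(embeddings, positions), query),
                  attention_pool(add_positions(shuffled, positions), query))

print(f"plain pooling gap after reordering: {plain_gap:.2e}")
print(f"position-aware pooling gap after reordering: {pos_gap:.2e}")
```

If a trained model's accuracy is unchanged under channel reordering at test time, the pooling is not using electrode identity or geometry, which is the distinction the referee asks the ablation to expose.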
minor comments (2)
- [Abstract] Clarify the specific general TSFM backbone used and report the exact number of trainable parameters in STAMP for reproducibility.
- [Results] Ensure figures or tables comparing STAMP to baselines include standard deviations or confidence intervals to aid interpretation of results.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments. We address each major comment below and have revised the manuscript accordingly to clarify our claims and strengthen the experimental reporting.
- Referee: [Abstract] Abstract and method description: The central claim that STAMP 'implicitly models spatial-temporal characteristics of EEG data' relies on multi-head pooling over per-channel univariate TSFM embeddings recovering spatial channel correlations. However, the TSFM is pretrained on scalar time series (no topographic bias) and standard multi-head pooling is permutation-invariant, so it does not inherently encode electrode geometry, adjacency, or relative positions. Ablation studies isolating this component (e.g., comparing against random channel permutations or explicit spatial baselines) are needed to substantiate implicit modeling rather than statistical co-occurrence.
  Authors: We appreciate this observation regarding the mechanism of implicit modeling. The STAMP adapter aggregates per-channel TSFM embeddings via multi-head pooling, enabling the model to learn inter-channel statistical dependencies directly from the EEG data without explicit spatial operators or topographic pretraining. This data-driven aggregation captures the spatial correlations necessary for clinical tasks, as demonstrated by competitive performance across benchmarks. To address the request for isolating ablations, we will add a random channel permutation experiment in the revised supplementary material and include a brief comparison to an explicit spatial baseline (e.g., a lightweight graph-based aggregator on electrode positions) to highlight the parameter efficiency of our approach. We have also revised the abstract and method description to more precisely characterize the implicit nature of the spatial-temporal modeling.
  Revision: partial
- Referee: [Experimental section] Experimental evaluation: The comparability to SOTA EEGFMs is reported on 8 benchmarks with ablations, but without visible full details on error bars, exact baseline re-implementations, or statistical significance tests, the support for performance parity cannot be fully assessed. This affects the strength of the claim that the adapter matches specialized EEGFMs.
  Authors: We thank the referee for this feedback on experimental rigor. In the revised manuscript, we will expand the experimental section and appendix to report: standard deviation error bars computed over five independent runs with different random seeds for all methods and datasets; detailed re-implementation protocols for each EEGFM baseline, including exact hyperparameters, data preprocessing, and training settings; and statistical significance results using paired t-tests (with p-values) between STAMP and the EEGFMs on each of the eight benchmarks. These additions will provide a more complete assessment of performance parity.
  Revision: yes
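The reporting promised in the second response (mean and standard deviation over five seeds, plus a paired t-test per benchmark) can be sketched as follows. This is a hedged illustration, not the authors' evaluation code: the helper name `paired_t_statistic` is hypothetical, and the per-seed scores are invented placeholders rather than results from the paper. The sketch stops at the t statistic and its degrees of freedom; a p-value would come from the Student t distribution with n - 1 degrees of freedom, for example via a library routine such as `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Placeholder per-seed scores, invented for illustration only (NOT from the paper).
# Index i in both lists must correspond to the same random seed.
adapter_scores = [0.80, 0.82, 0.79, 0.81, 0.83]
baseline_scores = [0.81, 0.80, 0.80, 0.82, 0.81]

for name, scores in [("adapter", adapter_scores), ("baseline", baseline_scores)]:
    print(f"{name}: mean={statistics.mean(scores):.3f} std={statistics.stdev(scores):.3f}")

t_stat, dof = paired_t_statistic(adapter_scores, baseline_scores)
print(f"paired t = {t_stat:.3f} with {dof} degrees of freedom")
```

Pairing by seed matters here: the test is computed on per-seed differences, so both methods must be run with the same seeds and data splits for the comparison to be valid. With only five pairs the test has limited power, so a non-significant result is weak evidence of parity on its own.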
Circularity Check
No circularity; empirical claims rest on external benchmarks
The paper presents STAMP as a lightweight adapter that feeds per-channel univariate time series into a pretrained general TSFM and applies multi-head pooling to implicitly capture spatial-temporal EEG structure. All performance claims are supported by direct empirical comparisons on 8 external benchmark datasets plus ablation studies, rather than any derivation that reduces by construction to fitted parameters or self-referential definitions. No equations, uniqueness theorems, or self-citations are invoked as load-bearing premises that would collapse the central result to its own inputs. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Univariate embeddings from a general TSFM pretrained on other domains transfer usefully to EEG channels.
invented entities (1)
- STAMP adapter with multi-head pooling: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "The CC-GMLP is made up of L blocks... gT(Z), gS(Z) ... spatial gating unit ... linear mapping W:R^S→R^S"
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "We introduce a novel Spatial-Temporal Adapter with Multi-Head Pooling (STAMP), which leverages univariate embeddings produced by a general TSFM, implicitly models spatial-temporal characteristics"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.