Cross-Subject Intracranial EEG Reconstruction from Scalp Recordings Using Multi-Scale Cross-Attention Transformers
Pith reviewed 2026-05-20 13:18 UTC · model grok-4.3
The pith
A multi-scale cross-attention transformer reconstructs intracranial EEG for unseen subjects from scalp recordings after brief calibration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CAST translates scalp EEG into multi-channel iEEG waveforms for unseen subjects through a temporal encoder that extracts multi-scale representations at three resolutions and a channel-aware decoder calibrated on a few minutes of target-subject data. Leave-one-subject-out validation on two datasets demonstrates that the approach reconstructs cortical signals substantially better than deep subcortical activity, achieving up to r=0.864 in the precentral gyrus and a mean r=0.545 with channel selection, exceeding previous within-subject baselines.
What carries the argument
CAST (Cross-Attention Spatial-Temporal Transformer) uses a multi-scale temporal encoder to capture neural representations at varying resolutions and a channel-aware decoder that adapts via brief calibration to handle varying electrode placements across subjects.
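To make the "channel-aware decoder" idea concrete, here is a minimal sketch of scaled dot-product cross-attention in which one query vector per target iEEG channel attends over encoder tokens. It is an illustration under assumptions, not the paper's implementation: the per-channel query vectors, the toy dimensions, and the omission of learned projection matrices are all choices made here for brevity.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross_attention(queries, keys, values):
    """Each query attends over all key/value tokens (scaled dot-product)."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of value vectors, one output dimension at a time.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Hypothetical toy inputs: two encoder tokens, two target-channel queries.
encoder_tokens = [[1.0, 0.0], [0.0, 1.0]]
channel_queries = [[10.0, 0.0], [0.0, 10.0]]  # assumed to be fitted during calibration
decoded = cross_attention(channel_queries, encoder_tokens, encoder_tokens)
```

In this reading, only the channel queries would need to be fitted for a new subject, while the encoder that produces the tokens stays fixed. Whether CAST divides the parameters this way is not stated in the summary.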
If this is right
- Cortical iEEG near the scalp surface becomes reconstructible for unseen subjects without full patient-specific training data.
- A brief calibration phase adapts the model to new hardware configurations and electrode placements.
- Reconstruction accuracy is highest in highly observable sensorimotor regions such as the precentral gyrus.
- Overall mean correlation reaches 0.545 on viable subjects using channel selection, surpassing within-subject baselines.
- The two-stage strategy reduces the circular dependency on invasive surgery for model training.
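The mean correlation "with channel selection" depends on how channels are chosen, which the summary does not specify. The sketch below shows one plausible rule: keep the channels whose correlation on the calibration segment clears a threshold, then average the held-out correlation over those channels only. The threshold, the channel names, and all numeric values are made-up toy inputs, not results from the paper.

```python
def select_channels(calib_r, min_r=0.2):
    """Keep channels whose calibration-segment correlation is at least min_r."""
    return [ch for ch, r in calib_r.items() if r >= min_r]

def mean_r_over(test_r, channels):
    """Average held-out correlation over the given channels."""
    if not channels:
        raise ValueError("no channels selected")
    return sum(test_r[ch] for ch in channels) / len(channels)

# Toy values for illustration only (hypothetical channel labels).
calib_r = {"precentral_1": 0.71, "precentral_2": 0.55, "hippocampus_1": 0.05}
test_r = {"precentral_1": 0.68, "precentral_2": 0.50, "hippocampus_1": 0.02}

kept = select_channels(calib_r)
mean_selected = mean_r_over(test_r, kept)
mean_all = mean_r_over(test_r, list(test_r))
```

Selecting on calibration data rather than on the held-out segment keeps the reported mean from being inflated by the selection step itself. A mean over selected channels is still not directly comparable to a baseline reported over all channels.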
Where Pith is reading between the lines
- If the calibration remains stable across sessions, this method could support longitudinal monitoring without repeated invasive recordings.
- Extending the approach to real-time applications might enable scalp-based proxies for iEEG in epilepsy surgery planning.
- Testing on datasets with greater anatomical variability or different recording hardware would clarify the limits of the transfer step.
- Combining the model with existing scalp EEG analysis pipelines could broaden access to high-resolution neural insights in non-clinical settings.
Load-bearing premise
Multi-scale representations learned from other subjects transfer sufficiently to a new individual and a few minutes of their data can calibrate the decoder despite large differences in anatomy and electrode positions.
What would settle it
Reconstructed iEEG waveforms whose mean correlation with the recorded signals falls below 0.3 in cortical regions for new subjects after the brief calibration phase, or performance no better than chance in leave-one-subject-out tests on the public datasets.
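The criterion above can be checked mechanically. The following sketch computes a Pearson correlation per channel and compares the mean against the 0.3 threshold. The function names and the toy waveforms are hypothetical, and averaging raw r values (rather than, say, Fisher-transformed values) is an assumption made here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant signal has no defined correlation; treat as 0
    return cov / math.sqrt(vx * vy)

def fails_criterion(recorded, reconstructed, threshold=0.3):
    """Return (failed, mean_r) for lists of per-channel waveforms."""
    rs = [pearson_r(a, b) for a, b in zip(recorded, reconstructed)]
    mean_r = sum(rs) / len(rs)
    return mean_r < threshold, mean_r

# Toy example: one perfectly tracked channel, one inverted channel.
recorded = [[1, 2, 3, 4], [1, 2, 3, 4]]
reconstructed = [[2, 4, 6, 8], [4, 3, 2, 1]]
failed, mean_r = fails_criterion(recorded, reconstructed)
```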
Original abstract
Intracranial EEG (iEEG) provides high-fidelity neural recordings essential for clinical and brain-computer interface applications, but acquiring these signals requires invasive surgery. While recent studies have attempted to estimate iEEG from non-invasive scalp EEG, most rely on patient-specific models, creating a circular dependency: if surgery is required to collect training data, the non-invasive model offers limited practical benefit. In this study, we address the challenge of cross-subject iEEG reconstruction by predicting intracranial signals for unseen patients using models trained on other individuals. We propose CAST (Cross-Attention Spatial-Temporal Transformer), a machine learning framework that translates scalp EEG into multi-channel iEEG waveforms through a two-stage transfer learning strategy. First, a temporal encoder extracts multi-scale neural representations at three different resolutions. Then, because electrode placements vary substantially across patients, a channel-aware decoder is calibrated using only a few minutes of data from the target subject. We evaluated the proposed method using leave-one-subject-out cross-validation on two public datasets comprising 1,282 iEEG channels. Experimental results demonstrate that CAST reconstructs cortical signals located near the scalp surface substantially better than deep subcortical activity. In highly observable sensorimotor regions, the model achieved peak correlations of up to r=0.864 in the precentral gyrus. Furthermore, with a channel selection strategy, CAST obtained a mean correlation of r=0.545 on viable subjects, outperforming previous within-subject baselines. These findings indicate that cortical iEEG signals can be reconstructed for unseen subjects from scalp EEG without extensive patient-specific training, and that only a brief calibration phase is sufficient to adapt the model to new hardware configurations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CAST, a Cross-Attention Spatial-Temporal Transformer for cross-subject reconstruction of intracranial EEG (iEEG) from scalp EEG. It employs a two-stage transfer learning approach: a temporal encoder learns multi-scale neural representations from other subjects via leave-one-subject-out cross-validation, followed by calibration of a channel-aware decoder using only a few minutes of target-subject data to handle varying electrode placements. Evaluated on two public datasets with 1,282 iEEG channels, the method reports mean correlation r=0.545 (with channel selection) and peak r=0.864 in the precentral gyrus, outperforming within-subject baselines particularly for cortical signals near the scalp surface.
Significance. If the central claims hold under rigorous validation, this work has substantial significance for non-invasive brain-computer interfaces and clinical monitoring by reducing reliance on patient-specific invasive training data. Strengths include the use of public datasets, explicit leave-one-subject-out evaluation providing an external test of generalization, and concrete quantitative results (r values) with regional specificity. The two-stage strategy directly targets the practical barrier of extensive per-patient data collection.
major comments (2)
- Abstract: The central claim that multi-scale representations from the temporal encoder remain sufficiently transferable and that a few minutes of target-subject data suffice for channel-aware decoder calibration is load-bearing but unsupported by ablations on calibration duration, quantitative metrics of cross-subject representation alignment, or direct comparisons of adaptation cost versus full retraining. Large inter-subject variations in electrode placement, cortical folding, and skull conductivity make this the weakest link; the reported r=0.545 and r=0.864 values cannot be assessed for robustness without these controls.
- Results section (implied by abstract reporting): No details are provided on statistical tests, variance across LOOCV folds, number of subjects, or data exclusion criteria, despite concrete correlation values and baseline comparisons. This undermines evaluation of whether the outperformance over within-subject baselines is reliable or driven by specific subsets of channels/regions.
minor comments (2)
- Abstract: Clarify the exact definition of 'viable subjects' for the mean r=0.545 result and how channel selection was performed, as this directly affects interpretation of practical utility.
- Notation and presentation: Ensure consistent use of 'r' for Pearson correlation throughout and provide explicit comparison tables against prior within-subject methods with matched metrics.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript. We have addressed each major comment point by point below, agreeing where revisions are warranted to improve clarity and rigor. We believe these changes will strengthen the presentation of our results without altering the core contributions.
- Referee: Abstract: The central claim that multi-scale representations from the temporal encoder remain sufficiently transferable and that a few minutes of target-subject data suffice for channel-aware decoder calibration is load-bearing but unsupported by ablations on calibration duration, quantitative metrics of cross-subject representation alignment, or direct comparisons of adaptation cost versus full retraining. Large inter-subject variations in electrode placement, cortical folding, and skull conductivity make this the weakest link; the reported r=0.545 and r=0.864 values cannot be assessed for robustness without these controls.
  Authors: We agree that additional analyses would better support the transferability and calibration claims. In the revised manuscript, we will incorporate an ablation study on calibration duration (reporting performance for 1, 5, and 10 minutes of target-subject data), quantitative metrics of cross-subject representation alignment (e.g., cosine similarity between encoder outputs across subjects in the LOOCV), and a direct comparison of adaptation cost (data and compute) versus full retraining on the target subject. These additions will allow readers to better evaluate the robustness of the reported correlations.
  Revision: yes
- Referee: Results section (implied by abstract reporting): No details are provided on statistical tests, variance across LOOCV folds, number of subjects, or data exclusion criteria, despite concrete correlation values and baseline comparisons. This undermines evaluation of whether the outperformance over within-subject baselines is reliable or driven by specific subsets of channels/regions.
  Authors: We concur that expanded statistical reporting is necessary for rigorous evaluation. The revised manuscript will explicitly state the number of subjects per dataset, report variance (standard deviation or confidence intervals) across LOOCV folds, include statistical tests (e.g., paired t-tests with p-values) comparing CAST to within-subject baselines, and detail data exclusion criteria. We will also add regional and channel-subset breakdowns to demonstrate that outperformance is not confined to particular subsets.
  Revision: yes
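The fold-level reporting discussed in the responses above could be computed along these lines. This is a sketch using only the Python standard library. The per-fold correlation values are made-up toy numbers, and the helper names are hypothetical.

```python
import math
import statistics

def fold_summary(values):
    """Mean and sample standard deviation across LOOCV folds."""
    return statistics.mean(values), statistics.stdev(values)

def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched per-fold scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Toy per-fold correlations for illustration only, not values from the paper.
cast_r = [0.5, 0.6, 0.7, 0.6]
baseline_r = [0.4, 0.4, 0.5, 0.5]

cast_mean, cast_sd = fold_summary(cast_r)
t_stat, dof = paired_t_statistic(cast_r, baseline_r)
```

A p-value follows from the t distribution with `dof` degrees of freedom; in practice `scipy.stats.ttest_rel` returns both the statistic and the p-value. Pairing is only valid if each fold of the cross-subject model is matched to the same subject's within-subject baseline.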
Circularity Check
No significant circularity in cross-subject iEEG reconstruction via leave-one-subject-out validation
The paper evaluates its CAST framework using leave-one-subject-out cross-validation on public datasets (1,282 iEEG channels), training the temporal encoder on other subjects and testing reconstruction on held-out unseen subjects after brief calibration of the channel-aware decoder. This constitutes an independent external test of generalization rather than fitting parameters to the target test data and then reporting those same fitted values as predictions. No equations or steps reduce by construction to self-definitions, fitted inputs renamed as outputs, or load-bearing self-citations; the multi-scale representations and calibration strategy are methodological choices whose performance is measured against held-out data. The reported metrics (mean r=0.545 with channel selection, peak r=0.864) are empirical outcomes of this validation procedure, not tautological derivations.
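The evaluation protocol described above can be sketched as two small helpers: one that yields leave-one-subject-out folds, and one that splits the held-out subject's recording into a calibration segment and a disjoint test segment. The 200 Hz sampling rate, the 5-minute calibration length, and the choice to take calibration data from the start of the recording are assumptions for illustration; the summary only says "a few minutes".

```python
def loso_folds(subject_ids):
    """Yield (training_subjects, held_out_subject) for each subject in turn."""
    for held_out in subject_ids:
        train = [s for s in subject_ids if s != held_out]
        yield train, held_out

def split_calibration(n_samples, sampling_rate_hz, calib_minutes):
    """Return (calibration_range, test_range) as disjoint sample-index ranges."""
    n_calib = int(calib_minutes * 60 * sampling_rate_hz)
    if n_calib >= n_samples:
        raise ValueError("calibration segment would consume the whole recording")
    return range(0, n_calib), range(n_calib, n_samples)

# Hypothetical subjects and a 10-minute recording at an assumed 200 Hz.
folds = list(loso_folds(["s1", "s2", "s3"]))
calib_idx, test_idx = split_calibration(n_samples=200 * 60 * 10,
                                        sampling_rate_hz=200,
                                        calib_minutes=5)
```

Keeping the calibration and test ranges disjoint is what allows the held-out correlations to count as predictions rather than fitted values.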
Axiom & Free-Parameter Ledger
free parameters (1)
- Transformer encoder and decoder weights
axioms (2)
- domain assumption: Scalp EEG contains recoverable information about cortical iEEG activity
- domain assumption: Multi-scale temporal features learned from one group of subjects transfer to new individuals after short calibration
Lean theorems connected to this paper
- IndisputableMonolith/Foundation (8-tick period and recognition lattices): reality_from_one_distinction / 8-tick periodicity theorems
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: Multi-scale patching at scales s∈{8,16,32} generates N=87 tokens per channel; ... three scales specifically to capture ... high-frequency transients (40 ms resolution), alpha-band rhythms (80 ms), and slower theta/delta oscillations (160 ms).
- IndisputableMonolith/Cost (J-cost and calibration): Jcost functional uniqueness / AlphaCoordinateFixation
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: two-stage transfer learning strategy ... universal encoder ... channel-aware decoder is calibrated using only a few minutes of data
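The numbers in the multi-scale patching passage quoted above are mutually consistent under a simple reading, sketched below. A patch of 8 samples spanning 40 ms implies a 200 Hz sampling rate, and non-overlapping patches of 8, 16, and 32 samples over a 400-sample (2 s) window give 50 + 25 + 12 = 87 tokens. The window length and the rule of dropping the incomplete final patch are assumptions inferred from those figures, not details stated in the passage.

```python
def patch_signal(samples, patch_len):
    """Split a 1-D sequence into non-overlapping patches, dropping any remainder."""
    n_patches = len(samples) // patch_len
    return [samples[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]

def multi_scale_tokens(samples, scales=(8, 16, 32)):
    """Return {scale: list of patches} for each patch length in `scales`."""
    return {s: patch_signal(samples, s) for s in scales}

SAMPLING_RATE_HZ = 200  # assumption: inferred from 8 samples == 40 ms
WINDOW_SAMPLES = 400    # assumption: 2 s window, chosen because it yields 87 tokens

window = [0.0] * WINDOW_SAMPLES  # placeholder single-channel EEG window
tokens = multi_scale_tokens(window)
tokens_per_scale = {s: len(p) for s, p in tokens.items()}
total_tokens = sum(tokens_per_scale.values())
patch_ms = {s: 1000 * s / SAMPLING_RATE_HZ for s in tokens_per_scale}
```

Other window lengths or overlapping strides could also produce 87 tokens, so this is one consistent configuration rather than a reconstruction of the authors' settings.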
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.