pith. machine review for the scientific record. sign in

arxiv: 2605.07212 · v1 · submitted 2026-05-08 · 💻 cs.LG · cs.AI· cs.HC· cs.NE· eess.SP

Recognition: 1 theorem link

· Lean Theorem

Same Brain, Different Prediction: How Preprocessing Choices Undermine EEG Decoding Reliability

Authors on Pith no claims yet

Pith reviewed 2026-05-11 02:33 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.HCcs.NEeess.SP
keywords EEG decodingpreprocessing variabilityprediction stabilitydeep learninguncertainty quantificationbrain-computer interfacesWalsh-Hadamard decomposition
0
0 comments X

The pith

EEG predictions flip for up to 42% of trials when only preprocessing changes

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that deep learning models for EEG decoding produce unstable outputs depending on how the raw brain signals are prepared before analysis. Across six datasets covering four experimental paradigms, switching among common preprocessing steps reverses the model's decision on as many as 42% of individual trials while the underlying recordings stay the same. Standard uncertainty estimates overlook this source of variability because they assume one fixed preprocessing pipeline from the start. The authors supply three practical tools to measure the instability, decompose where it comes from, and reduce it in targeted cases.

Core claim

Preprocessing choices constitute a counterfactual intervention space whose variation produces large trial-level instability in EEG decoding. Up to 42% of predictions flip across six datasets when only these choices are altered. A Walsh-Hadamard decomposition of the 2^7 binary pipeline space shows that sensitivity behaves as near-additive under the chosen intervention design. Preprocessing Uncertainty is defined as a per-trial diagnostic that captures a dimension of instability complementary to model-based confidence. Normalized Adaptive PGI is presented as a graph-structured regularizer that exploits the compositional structure of the interventions.

What carries the argument

The counterfactual intervention space over seven binary preprocessing choices, decomposed by Walsh-Hadamard transform to expose near-additive sensitivity to prediction flips.

If this is right

  • Prediction reliability assessments in EEG decoding must include variation over preprocessing pipelines rather than conditioning on a single fixed one.
  • The near-additive property permits efficient sequential optimization of preprocessing steps without enumerating all 128 combinations.
  • Preprocessing Uncertainty can be computed alongside existing model confidence scores to identify trials that are unstable for reasons outside the model itself.
  • Normalized Adaptive PGI offers one regularization approach whose effectiveness is bounded by clear scope conditions on the intervention graph.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Literature claims of high EEG decoding accuracy may require re-examination when preprocessing pipelines are not ablated.
  • Extending the binary intervention design to continuous parameter ranges or additional datasets would test whether near-additivity holds more broadly.
  • In applied brain-computer interfaces, unaccounted preprocessing instability could produce inconsistent control signals for the same user across sessions.

Load-bearing premise

The seven binary preprocessing choices and the six chosen datasets adequately represent the variability encountered in typical EEG practice and that the observed prediction flips are not artifacts of the particular model architectures or random seeds used.

What would settle it

An EEG dataset and model pair in which changing the preprocessing pipeline produces prediction flips on fewer than 5% of trials would show that the reported instability is not general.

Figures

Figures reproduced from arXiv: 2605.07212 by Dengzhe Hou, Fangzhou Lin, Kazunori D. Yamada, Lingyu Jiang, Zihao Wu, Zirui Li.

Figure 1
Figure 1. Figure 1: Overview. (1) A raw EEG trial is processed through (2) seven binary preprocessing toggles forming a 2 7=128-pipeline Boolean lattice, producing 128 counterfactual views. (3) An EEG decoder (EEGNet/ShallowNet) yields different predictions across pipelines, exposing preprocessing sensitivity. (4) Three diagnostics quantify this instability: CFR (flip rate), PU (per-trial pipeline disagreement), and Walsh-Had… view at source ↗
Figure 2
Figure 2. Figure 2: Preprocessing sensitivity (CFR) inversely correlates with task accuracy across all six [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Preprocessing sensitivity is task-specific. Each cell shows the mean absolute per-pair accuracy change E[|δ|] (%) when toggling one intervention across 64 pipeline pairs. Color scale clipped at 10% for readability; epoch rejection on BCI-IV-2a reaches 20.9%. Signed effects in Appendix [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Signed effect (∆k, %) of each intervention across all six datasets. Positive values indicate that enabling the intervention improves accuracy. Note the sign flips for ASR and bad-channel repair across tasks. where src, dst are pre-computed Hasse edge indices and w contains per-intervention weights. For NA-PGI, add logit-variance normalization (/ z.detach().pow(2).mean()) and adaptive λ modulation. A.6 NA-P… view at source ↗
Figure 5
Figure 5. Figure 5: illustrates the training dynamics of NA-PGI on two contrasting datasets. On SEED-IV (62 channels), the adaptive λ mechanism produces stable convergence: CFR decreases smoothly and the model maintains discriminative accuracy. On BCI-IV-2a (22 channels, seed 44), CFR drops abruptly to near zero mid-training, indicating representational collapse. Early-stopping preserves a non-degenerate checkpoint, but the r… view at source ↗
read the original abstract

Electroencephalography (EEG) is a cornerstone of brain-computer interfaces and clinical neuroscience, yet deep learning models are typically trained and evaluated under a single, unreported preprocessing pipeline. We formalize preprocessing choices as a counterfactual intervention space and show that EEG predictions are surprisingly unstable under this space: across six datasets spanning four paradigms, up to 42% of trial-level predictions flip when only the preprocessing changes, a variability that standard uncertainty methods do not explicitly quantify because they condition on a fixed preprocessing pipeline. We provide three tools to make this instability measurable, decomposable, and reducible. First, a Walsh-Hadamard decomposition of the 2^7 pipeline space reveals that sensitivity is near-additive in practice under the binary intervention design, enabling efficient step-by-step optimization. Second, we introduce Preprocessing Uncertainty (PU), a per-trial diagnostic that captures a dimension of instability complementary to model-based confidence. Third, we study Normalized Adaptive PGI (NA-PGI), a graph-structured regularizer that exploits the compositional structure of preprocessing interventions as one mitigation strategy with clear scope conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that EEG decoding models exhibit substantial instability to preprocessing pipeline choices: across six datasets spanning four paradigms, up to 42% of trial-level predictions flip when only the preprocessing pipeline is altered. It formalizes preprocessing decisions as a 2^7 counterfactual intervention space, demonstrates that sensitivity is near-additive via Walsh-Hadamard decomposition, and introduces Preprocessing Uncertainty (PU) as a per-trial diagnostic complementary to model confidence plus Normalized Adaptive PGI (NA-PGI) as a graph-structured regularizer to reduce the instability.

Significance. If the central empirical result holds under broader controls, the work is significant because it identifies a previously unquantified source of variability that standard uncertainty quantification methods (which fix the pipeline) systematically omit. The near-additive decomposition and the two new metrics provide concrete, decomposable tools that could improve reliability in BCI and clinical EEG applications; the paper also supplies a clear scope for when the mitigation strategy applies.

major comments (2)
  1. [Results] Results section (the 42% headline figure): the central claim that prediction flips reflect inherent preprocessing instability requires evidence that the rate is not an artifact of the specific deep architectures and single random seeds used. No ablations over alternative models or multiple initializations are reported, leaving open the possibility that more robust architectures would exhibit substantially lower flip rates and thereby weaken the assertion that standard uncertainty methods miss a complementary dimension.
  2. [Methods] Methods / Experimental Setup: the abstract and main text provide concrete numbers across six datasets yet omit exact model architectures, the precise statistical tests used to establish significance of the flip rates, and the trial exclusion criteria. These omissions make the 42% claim difficult to evaluate or reproduce and constitute a load-bearing gap for the reliability conclusion.
minor comments (1)
  1. [Methods] The notation for the seven binary preprocessing choices and the precise definition of the intervention space should be stated explicitly in a dedicated table or subsection to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review. The comments highlight important aspects of reproducibility and robustness that we address below. We have revised the manuscript to incorporate additional details and experiments where feasible.

read point-by-point responses
  1. Referee: [Results] Results section (the 42% headline figure): the central claim that prediction flips reflect inherent preprocessing instability requires evidence that the rate is not an artifact of the specific deep architectures and single random seeds used. No ablations over alternative models or multiple initializations are reported, leaving open the possibility that more robust architectures would exhibit substantially lower flip rates and thereby weaken the assertion that standard uncertainty methods miss a complementary dimension.

    Authors: We agree that ablations across architectures and seeds would further strengthen the central claim. Our experiments employed standard EEG decoding models (e.g., variants of EEGNet) across six datasets and four paradigms, with the observed flip rates reaching 42% even under these conditions. This suggests the instability arises primarily from the preprocessing intervention space rather than model idiosyncrasies. To address the concern directly, the revised manuscript will include results from multiple random seeds and one additional architecture, reporting the distribution of flip rates to demonstrate that the phenomenon persists and remains complementary to standard uncertainty quantification. revision: yes

  2. Referee: [Methods] Methods / Experimental Setup: the abstract and main text provide concrete numbers across six datasets yet omit exact model architectures, the precise statistical tests used to establish significance of the flip rates, and the trial exclusion criteria. These omissions make the 42% claim difficult to evaluate or reproduce and constitute a load-bearing gap for the reliability conclusion.

    Authors: We apologize for these omissions in the main text. The exact model architectures, hyperparameters, statistical tests (binomial tests for significance of flip rates against a null of no change), and trial exclusion criteria (amplitude-based artifact rejection thresholds) were provided in the supplementary materials. In the revised version, we will expand the Methods section with a dedicated experimental setup subsection containing these details, along with pseudocode for the pipeline enumeration and statistical analysis, to ensure full reproducibility. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core results consist of empirical measurements of trial-level prediction flips (up to 42%) under enumerated preprocessing interventions across six datasets, together with the introduction of a Walsh-Hadamard decomposition of the 2^7 space, the per-trial Preprocessing Uncertainty (PU) diagnostic, and the Normalized Adaptive PGI regularizer. These quantities are defined directly from the intervention design and observed model outputs without any reduction to fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations; the decomposition is a standard linear-algebraic tool applied to the binary pipeline space, and the new metrics are constructed to be complementary to existing uncertainty measures rather than tautological with the flip statistics themselves. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

Based on abstract only; the central claim rests on treating preprocessing steps as independent binary interventions and on trial-level prediction flips as the relevant stability metric. No fitted numerical parameters are mentioned.

axioms (2)
  • domain assumption Preprocessing choices can be represented as a 2^7 binary intervention space
    Invoked to enable Walsh-Hadamard decomposition of sensitivity
  • domain assumption Trial-level prediction flips are a meaningful measure of model instability
    Central to the 42% claim and to the definition of Preprocessing Uncertainty
invented entities (2)
  • Preprocessing Uncertainty (PU) no independent evidence
    purpose: Per-trial diagnostic capturing instability complementary to model confidence
    Newly defined metric whose independent evidence is not provided in the abstract
  • Normalized Adaptive PGI (NA-PGI) no independent evidence
    purpose: Graph-structured regularizer exploiting compositional structure of preprocessing interventions
    Newly proposed mitigation whose scope conditions are stated but not validated in the abstract

pith-pipeline@v0.9.0 · 5518 in / 1500 out tokens · 51504 ms · 2026-05-11T02:33:24.678358+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · 1 internal anchor

  1. [1]

    Deep learning-based electroencephalography analysis: a systematic review

    Yannick Roy, Hubert Banville, Isabela Albuquerque, Alexandre Gramfort, Tiago H Falk, and Jocelyn Faubert. Deep learning-based electroencephalography analysis: a systematic review. Journal of Neural Engineering, 16(5):051001, 2019

  2. [2]

    Deep learning for electroencephalo- gram (EEG) classification tasks: a review.Journal of Neural Engineering, 16(3):031001, 2019

    Alexander Craik, Yongtian He, and Jose L Contreras-Vidal. Deep learning for electroencephalo- gram (EEG) classification tasks: a review.Journal of Neural Engineering, 16(3):031001, 2019

  3. [3]

    BENDR: Using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data

    Demetres Kostas, St´ephane Aroca-Ouellette, and Frank Rudzicz. BENDR: Using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data. Frontiers in Human Neuroscience, 15:653659, 2021

  4. [4]

    Large brain model for learning generic representations with tremendous EEG data in BCI

    Wei-Bang Jiang, Li-Ming Zhao, and Bao-Liang Lu. Large brain model for learning generic representations with tremendous EEG data in BCI. InInternational Conference on Learning Representations, 2024

  5. [5]

    BIOT: Biosignal transformer for cross- data learning in the wild.Advances in Neural Information Processing Systems, 36, 2023

    Chaoqi Yang, M Brandon Westover, and Jimeng Sun. BIOT: Biosignal transformer for cross- data learning in the wild.Advances in Neural Information Processing Systems, 36, 2023

  6. [6]

    CSBrain: A cross-scale spatiotemporal brain foundation model for EEG decoding

    Yuchen Zhou et al. CSBrain: A cross-scale spatiotemporal brain foundation model for EEG decoding. InAdvances in Neural Information Processing Systems (NeurIPS), 2025

  7. [7]

    NeurIPT: Foundation model for neural interfaces

    Zitao Fang et al. NeurIPT: Foundation model for neural interfaces. InAdvances in Neural Information Processing Systems (NeurIPS), 2025

  8. [8]

    REVE: A foundation model for EEG — adapting to any setup with large-scale pretraining on 25,000 subjects

    Yassine El Ouahidi et al. REVE: A foundation model for EEG — adapting to any setup with large-scale pretraining on 25,000 subjects. InAdvances in Neural Information Processing Systems (NeurIPS), 2025

  9. [9]

    Dropout as a Bayesian approximation: Representing model uncertainty in deep learning

    Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. InInternational Conference on Machine Learning (ICML), pages 1050–1059, 2016

  10. [10]

    Simple and scalable predictive uncertainty estimation using deep ensembles

    Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. InNeurIPS, 2017

  11. [11]

    The PREP pipeline: standardized preprocessing for large-scale EEG analysis.Frontiers in Neuroinformatics, 9:16, 2015

    Nima Bigdely-Shamlo, Tim Mullen, Christian Kothe, Kyung-Min Su, and Kay A Robbins. The PREP pipeline: standardized preprocessing for large-scale EEG analysis.Frontiers in Neuroinformatics, 9:16, 2015

  12. [12]

    DISCOVER-EEG: an open, fully automated EEG pipeline for biomarker discovery in clinical neuroscience.Scientific Data, 10:613, 2023

    Cristina Gil ´Avila et al. DISCOVER-EEG: an open, fully automated EEG pipeline for biomarker discovery in clinical neuroscience.Scientific Data, 10:613, 2023

  13. [13]

    Introducing RELAX: an automated pre-processing pipeline for cleaning EEG data.Clinical Neurophysiology, 149:178–201, 2023

    Neil W Bailey et al. Introducing RELAX: an automated pre-processing pipeline for cleaning EEG data.Clinical Neurophysiology, 149:178–201, 2023

  14. [14]

    Standardizing EEG preprocessing for cross-site integration—the CLEAN pipeline.NeuroImage, 328:121812, 2026

    Adriana B¨ottcher et al. Standardizing EEG preprocessing for cross-site integration—the CLEAN pipeline.NeuroImage, 328:121812, 2026

  15. [15]

    The more, the better? Evaluating the role of EEG preprocessing for deep learning applications.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 33:1061–1070, 2025

    Federico Del Pup, Andrea Zanola, Louis Fabrice Tshimanga, Alessandra Bertoldo, and Man- fredo Atzori. The more, the better? Evaluating the role of EEG preprocessing for deep learning applications.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 33:1061–1070, 2025

  16. [16]

    How EEG preprocessing shapes decoding performance.Communications Biology, 8:1039, 2025

    Roman Kessler et al. How EEG preprocessing shapes decoding performance.Communications Biology, 8:1039, 2025

  17. [17]

    Shortcut learning in deep neural networks.Nature Machine Intelligence, 2(11):665–673, 2020

    Robert Geirhos, J¨orn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, and Felix A Wichmann. Shortcut learning in deep neural networks.Nature Machine Intelligence, 2(11):665–673, 2020. 10

  18. [18]

    Un- derspecification presents challenges for credibility in modern machine learning.Journal of Machine Learning Research, 23(226):1–61, 2022

    Alexander D’Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D Hoffman, et al. Un- derspecification presents challenges for credibility in modern machine learning.Journal of Machine Learning Research, 23(226):1–61, 2022

  19. [19]

    Understanding deep learning (still) requires rethinking generalization.Communications of the ACM, 64(3): 107–115, 2021

    Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning (still) requires rethinking generalization.Communications of the ACM, 64(3): 107–115, 2021

  20. [20]

    Increasing trans- parency through a multiverse analysis.Perspectives on Psychological Science, 11(5):702–712, 2016

    Sara Steegen, Francis Tuerlinckx, Andrew Gelman, and Wolf Vanpaemel. Increasing trans- parency through a multiverse analysis.Perspectives on Psychological Science, 11(5):702–712, 2016

  21. [21]

    Variability in the analysis of a single neuroimaging dataset by many teams.Nature, 582:84–88, 2020

    Rotem Botvinik-Nezer et al. Variability in the analysis of a single neuroimaging dataset by many teams.Nature, 582:84–88, 2020

  22. [22]

    Lost in a large EEG multiverse? Comparing sampling approaches for representative pipeline selection.Journal of Neuroscience Methods, 424:110564, 2025

    Cassie Ann Short et al. Lost in a large EEG multiverse? Comparing sampling approaches for representative pipeline selection.Journal of Neuroscience Methods, 424:110564, 2025

  23. [23]

    Pipeline-invariant representation learning for neuroimaging.arXiv preprint arXiv:2208.12909, 2022

    Xinhui Li et al. Pipeline-invariant representation learning for neuroimaging.arXiv preprint arXiv:2208.12909, 2022

  24. [24]

    Towards a general-purpose foundation model for functional MRI analysis

    Cheng Wang, Yu Jiang, Zhihao Peng, Chenxin Li, Chang-bae Bang, Lin Zhao, Wanyi Fu, Jinglei Lv, Jorge Sepulcre, Carl Yang, Lifang He, Tianming Liu, Xue-Jun Kong, Quanzheng Li, Daniel S Barron, Anqi Qiu, Randy Hirschtick, Byung-Hoon Kim, Hongbin Han, Xiang Li, and Yixuan Yuan. Towards a general-purpose foundation model for functional MRI analysis. Nature Bi...

  25. [25]

    Invariant Risk Minimization

    Martin Arjovsky, L´eon Bottou, Ishaan Gulrajani, and David Lopez-Paz. Invariant risk mini- mization.arXiv preprint arXiv:1907.02893, 2019

  26. [26]

    Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization

    Shiori Sagawa, Pang Wei Koh, Tatsunori B Hashimoto, and Percy Liang. Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization. InICLR, 2020

  27. [27]

    Domain-adversarial training of neural networks.Journal of Machine Learning Research, 17(59):1–35, 2016

    Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Franc ¸ois Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks.Journal of Machine Learning Research, 17(59):1–35, 2016

  28. [28]

    Domain general- ization: A survey.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4): 4396–4415, 2022

    Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tianyue Xiang, and Chen Change Loy. Domain general- ization: A survey.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4): 4396–4415, 2022

  29. [29]

    Chun-Yen Chang, Sheng-Hsiou Hsu, Luca Pion-Tonachini, and Tzyy-Ping Jung. Evaluation of artifact subspace reconstruction for automatic artifact components removal in multi-channel EEG recordings.IEEE Transactions on Biomedical Engineering, 67(4):1114–1121, 2020

  30. [30]

    Autoreject: Automated artifact rejection for MEG and EEG data.NeuroImage, 159:417–429, 2017

    Mainak Jas, Denis A Engemann, Yousra Bekhti, Francesca Raimondo, and Alexandre Gramfort. Autoreject: Automated artifact rejection for MEG and EEG data.NeuroImage, 159:417–429, 2017

  31. [31]

    EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces.Journal of Neural Engineering, 15(5):056013, 2018

    Vernon J Lawhern, Amelia J Solon, Nicholas R Waytowich, Stephen M Gordon, Chou P Hung, and Brent J Lance. EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces.Journal of Neural Engineering, 15(5):056013, 2018

  32. [32]

    BCI competition 2008 – Graz data set A.Institute for Knowledge Discovery, Graz University of Technology, 2008

    Clemens Brunner, Robert Leeb, Gernot M¨uller-Putz, Alois Schl¨ogl, and Gert Pfurtscheller. BCI competition 2008 – Graz data set A.Institute for Knowledge Discovery, Graz University of Technology, 2008

  33. [33]

    MOABB: trustworthy algorithm benchmarking for BCIs.Journal of Neural Engineering, 15(6):066011, 2018

    Vinay Jayaram and Alexandre Barachant. MOABB: trustworthy algorithm benchmarking for BCIs.Journal of Neural Engineering, 15(6):066011, 2018

  34. [34]

    BCI2000: a general-purpose brain-computer interface (BCI) system.IEEE Transactions on Biomedical Engineering, 51(6):1034–1043, 2004

    Gerwin Schalk, Dennis J McFarland, Thilo Hinterberger, Niels Birbaumer, and Jonathan R Wol- paw. BCI2000: a general-purpose brain-computer interface (BCI) system.IEEE Transactions on Biomedical Engineering, 51(6):1034–1043, 2004. 11

  35. [35]

    Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG.IEEE Transactions on Biomedical Engineering, 47(9):1185–1194, 2000

    Bob Kemp, Aeilko H Zwinderman, Bert Tuk, Hilbert A C Kamphuisen, and Josefien J L Oberye. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG.IEEE Transactions on Biomedical Engineering, 47(9):1185–1194, 2000

  36. [36]

    PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals.Circulation, 101(23):e215–e220, 2000

    Ary L Goldberger et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals.Circulation, 101(23):e215–e220, 2000

  37. [37]

    Emotionmeter: A multimodal framework for recognizing human emotions.IEEE Transactions on Cybernetics, 49(3):1110–1122, 2019

    Wei-Long Zheng, Wei Liu, Yifei Lu, Bao-Liang Lu, and Andrzej Cichocki. Emotionmeter: A multimodal framework for recognizing human emotions.IEEE Transactions on Cybernetics, 49(3):1110–1122, 2019

  38. [38]

    MEG and EEG data analysis with MNE-Python.Frontiers in Neuroscience, 7:267, 2013

    Alexandre Gramfort, Martin Luessi, Eric Larson, Denis A Engemann, Daniel Strohmeier, Christian Brodbeck, Lauri Parkkonen, and Matti S H¨am¨al¨ainen. MEG and EEG data analysis with MNE-Python.Frontiers in Neuroscience, 7:267, 2013

  39. [39]

    K. G. Beauchamp.Walsh Functions and Their Applications. Academic Press, 1975

  40. [40]

    A re- view of uncertainty quantification in deep learning: Techniques, applications and challenges

    Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, et al. A re- view of uncertainty quantification in deep learning: Techniques, applications and challenges. Information Fusion, 76:243–297, 2021

  41. [41]

    In search of lost domain generalization

    Ishaan Gulrajani and David Lopez-Paz. In search of lost domain generalization. InICLR, 2021

  42. [42]

    Deep CORAL: Correlation alignment for deep domain adap- tation

    Baochen Sun and Kate Saenko. Deep CORAL: Correlation alignment for deep domain adap- tation. InEuropean Conference on Computer Vision (ECCV) Workshops, pages 443–450, 2016

  43. [43]

    EEG is better left alone.Scientific Reports, 13:2372, 2023

    Arnaud Delorme. EEG is better left alone.Scientific Reports, 13:2372, 2023

  44. [44]

    Deep learning with convolutional neural networks for EEG decoding and visualization.Human Brain Mapping, 38(11):5391–5420, 2017

    Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard, and Tonio Ball. Deep learning with convolutional neural networks for EEG decoding and visualization.Human Brain Mapping, 38(11):5391–5420, 2017

  45. [45]

    The BCI competition III: validating alternative approaches to actual BCI problems.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):153–159, 2006

    Benjamin Blankertz, K-R Muller, Dean J Krusienski, Gerwin Schalk, Jonathan R Wolpaw, et al. The BCI competition III: validating alternative approaches to actual BCI problems.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):153–159, 2006. 12 A Additional Results A.1 Spearman Rank Correlations of Intervention Importance Table 10: Sp...