
arxiv: 2605.17562 · v1 · pith:F6YTX6ALnew · submitted 2026-05-17 · 💻 cs.LG · cs.AI · cs.HC

Beyond Accuracy: Robustness, Interpretability and Expressiveness of EEG Foundation Models

Pith reviewed 2026-05-20 13:45 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.HC
keywords EEG foundation models · robustness · interpretability · expressiveness · channel dropout · layer-wise relevance propagation · block-wise probing

The pith

No single EEG foundation model dominates every robustness test; under perturbation, the models still attend to the correct brain regions but decode corrupted signals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper evaluates six EEG foundation models across eight datasets on dimensions beyond clean accuracy. It applies test-time perturbations such as additive noise, random channel dropout, region-based dropout, and region-specific noise to measure robustness, finding that different models fail under different conditions with no overall winner. Using Attention-Aware Layer-Wise Relevance Propagation, the work shows that relevance concentrates on task-appropriate brain regions matching known neurophysiology, yet the maps stay spatially stable even as predictions degrade under perturbation. Block-wise probing reveals that early blocks already contain task-related information while late blocks get repurposed during fine-tuning, and that previously reported weak head-only performance stems mainly from pooling operations rather than deficient pre-trained representations.

Core claim

No single EEG-FM dominates all failure modes. Models concentrate relevance on task-appropriate brain regions consistent with known neurophysiology but decode corrupted content when inputs are perturbed. Late blocks are repurposed during fine-tuning while early blocks already hold task-related information. Poor head-only performance is largely explained by pooling rather than low-quality pre-trained representations.
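The block-wise probing idea can be illustrated with a small sketch: fit a simple classifier on frozen features from each block and report balanced accuracy (the mean of per-class recall). Everything here is an assumption made for illustration. The features are synthetic Gaussians rather than EEG-FM activations, a nearest-class-mean classifier stands in for the paper's linear probe to keep the example dependency-free, a single train/test split replaces cross-validation, and `probe_blocks` and `fake_block` are hypothetical names.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def fit_class_means(X, y):
    """Stand-in probe (assumption): one mean feature vector per class."""
    means = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict(means, X):
    """Assign each sample to the class with the nearest mean."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(means, key=lambda c: dist2(x, means[c])) for x in X]


def probe_blocks(features_per_block, y, n_train):
    """Probe each block's frozen features; return one score per block."""
    scores = []
    for X in features_per_block:
        means = fit_class_means(X[:n_train], y[:n_train])
        pred = predict(means, X[n_train:])
        scores.append(balanced_accuracy(y[n_train:], pred))
    return scores


def fake_block(labels, separation, rng, dim=8):
    """Synthetic features (assumption): class means differ by `separation`."""
    return [[rng.gauss(separation * label, 1.0) for _ in range(dim)]
            for label in labels]


rng = random.Random(0)
y = [i % 2 for i in range(200)]
# Three hypothetical "blocks" with increasing class separation.
blocks = [fake_block(y, s, rng) for s in (0.0, 0.5, 2.0)]
scores = probe_blocks(blocks, y, n_train=100)
for depth, score in enumerate(scores, start=1):
    print(f"block {depth}: balanced accuracy = {score:.2f}")
```

With real models, each entry of `features_per_block` would hold the pooled activations of one transformer block for the same set of trials, computed once for the pre-trained and once for the fine-tuned checkpoint.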

What carries the argument

Test-time perturbations (additive noise, channel dropout, region-specific noise) combined with Attention-Aware Layer-Wise Relevance Propagation and block-wise probing to dissect robustness, interpretability, and expressiveness beyond accuracy.
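A minimal sketch of two of these perturbations may help fix ideas: additive white noise at a target SNR, and channel dropout in both the zero-padded and the channel-removed variants that the abstract contrasts. The paper's exact protocol is not given here, so the details are assumptions: noise power is set per channel from that channel's mean power, the trial is a plain list of channels, and `add_white_noise`, `drop_channels` and the `REGIONS` map are hypothetical names.

```python
import math
import random


def add_white_noise(signal, snr_db, rng):
    """Add Gaussian white noise so that signal power / noise power matches snr_db.

    Assumption: power is the mean squared amplitude of this one channel.
    """
    power = sum(x * x for x in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def drop_channels(trial, dropped, mode="zero"):
    """Drop channels from a trial given as a list of channels.

    mode="zero" keeps each dropped channel's slot and fills it with zeros;
    mode="remove" deletes the dropped channels, shrinking the channel axis.
    """
    dropped = set(dropped)
    if mode == "zero":
        return [[0.0] * len(ch) if i in dropped else list(ch)
                for i, ch in enumerate(trial)]
    if mode == "remove":
        return [list(ch) for i, ch in enumerate(trial) if i not in dropped]
    raise ValueError(f"unknown mode: {mode!r}")


rng = random.Random(0)
# Toy trial: 4 channels x 256 samples of phase-shifted sinusoids.
trial = [[math.sin(0.1 * t + c) for t in range(256)] for c in range(4)]

noisy = [add_white_noise(ch, snr_db=0.0, rng=rng) for ch in trial]

# Random channel dropout: pick 2 of the 4 channels.
dropped = rng.sample(range(len(trial)), k=2)
zeroed = drop_channels(trial, dropped, mode="zero")
removed = drop_channels(trial, dropped, mode="remove")

# Region-based dropout reuses the same function with a fixed index set.
REGIONS = {"hypothetical_central": [1, 2]}  # illustrative mapping only
region_zeroed = drop_channels(trial, REGIONS["hypothetical_central"])

print(len(zeroed), len(removed))
```

The `mode="remove"` variant only applies to models that accept a variable number of input channels; fixed-montage models need the zero-padded form.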

If this is right

  • Different EEG foundation models will suit different expected data corruption patterns in practice.
  • Models attend to appropriate regions but require mechanisms to better decode signals when content is corrupted.
  • Fine-tuning mainly adapts later layers, so targeted adaptation of early layers may be unnecessary.
  • Preserving token-level embeddings rather than relying on pooled heads improves expressiveness for downstream tasks.
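The difference between a pooled head and preserved token-level embeddings can be seen in a toy sketch. It compares mean pooling, which averages token embeddings into one vector, with flatten pooling, which concatenates them. The example is an illustration under assumptions, not the paper's implementation: the token vectors are made up, and `mean_pool` and `flatten_pool` are hypothetical helper names.

```python
def mean_pool(tokens):
    """Average token embeddings: (n_tokens, dim) -> (dim,)."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def flatten_pool(tokens):
    """Concatenate token embeddings: (n_tokens, dim) -> (n_tokens * dim,)."""
    return [x for tok in tokens for x in tok]


# Two toy trials holding the same token vectors in a different order,
# e.g. the same patterns appearing on different channels or time patches.
trial_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
trial_b = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

print(mean_pool(trial_a) == mean_pool(trial_b))        # True
print(flatten_pool(trial_a) == flatten_pool(trial_b))  # False
```

Mean pooling maps both trials to the same vector, so a classification head cannot tell them apart, while the flattened representation keeps which token carried which pattern. Whether this is the mechanism behind the reported head-only gap is not established by this sketch.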

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Real-world EEG deployments would benefit from robustness benchmarks that include additional artifact types such as motion or electrode drift.
  • Stable attribution maps under performance drops suggest a need for training objectives that enforce content fidelity in addition to spatial focus.
  • Early-layer features could support efficient transfer to new tasks with limited labeled EEG data.

Load-bearing premise

The chosen perturbations and eight datasets capture the main real-world failure modes that matter for EEG foundation model deployment.

What would settle it

A single EEG foundation model that consistently outperforms the others across additive noise, channel dropout, and region-specific noise on all eight datasets would falsify the no-single-dominance result.
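The falsification test above amounts to a dominance check over a table of scores indexed by model and perturbation. The sketch below is one way to express it, assuming that "outperforms" means strictly higher balanced accuracy in every condition. The model names and numbers are hypothetical placeholders, not results from the paper, and `dominant_models` is an illustrative helper.

```python
def dominant_models(scores):
    """Return models that strictly beat every other model in every condition.

    `scores` maps model name -> {condition name -> score}.
    """
    models = list(scores)
    conditions = list(next(iter(scores.values())))
    winners = []
    for m in models:
        beats_all = all(
            scores[m][c] > scores[other][c]
            for c in conditions
            for other in models
            if other != m
        )
        if beats_all:
            winners.append(m)
    return winners


# Hypothetical balanced accuracies under three perturbation families.
scores = {
    "model_a": {"additive_noise": 0.80, "channel_dropout": 0.55, "region_noise": 0.70},
    "model_b": {"additive_noise": 0.70, "channel_dropout": 0.75, "region_noise": 0.68},
    "model_c": {"additive_noise": 0.65, "channel_dropout": 0.60, "region_noise": 0.72},
}

print(dominant_models(scores))  # [] because each model loses somewhere
```

In a real check the table would also be indexed by dataset, and the comparison would need to account for variance across folds before declaring a winner.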

Figures

Figures reproduced from arXiv: 2605.17562 by Konstantinos Barmpas, Maryam Alimardani, Stefanos Zafeiriou, Urban Širca.

Figure 1: Classification balanced accuracy of full fine-tuned and head-only fine-tuned foundation
Figure 2: Average (across-tasks, excluding Sleep) balanced accuracy robustness evaluation under
Figure 3: Class-averaged attribution topographic maps. Columns (Models): EEGNet, CBraMod,
Figure 4: Class-averaged attribution topographic maps under perturbation (averaged over folds) for
Figure 5: Linear probing balanced accuracy and std (ten folds) by relative block depth for REVE
Figure 6: Per-block attention topographic maps in Movement (High-Gamma) task. Top row of each
Figure 7: Pooling strategy comparison: Mean vs flatten pooling for LaBraM, NeuroRVQ and REVE
Figure 8: White noise examples of two channels at five example SNR levels applied at test-time to a
Figure 9: Region-based channel dropout (control and primary regions) highlighted with red for five
Figure 10: Robustness evaluation under four perturbation types (full fine-tuned models): (a) Additive
Figure 11: Robustness evaluation under four perturbation types (head-only fine-tuned models): (a)
Figure 12: Per-benchmark degradation under true random channel dropout for the models that accept
Figure 13: Per-benchmark degradation under true primary region dropout for the models that accept
Figure 14: Class-averaged Gradient × Input topographic maps. Columns (Models): EEGNet, CBraMod, LaBraM and REVE. Rows (Benchmarks): Movement (High-Gamma), Motor-Imagery (OpenBMI-MI), ERP (OpenBMI-ERP), Eyes (PhysioNet). All models focus on task-relevant regions. G.2 GradCAM We apply GradCAM [Selvaraju et al., 2017] as a supplementary attribution method using the pytorch-grad-cam library [Gildenblat and contributors,…
Figure 15: Per-class attribution maps (AttnLRP / GradCAM) across all benchmarks for each model.
Figure 16: REVE per-block attention topographic maps across datasets. Top: fine-tuned. Bottom:
Figure 17: NeuroRVQ per-block attention topographic maps across datasets. Top: full fine-tuned.
Figure 18: REVE per-block attention under perturbation across datasets. Blocks ordered from input
Figure 19: REVE per-block attention under perturbation across datasets. Blocks ordered from input
Figure 20: NeuroRVQ per-block attention under perturbation across datasets. Blocks ordered from
Figure 21: NeuroRVQ per-block attention under perturbation across datasets. Blocks ordered from
Figure 22: Linear probing balanced accuracy (mean pooling) by relative block depth. Pre-trained
Figure 23: Linear probing balanced accuracy (flatten pooling) by relative block depth. Pre-trained
Figure 24: Depth truncation. For each k, we discard blocks beyond k, attach a flatten-pooling head, and fine-tune end-to-end. Balanced accuracy ± std is shown as a function of k. NeuroRVQ saturates at block 6 of 12, whereas REVE plateaus by block 14. H Pooling Strategy As stated in Section 4.6, the performance gap in the head-only setting appears to correlate with pooling strategy: REVE flattens all tokens before cl…

Original abstract

EEG foundation models (EEG-FMs) have been evaluated predominantly on clean, in-distribution accuracy, leaving their robustness, interpretability and representational quality largely unexamined. This study addresses these gaps by benchmarking six EEG-FMs against a baseline deep learning model across eight datasets. Beyond clean accuracy, we conduct three layers of analysis: (i) Robustness: we apply test-time perturbations including additive noise, random and region-based channel dropout and region-specific noise injection. Our analyses show that no single model dominates all failure modes. The most noise-robust model is among the most fragile under channel dropout and much of the dropout fragility disappears when channels are removed rather than zero-padded. (ii) Interpretability: we present the first application of Attention-Aware Layer-Wise Relevance Propagation (AttnLRP) to EEG-FMs and show that models broadly concentrate relevance on task-appropriate brain regions consistent with known neurophysiology. However, attribution maps remain spatially stable under perturbation while predictions degrade, suggesting that the models attend to the correct brain regions but decode corrupted content. (iii) Expressiveness: With block-wise probing we show that late blocks are repurposed during fine-tuning, while early blocks already hold task-related information. Furthermore, we demonstrate that the poor head-only performance previously attributed to low-quality pre-trained representations is largely explained by pooling and that EEG-FMs possess sufficient representational capacity when their token-level embeddings are preserved. Together, these findings provide the first systematic assessment of robustness, interpretability and expressiveness for EEG-FMs and highlight critical considerations for their development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript benchmarks six EEG foundation models (EEG-FMs) against a baseline deep learning model across eight datasets. Beyond clean accuracy, it evaluates robustness via test-time perturbations (additive noise, random/region channel dropout, region-specific noise), applies Attention-Aware Layer-Wise Relevance Propagation (AttnLRP) to show that models concentrate relevance on task-appropriate brain regions consistent with neurophysiology (yet attribution maps remain stable while predictions degrade under perturbation), and uses block-wise probing to demonstrate that late blocks are repurposed during fine-tuning while early blocks already contain task-related information. It further claims that poor head-only performance is largely explained by pooling operations rather than low-quality pre-trained representations, concluding that no single EEG-FM dominates all failure modes.

Significance. If the results hold, this provides the first systematic multi-faceted evaluation of EEG-FMs beyond accuracy, offering practical guidance on model selection for noisy real-world EEG data and clarifying how pre-trained representations are utilized during fine-tuning. The alignment of attribution maps with known neurophysiology and the clarification of pooling effects are particularly useful strengths.

major comments (1)
  1. [Robustness analysis] Robustness section (abstract and methods): The central claim that 'no single model dominates all failure modes' and the reported trade-offs (noise-robust models being dropout-fragile) rest exclusively on the four synthetic perturbation families (additive noise, random/region channel dropout, region-specific noise) applied at test time. These are not calibrated against empirical distributions of real EEG artifacts (e.g., muscle artifact, electrode displacement, baseline wander, or non-stationary drift), nor is there an ablation adding physiologically motivated corruptions. This makes it unclear whether the observed rankings and fragility patterns generalize or are artifacts of the chosen test distribution.
minor comments (1)
  1. [Abstract] The abstract states that 'much of the dropout fragility disappears when channels are removed rather than zero-padded' but provides no quantitative deltas, exact experimental protocol, or reference to the relevant table/figure for this comparison.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for their insightful feedback on the robustness evaluation. We agree that the synthetic perturbations provide a controlled but limited perspective and have revised the manuscript to better acknowledge this limitation while preserving the value of the controlled analysis.

Point-by-point responses
  1. Referee: [Robustness analysis] Robustness section (abstract and methods): The central claim that 'no single model dominates all failure modes' and the reported trade-offs (noise-robust models being dropout-fragile) rest exclusively on the four synthetic perturbation families (additive noise, random/region channel dropout, region-specific noise) applied at test time. These are not calibrated against empirical distributions of real EEG artifacts (e.g., muscle artifact, electrode displacement, baseline wander, or non-stationary drift), nor is there an ablation adding physiologically motivated corruptions. This makes it unclear whether the observed rankings and fragility patterns generalize or are artifacts of the chosen test distribution.

    Authors: We appreciate this observation and agree that our robustness results are based on synthetic perturbations chosen for their reproducibility and ability to isolate specific failure modes (additive noise for signal degradation, channel dropout for electrode loss). These do not fully replicate the statistical properties of real EEG artifacts. In the revised manuscript we have added a new paragraph in the Discussion section that explicitly compares the chosen perturbations to common real-world artifacts, qualifies the scope of the 'no single model dominates' claim to the tested conditions, and identifies the need for future work with empirically calibrated corruptions. We have also updated the abstract and conclusions to reflect this scope. No new experiments were performed, as that would exceed the scope of a major revision.

    revision: partial

standing simulated objections not resolved
  • Whether the observed model rankings and fragility patterns would persist under real-world EEG artifact distributions cannot be answered without new experiments using calibrated physiological corruptions.

Circularity Check

0 steps flagged

Empirical benchmarking study with no circular derivations or self-referential reductions

full rationale

The paper conducts an empirical benchmarking analysis of EEG foundation models across robustness (test-time perturbations on eight datasets), interpretability (AttnLRP attributions), and expressiveness (block-wise probing). No mathematical derivations, predictions, or first-principles results are presented that reduce to fitted parameters, self-definitions, or self-citation chains. All claims rest on external datasets, baselines, and experimental outcomes rather than internal redefinitions or ansatzes smuggled via prior work. This is a standard self-contained empirical study, warranting a score of 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical study with no explicit mathematical derivations or new postulated entities; free parameters are limited to standard model hyperparameters and dataset choices not detailed in the abstract.

axioms (1)
  • domain assumption: The selected perturbations adequately simulate real-world EEG artifacts and channel failures.
    Invoked when interpreting robustness results as relevant to practical applications.




Reference graph

Works this paper leans on

65 extracted references · 65 canonical work pages

  1. [1] Kim, Hong-Kyung; Williamson, John; Lee, Min-Ho; Kwon, O-Yeon; Lee, Seong-Whan; Fazli, Siamac; Kim, Yong-Jeong; Lee, Young-Eun. Supporting data for …

  2. [2] Lee, Min-Ho; Kwon, O-Yeon; Kim, Yong-Jeong; Kim, Hong-Kyung; Lee, Young-Eun; Williamson, John; Fazli, Siamac; Lee, Seong-Whan. 2019.

  3. [3] Pavlov, Yuri G.; Kasanov, Dauren; Kosachenko, Alexandra I.; Kotyusov, Alexander I.

  4. [4] Kemp, B.; Zwinderman, A. H.; Tuk, B.; Kamphuisen, H. A. C.; Oberye, J. J. L. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the … 2000.

  5. [5] PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation.

  6. [6] Motor imagery and direct brain–computer communication. Proceedings of the IEEE.

  7. [7] Weibang Jiang; Liming Zhao; Bao-liang Lu. Large Brain Model for Learning Generic Representations with Tremendous … 2024.

  8. [8] Jiquan Wang; Sha Zhao; Zhiling Luo; Yangxuan Zhou; Haiteng Jiang; Shijian Li; Tao Li; Gang Pan. 2025.

  9. [9] Yassine El Ouahidi; Jonathan Lys; Philipp Th. The Thirty-ninth Annual Conference on Neural Information Processing Systems.

  10. [10] Barmpas, Konstantinos; Lee, Na; Koliousis, Andreas; Panagakis, Yannis; Adamos, Dimitrios A.; Laskaris, Nikolaos; Zafeiriou, Stefanos. CoRR.

  11. [11] Chaoqi Yang; M Brandon Westover; Jimeng Sun. 2023.

  12. [12] Qinfan Xiao; Ziyun Cui; Chi Zhang; SiQi Chen; Wen Wu; Andrew Thwaites; Alexandra Woolgar; Bowen Zhou; Chao Zhang. BrainOmni: A Brain Foundation Model for Unified … 2026.

  13. [13] Lukas Rauch; René Heinrich; Houtan Ghaffari; Lukas Miklautz; Ilyass Moummad; Bernhard Sick; Christoph Scholz. CoRR, 2025.

  14. [14] Joseph Y. Cheng; Hanlin Goh; Kaan Dogrusoz; Oncel Tuzel; Erdrin Azemi. CoRR, 2020.

  15. [15] Schirrmeister, Robin Tibor; Springenberg, Jost Tobias; Fiederer, Lukas Dominique Josef; Glasstetter, Martin; Eggensperger, Katharina; Tangermann, Michael; Hutter, Frank; Burgard, Wolfram; Ball, Tonio. Deep learning with convolutional neural networks for … 2017.

  16. [16] Santamaría-Vázquez, Eduardo; Martínez-Cagigal, Víctor; Vaquerizo-Villar, Fernando; Hornero, Roberto. EEG-Inception: A Novel Deep Convolutional Neural Network for Assistive ERP-Based Brain-Computer Interfaces.

  17. [17] Barmpas, Konstantinos; Panagakis, Yannis; Adamos, Dimitrios A; Laskaris, Nikolaos; Zafeiriou, Stefanos. BrainWave-Scattering Net: a lightweight network for EEG-based motor imagery recognition. Journal of Neural Engineering.

  18. [18] Song, Yonghao; Zheng, Qingqing; Liu, Bingchuan; Gao, Xiaorong. EEG Conformer: Convolutional Transformer for EEG Decoding and Visualization.

  19. [19] Lawhern, V. J.; Solon, A. J.; Waytowich, N. R.; Gordon, S. M.; Hung, C. P.; Lance, B. J. 2018.

  20. [20] Širca, Urban; Brulec, Lovro; Alimardani, Maryam. Parameter-Efficient Fine-Tuning of EEG Foundation Models for Plug-and-Play Motor Imagery BCIs.

  21. [21] Kuruppu, Gayal; Wagh, Neeraj; Kremen, Vaclav; Varatharajah, Yogatheesan. Journal of Neural Engineering. doi:10.1088/1741-2552/ae4455.

  22. [22] Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-Tuning. Forty-second International Conference on Machine Learning.

  23. [23] Lee, Na; Bakas, Stylianos; Barmpas, Konstantinos; Panagakis, Yannis; Adamos, Dimitrios; Laskaris, Nikolaos; Zafeiriou, Stefanos. Assessing the Capabilities of Large Brainwave Foundation Models.

  24. [24] A comprehensive review of biosignal foundation models. 2025.

  25. [25] A causal perspective on brainwave modeling for brain–computer interfaces. Journal of Neural Engineering.

  26. [26] EEG-FM-Bench: A Comprehensive Benchmark for the Systematic Evaluation of EEG Foundation Models. 2026.

  27. [27] EEG Foundation Models: Progresses, Benchmarking, and Open Problems. 2026.

  28. [28] Schalk, Gerwin; McFarland, Dennis J.; Hinterberger, Thilo; Birbaumer, Niels; Wolpaw, Jonathan R. 2004.

  29. [29] Brain–computer interfaces for communication and control. Communications of the ACM.

  30. [30] Lotte, Fabien; Bougrain, Laurent; Cichocki, Andrzej; Clerc, Maureen; Congedo, Marco; Rakotomamonjy, Alain; Yger, Fabrice. A review of classification algorithms for … 2018.

  31. [31] Brain–computer interfaces: A review. Sensors.

  32. [32] Saha, S.; Baumert, M. Intra- and inter-subject variability in … 2020.

  33. [33] Ravindran, A. Sujatha; Contreras-Vidal, Jose. An empirical comparison of deep learning explainability approaches for … 2023.

  34. [34] Xinliang Zhou; Chenyu Liu; Liming Zhai; Ziyu Jia; Cuntai Guan; Yang Liu. CoRR, 2023.

  35. [35] Ramprasaath R. Selvaraju; Michael Cogswell; Abhishek Das; Ramakrishna Vedantam; Devi Parikh; Dhruv Batra. 2017. doi:10.1109/ICCV.2017.74.

  36. [36] Avanti Shrikumar; Peyton Greenside; Anshul Kundaje. ICML, 2017.

  37. [37] Smilkov, Daniel; Thorat, Nikhil; Kim, Been; Viégas, Fernanda; Wattenberg, Martin. SmoothGrad: removing noise by adding noise.

  38. [38] Towards better understanding of gradient-based attribution methods for Deep Neural Networks. International Conference on Learning Representations.

  39. [39] Bach, … On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. PLOS ONE. doi:10.1371/journal.pone.0130140.

  40. [40] Achtibat, Reduan; Hatefi, Sayed Mohammad Vakilzadeh; Dreyer, Maximilian; Jain, Aakriti; Wiegand, Thomas; Lapuschkin, Sebastian; Samek, Wojciech. 2024.

  41. [41] Quantifying Attention Flow in Transformers. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

  42. [42] Generic attention-model explainability for interpreting bi-modal and encoder-decoder transformers. 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

  43. [43] Linguistic Knowledge and Transferability of Contextual Representations. Proceedings of the 2019 Conference of the North … June 2019.

  44. [44] The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).

  45. [45] Sanity Checks for Saliency Maps. Advances in Neural Information Processing Systems 31 (NeurIPS).

  46. [46]

  47. [47] Merchant, Amil; Rahimtoroghi, Elahe; Pavlick, Ellie; Tenney, Ian. What Happens To BERT Embeddings During Fine-tuning? Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2020. doi:10.18653/v1/2020.blackboxnlp-1.4.

  48. [48] Ali Bashashati; Mehrdad Fatourechi; Rabab K Ward; Gary E Birch. doi:10.1088/1741-2560/4/2/r03.

  49. [49] Brain signal analysis advances in neuroelectric and neuromagnetic methods. Cambridge, Mass, MIT Press.

  50. [50] Brain-computer interfacing: an introduction.

  51. [51] Brain–Computer Interfaces Handbook: Technological and Theoretical Advances, CRC Press.

  52. [52] McFarland, D.J.; Anderson, C.W.; Muller, K.-R.; Schlogl, A.; Krusienski, D.J. BCI meeting 2005-workshop on BCI signal processing: feature extraction and translation. doi:10.1109/TNSRE.2006.875637.

  53. [53] Brown, Tom; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared D; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel; Wu, Jeffrey; Winte... Language Models are Few-Shot Learners.

  54. [54] Deep learning. Nature.

  55. [55] LLaMA: Open and Efficient Foundation Language Models. 2023.

  56. [56] Arc2Face: A Foundation Model for ID-Consistent Human Faces. Proceedings of the European Conference on Computer Vision (ECCV).

  57. [57] Barmpas, Konstantinos; Panagakis, Yannis; Bakas, Stylianos; Adamos, Dimitrios A.; Laskaris, Nikolaos; Zafeiriou, Stefanos. Improving Generalization of CNN-Based Motor-Imagery EEG Decoders via Dynamic Convolutions.

  58. [58] Decoupled Weight Decay Regularization. International Conference on Learning Representations.

  59. [59] Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (ICLR).

  60. [60] PyTorch library for CAM methods. 2021.

  61. [61] Khan, W.; Khan, M. S.; Qasem, S. N.; Ghaban, W.; Saeed, F.; Hanif, M.; Ahmad, J. An explainable and efficient deep learning framework for … 2025.

  62. [62] Almadhor, A.; Ojo, S.; Nathaniel, T. I.; Alsubai, S.; Alharthi, A.; Hejaili, A. A.; Sampedro, G. A. An interpretable … 2025.

  63. [63] ERD/ERS patterns reflecting sensorimotor activation and deactivation. 2006. doi:10.1016/S0079-6123(06)59014-4.

  64. [64] EEG differences between eyes-closed and eyes-open resting conditions. 2007. doi:10.1016/j.clinph.2007.07.028.

  65. [65] Werth, Esther; Achermann, Peter; Borb. Brain topography of the human sleep … NeuroReport, 1996.