
arxiv: 2605.22858 · v1 · pith:HRGKTORSnew · submitted 2026-05-19 · 📡 eess.SP · cs.LG

Classification of IED-free EEG Responses for Assisted Epilepsy Diagnosis

Pith reviewed 2026-05-25 06:27 UTC · model grok-4.3

classification 📡 eess.SP cs.LG
keywords epilepsy classification · IED-free EEG · intermittent photic stimulation · hyperventilation · machine learning ensemble · multi-domain features · TUH corpus

The pith

A stacked ensemble on multi-domain features classifies epilepsy from IED-free EEG during photic stimulation and hyperventilation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a reproducible pipeline that extracts temporal, spectral, wavelet, and connectivity features from EEG recorded during intermittent photic stimulation and hyperventilation, then combines them in a stacked ensemble to separate epileptic from non-epileptic subjects even when no interictal epileptiform discharges are present. On the TUH corpus the method reaches 97.8 percent AUC on IED-free resting-state data and 94.1 percent AUC on IED-free IPS data; on an independent clinical cohort IPS yields 79.4 percent AUC. The work matters because routine EEGs frequently lack visible discharges, leaving diagnosis uncertain, and an objective, stimulation-based classifier could reduce reliance on subjective visual review. The results indicate that stimulation-evoked responses carry discriminative physiological information beyond the presence or absence of IEDs.

Core claim

The central claim is that stimulation procedures, especially intermittent photic stimulation, evoke activity patterns that a multi-domain feature set and stacked ensemble can exploit to classify epilepsy in the complete absence of interictal epileptiform discharges, achieving leave-one-subject-out AUC values up to 94.1 percent on the TUH corpus and 79.4 percent on an external clinical cohort.

What carries the argument

The stacked ensemble that integrates complementary temporal, spectral, wavelet, and connectivity feature sets extracted from stimulation segments.
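
As a rough illustration of what a stacked ensemble over feature domains can look like, the sketch below trains one base classifier per domain and lets a meta-learner combine their out-of-fold probabilities. It is not the authors' implementation. The synthetic data, the column layout in `DOMAINS`, the use of logistic regression at both levels, the inner `cv=5`, and the helper name `build_stack` are all assumptions made for illustration, since the paper's feature definitions and stacking construction are not described on this page. It relies on scikit-learn and NumPy.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column layout: 5 features per domain in one feature matrix.
DOMAINS = {
    "temporal": list(range(0, 5)),
    "spectral": list(range(5, 10)),
    "wavelet": list(range(10, 15)),
    "connectivity": list(range(15, 20)),
}


def build_stack():
    """One base learner per feature domain, combined by a meta-learner."""
    base = []
    for name, cols in DOMAINS.items():
        # Keep only this domain's columns; all other columns are dropped.
        select = ColumnTransformer([("cols", "passthrough", cols)])
        base.append((name, make_pipeline(select, StandardScaler(), LogisticRegression())))
    # cv=5: the meta-learner is fitted on out-of-fold base predictions.
    return StackingClassifier(estimators=base, final_estimator=LogisticRegression(), cv=5)


# Synthetic stand-in data: 60 subjects, one feature vector each.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 20)) + 0.8 * y[:, None]  # class 1 shifted in every feature

model = build_stack().fit(X, y)
proba = model.predict_proba(X)[:, 1]  # training-set probabilities, shown for shape only
print(proba.shape, list(model.named_estimators_))
```

In a real evaluation the whole `build_stack()` estimator would be refitted inside each held-out-subject fold, so that neither the base learners nor the meta-learner ever see the test subject.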

If this is right

  • Intermittent photic stimulation supplies the strongest discriminative signal on the EMC cohort; on IED-free TUH data, resting-state EEG (97.8 percent AUC) scores higher than IPS (94.1 percent AUC).
  • Combining multiple feature domains via ensembling increases robustness across subjects and recording conditions.
  • Hyperventilation yields usable discrimination once subjects are stratified by whether they show a physiological response.
  • Leave-one-subject-out performance on two independent cohorts supports the reproducibility of the multi-domain pipeline.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the same feature families prove stable across sites, the pipeline could be adapted to other provocation techniques such as sleep deprivation.
  • The performance drop from TUH to the clinical cohort implies that site-specific factors will need explicit handling before routine clinical deployment.
  • Future work that explicitly models medication load or arousal state could test whether the current separation truly isolates epilepsy-related physiology.

Load-bearing premise

The feature differences observed during photic stimulation and hyperventilation arise from epilepsy-specific physiology rather than from group differences in age, medication, arousal level, or recording artifacts.

What would settle it

A controlled replication on age- and medication-matched groups that finds classification performance dropping to chance levels would falsify the claim that the pipeline detects epilepsy-specific responses.
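
One way such a matched replication could be prepared is sketched below: a greedy nearest-age pairing of patients with controls. Everything here is hypothetical, including the function name `greedy_age_match`, the 5-year tolerance `max_gap`, and the toy ages. Medication matching is not handled, and a real study would more likely use an established matching procedure.

```python
def greedy_age_match(patients, controls, max_gap=5):
    """Pair each patient with the closest-aged unused control within max_gap years."""
    unused = dict(controls)  # control id -> age; entries are removed once paired
    pairs = []
    for pid, age in sorted(patients.items(), key=lambda kv: kv[1]):
        if not unused:
            break
        cid = min(unused, key=lambda c: abs(unused[c] - age))
        if abs(unused[cid] - age) <= max_gap:
            pairs.append((pid, cid))
            del unused[cid]
    return pairs


# Toy ages, invented for illustration.
patients = {"p1": 23, "p2": 41, "p3": 67}
controls = {"c1": 25, "c2": 39, "c3": 30, "c4": 80}
pairs = greedy_age_match(patients, controls)
print(pairs)  # p3 stays unmatched: the nearest remaining control is 13 years older
```

Classification would then be rerun on the matched subjects only, and the resulting AUC compared with the unmatched figure.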

Figures

Figures reproduced from arXiv: 2605.22858 by Giacomo Zanardini, Justin Dauwels, Paul van der Kleij, Robert van den Berg, Ryan Moesman.

Figure 1: TUH dataset: ROC curves for best single set and ensemble models.
Figure 2: IED-free TUH dataset: ROC curves for best single set and ensemble

Original abstract

Diagnosing epilepsy is challenging when routine EEGs lack interictal epileptiform discharges (IEDs). Intermittent photic stimulation (IPS) and hyperventilation (HV) can increase diagnostic yield, but their interpretation is subjective. We propose a reproducible pipeline that classifies EEG recordings acquired during stimulation procedures, using machine-learning features spanning temporal, spectral, wavelet, and connectivity domains, and a stacked ensemble to combine complementary feature sets. Performance is evaluated with leave-one-subject-out (LOSO) cross-validation on the TUH Epilepsy Corpus and a clinical Erasmus MC (EMC) cohort, including IED-free analyses on TUH. On TUH, ensembles achieve up to 97.8% AUC / 93.1% BAC on IED-free resting-state EEG and 94.1% AUC / 86.8% BAC on IED-free IPS. On EMC, IPS provides the strongest discrimination (79.4% AUC / 73.9% BAC), while HV performance benefits from stratifying subjects by responsiveness. These results indicate that stimulation-evoked activity, particularly IPS, contains meaningful discriminative information for IED-free epilepsy classification and that multi-domain ensembling improves robustness.
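
For readers less familiar with the two metrics quoted in the abstract, the following minimal sketch shows how AUC and balanced accuracy (BAC) can be computed from subject-level scores. The toy labels and scores are invented, and the 0.5 decision threshold is an assumption; the paper's thresholding is not stated on this page.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels, scores, threshold=0.5):
    """Mean of sensitivity and specificity at a fixed threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and l == 1 for p, l in zip(pred, labels))
    tn = sum(p == 0 and l == 0 for p, l in zip(pred, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


# Toy example: 3 epilepsy subjects (1) and 3 controls (0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc_value = auc(labels, scores)
bac_value = balanced_accuracy(labels, scores)
print(auc_value, bac_value)
```

AUC is threshold-free, whereas BAC depends on the chosen cutoff, which is one reason the two numbers in the abstract can diverge.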

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a reproducible ML pipeline that extracts multi-domain features (temporal, spectral, wavelet, connectivity) from IED-free EEG segments recorded during resting-state, intermittent photic stimulation (IPS), and hyperventilation (HV), then applies a stacked ensemble for binary epilepsy classification. Performance is assessed via leave-one-subject-out (LOSO) cross-validation on the TUH Epilepsy Corpus (reporting up to 97.8% AUC / 93.1% BAC on resting-state and 94.1% AUC / 86.8% BAC on IPS) and an external Erasmus MC (EMC) cohort (79.4% AUC / 73.9% BAC on IPS), with a note that HV benefits from responsiveness stratification.

Significance. If the discriminative signal is shown to be epilepsy-specific rather than driven by cohort-level differences, the work would offer a concrete, objective aid for the clinically important subset of patients whose routine EEGs lack IEDs. LOSO validation, multi-domain ensembling, and two-site evaluation are methodological strengths that increase credibility over single-cohort, non-held-out designs.

major comments (2)
  1. [Methods and Results] Methods and Results sections: No information is supplied on age, medication, arousal, or artifact matching between epilepsy and control groups, nor on any post-hoc exclusion criteria. Because the headline claim is that IPS/HV/resting-state features capture epilepsy physiology (rather than group differences), the absence of these controls is load-bearing; the reported LOSO AUCs are equally consistent with either interpretation.
  2. [Results] Results: The manuscript reports ensemble AUC/BAC values but supplies neither ablation results comparing the stacked ensemble to its constituent feature sets, nor any statistical tests (e.g., DeLong or permutation tests) on the AUC differences. Without these, it is impossible to determine whether the claimed improvement from multi-domain ensembling is robust or merely reflects the particular hyperparameter choices.
minor comments (1)
  1. [Abstract and Methods] Abstract and Methods: Feature definitions, exact hyperparameter search ranges, and the precise construction of the stacked ensemble are not described at a level that would allow immediate reproduction.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback emphasizing the need for greater transparency on cohort characteristics and ensemble validation. We address each major comment below and will revise the manuscript to incorporate feasible improvements while noting limitations where data constraints prevent full resolution.

  1. Referee: [Methods and Results] Methods and Results sections: No information is supplied on age, medication, arousal, or artifact matching between epilepsy and control groups, nor on any post-hoc exclusion criteria. Because the headline claim is that IPS/HV/resting-state features capture epilepsy physiology (rather than group differences), the absence of these controls is load-bearing; the reported LOSO AUCs are equally consistent with either interpretation.

    Authors: We agree that explicit reporting of demographic and clinical matching strengthens claims of epilepsy-specific signal. The TUH Epilepsy Corpus is a public dataset; we followed its published subject selection criteria with no additional post-hoc exclusions beyond the IED-free filter described in the Methods. For the EMC cohort we will add a supplementary table with available age, sex, and medication data. Standard artifact rejection (ICA and amplitude thresholding) was applied uniformly, but explicit per-group arousal or artifact matching was not performed. LOSO cross-validation and the external EMC validation provide some protection against subject-level confounds, yet we acknowledge this does not fully substitute for matched cohorts. We will expand the Limitations section accordingly. revision: partial

  2. Referee: [Results] Results: The manuscript reports ensemble AUC/BAC values but supplies neither ablation results comparing the stacked ensemble to its constituent feature sets, nor any statistical tests (e.g., DeLong or permutation tests) on the AUC differences. Without these, it is impossible to determine whether the claimed improvement from multi-domain ensembling is robust or merely reflects the particular hyperparameter choices.

    Authors: We concur that ablation experiments and formal statistical comparisons are required to substantiate the benefit of the stacked ensemble. In the revised manuscript we will report performance for each individual feature domain (temporal, spectral, wavelet, connectivity) alongside the ensemble, and we will add DeLong tests for AUC differences together with permutation tests (1000 iterations) to evaluate whether observed gains exceed chance. These results will appear in an expanded Results section with new tables and a supplementary figure. revision: yes
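
The kind of permutation test mentioned in this response could be sketched as follows: a paired test on the AUC difference between two models scored on the same subjects, where each subject's two scores are randomly swapped. This is only one possible design and not necessarily what the authors intend. The function name `paired_permutation_test`, the synthetic scores, and the assumption that both models output scores on a shared scale are illustrative; the 1000 iterations follow the figure quoted above.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_permutation_test(labels, scores_a, scores_b, n_iter=1000, seed=0):
    """Two-sided p-value for AUC(a) - AUC(b) under random per-subject score swaps."""
    rng = random.Random(seed)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    extreme = 0
    for _ in range(n_iter):
        perm_a, perm_b = [], []
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:  # swap the two models' scores for this subject
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        diff = auc(labels, perm_a) - auc(labels, perm_b)
        extreme += abs(diff) >= abs(observed)
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_iter + 1)


# Synthetic scores: the "ensemble" separates the classes more strongly.
data_rng = random.Random(1)
labels = [1] * 20 + [0] * 20
ensemble = [data_rng.gauss(1.0 * l, 1.0) for l in labels]
single = [data_rng.gauss(0.5 * l, 1.0) for l in labels]
observed, p_value = paired_permutation_test(labels, ensemble, single)
print(round(observed, 3), round(p_value, 3))
```

A DeLong test answers the same question analytically; the permutation version trades computation for fewer distributional assumptions.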

standing simulated objections not resolved
  • Detailed per-subject arousal state and quantitative artifact burden metrics are not recorded in the TUH or EMC dataset metadata, preventing retrospective group matching on these variables.

Circularity Check

0 steps flagged

No circularity: empirical ML pipeline with held-out LOSO evaluation


The paper describes a standard machine-learning classification pipeline (multi-domain features + stacked ensemble) whose performance metrics are obtained via leave-one-subject-out cross-validation on held-out subjects from two cohorts. No equations, fitted parameters, or self-citations are presented that would make the reported AUC/BAC values equivalent to the input labels by construction. The derivation chain consists of feature extraction followed by independent validation and contains no self-definitional, fitted-input-renamed-as-prediction, or load-bearing self-citation steps.
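
The held-out property this check relies on can be made concrete with a small sketch of leave-one-subject-out splitting. The subject IDs and the helper name `loso_splits` are hypothetical; the point is only that every segment from the test subject is excluded from training.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test


# Toy layout: one entry per EEG segment, several segments per subject.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(loso_splits(subject_ids))
for subject, train, test in folds:
    # No subject may appear on both sides of a split.
    assert {subject_ids[i] for i in train}.isdisjoint({subject_ids[i] for i in test})
    print(subject, train, test)
```

The guarantee only holds if scaling, feature selection, and hyperparameter tuning are also fitted on the training indices of each fold rather than on the full dataset.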

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Abstract supplies no explicit free parameters, mathematical axioms, or new entities; the central premise is a domain assumption about the information content of stimulation EEG.

free parameters (1)
  • feature extraction and ensemble hyperparameters
    Typical ML pipeline requires many tuned values for filters, wavelet scales, connectivity thresholds, and stacking weights; none are listed.
axioms (1)
  • domain assumption Stimulation-evoked EEG contains epilepsy-discriminative information independent of visible IEDs
    This premise justifies the entire classification task and is stated implicitly by the focus on IED-free analyses.

pith-pipeline@v0.9.0 · 5749 in / 1262 out tokens · 55341 ms · 2026-05-25T06:27:36.854220+00:00 · methodology

