Suitability of an inter-burst detection method for grading hypoxic-ischemic encephalopathy in newborn EEG

Geraldine B. Boylan; Gordon Lightbody; John M. O'Toole; Saif Nalband; Sumit A. Raurale

arXiv: 1907.02877 · v1 · pith:NZTHIKECnew · submitted 2019-07-05 · eess.SP · cs.LG

Sumit A. Raurale, Saif Nalband, Geraldine B. Boylan, Gordon Lightbody, John M. O'Toole

Pith reviewed 2026-05-25 02:03 UTC · model grok-4.3

keywords: inter-burst detection · hypoxic-ischemic encephalopathy · newborn EEG · EEG grading · machine learning · term infants · preterm EEG method · inter-burst interval

The pith

An inter-burst detection method from preterm infants works without change on term newborn EEG to classify grades of hypoxic-ischemic encephalopathy at 77.8 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether a detector for low-voltage inter-burst segments, built on EEG from babies born before 30 weeks, can be used directly on recordings from full-term infants. It extracts simple measures of how these segments are spaced in time, such as their overall percentage and longest run, then feeds them into a multi-layer perceptron to sort the EEG into four injury grades. The single best feature reaches 59.3 percent accuracy on held-out data, while the two-feature combination reaches 77.8 percent and performs comparably to methods that use many more features. A reader would care because EEG grading after birth asphyxia guides clinical decisions, and a transferable detector would let the same tool serve different gestational ages.

Core claim

The inter-burst detection method developed for preterm infants born at less than 30 weeks of gestational age accurately identifies inter-bursts in term infants. Two features of the temporal organisation of the inter-bursts, the percentage of inter-bursts and the maximum inter-burst duration, are combined in a multi-layer perceptron and classify four grades of hypoxic-ischemic encephalopathy with a testing accuracy of 77.8 percent, similar to existing multi-feature approaches.
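
The two features named above can be illustrated with a minimal sketch. It assumes the detector outputs a per-sample binary mask in which 1 marks an inter-burst sample; that mask format, the function name `ibi_features`, and the `fs` parameter are assumptions made for illustration, not details taken from the paper.

```python
from itertools import groupby


def ibi_features(mask, fs):
    """Summarise a per-sample inter-burst mask (1 = inter-burst, 0 = burst).

    mask: sequence of 0/1 values, one per sample (hypothetical format).
    fs:   sampling frequency of the mask in Hz (hypothetical parameter).
    Returns (percentage of inter-burst samples, longest inter-burst run in seconds).
    """
    if not mask:
        raise ValueError("mask must contain at least one sample")
    ibi_percentage = 100.0 * sum(mask) / len(mask)
    # Length, in samples, of every consecutive run of inter-burst samples.
    run_lengths = [sum(1 for _ in run) for value, run in groupby(mask) if value == 1]
    max_ibi_seconds = max(run_lengths, default=0) / fs
    return ibi_percentage, max_ibi_seconds


# Toy mask at 1 Hz: inter-burst runs of 2 s and 3 s within 10 s of signal.
example_mask = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
pct, longest = ibi_features(example_mask, fs=1.0)
print(pct, longest)
```

In a grading pipeline of the kind described, one such feature pair per recording would form the input vector to the classifier.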

What carries the argument

The inter-burst detection method (developed for preterm infants) together with the multi-layer perceptron that classifies injury grades from temporal features of the detected intervals.

If this is right

The preterm inter-burst detector transfers directly to term EEG without retraining or parameter changes.
Percentage of inter-bursts alone already separates the four injury grades at 59.3 percent accuracy.
Adding maximum inter-burst duration to the classifier raises performance to 77.8 percent on test data.
The resulting accuracy matches that of more elaborate multi-feature classifiers already in use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The transfer success suggests that the statistical structure of inter-bursts is sufficiently stable across the term-preterm boundary to allow reuse of the same detector.
If the same features continue to work on larger or more varied term cohorts, automated grading pipelines could drop the requirement for separate age-specific detectors.
The approach opens a route to test whether the same two features also track recovery trajectories or predict later neurodevelopmental scores.

Load-bearing premise

The detector trained only on very preterm EEG marks the same inter-burst segments in term EEG that a human expert would mark.

What would settle it

Manual expert annotation of inter-bursts in a new set of term EEG recordings followed by direct comparison of detection overlap or boundary error against the preterm method's output.
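
One way such a comparison could be scored is sketched below, assuming both the detector output and the expert annotation are available as equal-length per-sample 0/1 masks. The function name `mask_agreement` and the choice of sensitivity, specificity, and Cohen's kappa are illustrative assumptions; the paper does not report this analysis.

```python
def mask_agreement(detected, expert):
    """Compare two equal-length 0/1 masks (1 = inter-burst) sample by sample."""
    if len(detected) != len(expert) or not expert:
        raise ValueError("masks must be non-empty and of equal length")
    pairs = list(zip(detected, expert))
    tp = sum(1 for d, e in pairs if d == 1 and e == 1)
    tn = sum(1 for d, e in pairs if d == 0 and e == 0)
    fp = sum(1 for d, e in pairs if d == 1 and e == 0)
    fn = sum(1 for d, e in pairs if d == 0 and e == 1)
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    observed = (tp + tn) / n
    # Agreement expected by chance, from each rater's marginal rates.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}


# Toy masks only; real masks would come from the detector and an expert reviewer.
detected_mask = [1, 1, 0, 0, 1, 0, 0, 0]
expert_mask = [1, 1, 1, 0, 0, 0, 0, 0]
scores = mask_agreement(detected_mask, expert_mask)
print(scores)
```

Sample-level agreement is only one option; a boundary-error measure on inter-burst onsets and offsets would need event matching that this sketch does not attempt.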

Figures

Figures reproduced from arXiv: 1907.02877 by Geraldine B. Boylan, Gordon Lightbody, John M. O'Toole, Saif Nalband, Sumit A. Raurale.

**Figure 1.** Proposed HIE grading system for term infant EEG. (Majority voting: multiplexing data into individual binary masks …

**Figure 2.** Detecting inter-burst intervals (IBI) for 4 different …

**Figure 3.** Distribution of the IBI features separated by EEG …

Original abstract

Electroencephalography (EEG) is an important clinical tool for grading injury caused by lack of oxygen or blood to the brain during birth. Characteristics of low-voltage waveforms, known as inter-bursts, are related to different grades of injury. This study assesses the suitability of an existing inter-burst detection method, developed from preterm infants born <30 weeks of gestational age, to detect inter-bursts in term infants. Different features from the temporal organisation of the inter-bursts are combined using a multi-layer perceptron (MLP) machine learning algorithm to classify four grades of injury in the EEG. We find that the best performing feature, percentage of inter-bursts, has an accuracy of 59.3%. Combining this with the maximum duration of inter-bursts in the MLP produces a testing accuracy of 77.8%, with similar performance to existing multi-feature methods. These results validate the use of the preterm detection method in term EEG and show how simple measures of the inter-burst interval can be used to classify different grades of injury.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper assesses the suitability of an inter-burst detection method originally developed for preterm infants (<30 weeks gestational age) when applied without modification to term infants for grading hypoxic-ischemic encephalopathy (HIE) severity in EEG. Temporal features derived from the detected inter-bursts (e.g., percentage of inter-bursts, maximum duration) are fed to a multi-layer perceptron classifier to distinguish four injury grades, with reported testing accuracies of 59.3% for the single best feature and 77.8% for the combined feature set.

Significance. If the detector transfer holds and the classification results are reliable, the work indicates that straightforward inter-burst interval statistics can achieve performance comparable to more complex multi-feature methods for HIE grading, potentially offering a simpler clinical tool. The empirical validation on new term recordings using standard machine learning is a positive aspect, though the absence of direct detector metrics limits the strength of the suitability claim.

major comments (2)

[Abstract] The central claim that the preterm inter-burst detector is suitable for term EEG rests on classification accuracy alone; no quantitative detection performance metrics (sensitivity, specificity, or agreement with annotations) are supplied for the term cohort, so maturational differences in burst patterns could invalidate the derived features without being detected.

[Abstract] Reported accuracies (59.3% single feature, 77.8% combined) are given without dataset size, number of recordings or subjects, cross-validation details, statistical testing, or error bars, preventing assessment of whether the results support the suitability conclusion or are consistent with chance-level performance.
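
To make the second comment concrete, the sketch below shows how the uncertainty around a reported accuracy depends on the number of test cases. It uses a Wilson score interval; the test-set sizes of 20 and 200 are hypothetical, since the abstract gives no cohort size, and the 25 percent chance level assumes four balanced grades, which the abstract does not confirm.

```python
import math


def wilson_interval(accuracy, n, z=1.96):
    """Wilson score interval for a proportion observed over n test cases."""
    if n <= 0 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n > 0 and accuracy within [0, 1]")
    denom = 1 + z * z / n
    centre = (accuracy + z * z / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


CHANCE = 0.25  # assumption: four equally frequent grades

for n in (20, 200):  # hypothetical test-set sizes, not taken from the paper
    low, high = wilson_interval(0.778, n)
    print(n, round(low, 3), round(high, 3), low > CHANCE)
```

With unbalanced grades, a majority-class baseline would be the more relevant comparison than 25 percent.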

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed comments on our manuscript. We respond point-by-point to the major comments, indicating where revisions to the manuscript (including the abstract) will be made to address the concerns raised.

Point-by-point responses

Referee: [Abstract] The central claim that the preterm inter-burst detector is suitable for term EEG rests on classification accuracy alone; no quantitative detection performance metrics (sensitivity, specificity, or agreement with annotations) are supplied for the term cohort, so maturational differences in burst patterns could invalidate the derived features without being detected.

Authors: We agree that the suitability claim is supported indirectly via the downstream classification performance (77.8% testing accuracy using inter-burst features) rather than direct detector metrics on term data. The study applied the preterm method without modification to assess transferability through feature utility for HIE grading. Direct metrics (sensitivity/specificity) would require new expert annotations of inter-bursts in the term recordings, which were outside the scope of this work. We will revise the abstract and add a limitations paragraph in the discussion to explicitly note this indirect validation approach and the potential impact of maturational differences. revision: partial

Referee: [Abstract] Reported accuracies (59.3% single feature, 77.8% combined) are given without dataset size, number of recordings or subjects, cross-validation details, statistical testing, or error bars, preventing assessment of whether the results support the suitability conclusion or are consistent with chance-level performance.

Authors: The abstract is constrained by length and therefore omits these details, but the full manuscript describes the dataset (number of recordings and subjects), the MLP architecture, leave-one-subject-out cross-validation, and results in the methods and results sections. To address the concern, we will revise the abstract to include the cohort size and cross-validation method, and ensure error bars or confidence intervals are reported if not already present in the results. revision: yes
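
The simulated rebuttal mentions leave-one-subject-out cross-validation, which the abstract itself does not confirm. As a minimal sketch of what that splitting scheme means, the hypothetical generator below holds out all recordings of one subject per fold; the subject labels are toy values.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


# Toy example: six recordings from three subjects (labels are hypothetical).
subjects = ["A", "A", "B", "C", "C", "C"]
folds = list(leave_one_subject_out(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by subject rather than by recording keeps data from the same infant out of both the training and test sets of a fold.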

standing simulated objections not resolved

Direct quantitative detection performance metrics (sensitivity, specificity, agreement) for the inter-burst detector on the term cohort cannot be supplied, as the study did not generate new annotations of inter-bursts in term EEG.

Circularity Check

0 steps flagged

Minor self-citation of prior detector; downstream ML classification accuracy is independently measured

Full rationale

The paper applies an existing inter-burst detector (developed on preterm EEG) to a new term-infant cohort, extracts simple interval features, and trains/tests an MLP to predict injury grades, achieving 77.8% test accuracy via standard supervised learning on held-out data. No equations, fitted parameters, or self-citations reduce this accuracy figure to a quantity defined by the inputs themselves. The reliance on the prior detector constitutes a self-citation but is not load-bearing for the reported result, as the classification performance remains an empirical, falsifiable outcome on independent recordings.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review limits visibility into parameters; the central assumption is transferability of the detection method, treated as a domain assumption under test rather than an ad-hoc invention.

axioms (1)

Domain assumption: The inter-burst detection algorithm developed on preterm infants (<30 weeks) identifies inter-bursts in term infants without adjustment.
This premise is explicitly tested by the study and is required for the validation claim.

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages

[1] P. Kirsi, T. Kirjavainen, S. Stjerna, T. Salokivi, and S. Vanhatalo. “Sleep wake cycling in early preterm infants: comparison of polysomnographic recordings with a novel EEG-based index.” Clinical Neurophysiology, 124(9), (2013): 1807–1814

[2] J. M. O’Toole, G. B. Boylan, S. Vanhatalo, and N. J. Stevenson. “Estimating functional brain maturity in very and extremely preterm neonates using automated analysis of the electroencephalogram.” Clinical Neurophysiology, 127(8), (2016): 2910–2918

[3] R. W. Claire, J. E. Harding, C. E. Williams, M. I. Gunning, and M. R. Battin. “Quantitative electroencephalographic patterns in normal preterm infants over the first week after birth.” Early Human Development, 82(1), (2006): 43–51

[4] M. D. Murray, C. A. Ryan, G. B. Boylan, A. P. Fitzgerald, and S. Connolly. “Prediction of seizures in asphyxiated neonates: correlation with continuous video-electroencephalographic monitoring.” Pediatrics, 118(1), (2006): 41–46

[5] S. Vanhatalo, J. M. Palva, S. Andersson, C. Rivera, J. Voipio, and K. Kaila. “Slow endogenous activity transients and developmental expression of K+-Cl- cotransporter 2 in the immature human cortex.” European Journal of Neuroscience, 22(11), (2005): 2799–2804

[6] V. Matic, P. J. Cherian, K. Jansen, N. Koolen, G. Naulaers, R. M. Swarte, P. Govaert, S. V. Huffel, and M. D. Vos. “Improving reliability of monitoring background EEG dynamics in asphyxiated infants.” IEEE Transactions on Biomedical Engineering, 63(5), (2016): 973–983

[7] J. M. O’Toole, G. B. Boylan, R. O. Lloyd, R. M. Goulding, S. Vanhatalo, and N. J. Stevenson. “Detecting bursts in the EEG of very and extremely premature infants using a multi-feature approach.” Medical Engineering & Physics, 45, (2017): 42–50

[8] P. Mirzaei, G. Azemi, N. Japaridze, and B. Boashash. “Surrogate data test for nonlinearity of EEG signals: A newborn EEG burst suppression case study.” Digital Signal Processing, 70, (2017): 30–38

[9] A. Piryatinska, T. Gyorgy, W. A. Woyczynski, K. A. Loparo, M. S. Scher, and A. Zlotnik. “Automated detection of neonate EEG sleep stages.” Computer Methods and Programs in Biomedicine, 95(1), (2009): 31–46

[10] N. J. Stevenson, I. Korotchikova, A. Temko, G. Lightbody, W. P. Marnane, and G. B. Boylan. “An automated system for grading EEG abnormality in term neonates with hypoxic-ischaemic encephalopathy.” Annals of Biomedical Engineering, 41(4), (2013): 775–785

[11] R. Ahmed, A. Temko, W. Marnane, G. Lightbody, and G. Boylan. “Grading hypoxic-ischemic encephalopathy severity in neonatal EEG using GMM supervectors and the support vector machine.” Clinical Neurophysiology, 127(1), (2016): 297–309

[12] B. H. Walsh, D. M. Murray, and G. B. Boylan. “The use of conventional EEG for the assessment of hypoxic ischaemic encephalopathy in the newborn: a review.” Clinical Neurophysiology, 122(7), (2011): 1284–1294

[13] S. Raurale, J. McAllister, and J. M. del Rincon. “EMG acquisition and hand pose classification for bionic hands from randomly-placed sensors.” In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2018): 1105–1109
