pith. machine review for the scientific record.

arxiv: 2604.13567 · v1 · submitted 2026-04-15 · 💻 cs.SD · cs.AI

Recognition: unknown

Comparison of window shapes and lengths in short-time feature extraction for classification of heart sound signals

Pith reviewed 2026-05-10 12:27 UTC · model grok-4.3

classification 💻 cs.SD cs.AI
keywords heart sound classification · phonocardiography · window functions · short-time feature extraction · biLSTM · PCG signals · Gaussian window · signal segmentation

The pith

A 75 ms Gaussian window gives the best accuracy for heart sound classification with a biLSTM.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Heart sound signals vary over time, requiring feature extraction from short segments rather than entire recordings. The paper tests three window shapes—rectangular, triangular, and Gaussian—at different lengths to split phonocardiography signals before feeding statistical features into a bidirectional LSTM network. Performance metrics show the Gaussian window at 75 ms produces the highest classification results, with the triangular window close behind at the same length and the rectangular window performing worst overall. This Gaussian configuration also exceeds the accuracy of a baseline method, pointing to window choice as a practical lever for improving automated detection of cardiovascular conditions from PCG recordings.

Core claim

The paper reports that, for short-time statistical feature extraction from non-stationary heart sound signals, a Gaussian window of 75 ms yields the best classification performance when training and testing a biLSTM network, outperforming both the triangular and rectangular windows across the tested lengths as well as a baseline approach.

What carries the argument

Sliding window segmentation of PCG signals using rectangular, triangular, or Gaussian shapes of varying lengths to enable statistical feature extraction for biLSTM-based classification.
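
The segmentation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the sampling rate (2000 Hz), the 50% overlap, the Gaussian width (`sigma_ratio`), and the two features (mean and standard deviation) are all assumptions, since the page does not state the paper's exact settings or feature set. The names `make_window` and `short_time_features` are hypothetical.

```python
import math
import statistics


def make_window(shape, n, sigma_ratio=0.4):
    """Return n coefficients of a symmetric window.

    sigma_ratio is an assumed Gaussian width (sigma as a fraction of the
    half-length); the paper's value is not given on this page.
    """
    if n < 2:
        raise ValueError("window needs at least 2 samples")
    mid = (n - 1) / 2.0
    if shape == "rectangular":
        return [1.0] * n
    if shape == "triangular":
        # Triangle that is 0 at both ends and 1 at the center.
        return [1.0 - abs(i - mid) / mid for i in range(n)]
    if shape == "gaussian":
        sigma = sigma_ratio * mid
        return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) for i in range(n)]
    raise ValueError(f"unknown window shape: {shape}")


def short_time_features(signal, fs, shape, length_ms, overlap=0.5):
    """Slide a window over the signal and return one (mean, std) pair per frame.

    The overlap and the feature pair are illustrative assumptions.
    """
    n = int(round(fs * length_ms / 1000.0))      # window length in samples
    hop = max(1, int(n * (1.0 - overlap)))       # step between frames
    w = make_window(shape, n)
    features = []
    for start in range(0, len(signal) - n + 1, hop):
        frame = [signal[start + i] * w[i] for i in range(n)]
        features.append((statistics.fmean(frame), statistics.pstdev(frame)))
    return features


# Synthetic 1 s test tone standing in for a PCG recording (not real data).
fs = 2000
tone = [math.sin(2 * math.pi * 50 * t / fs) for t in range(fs)]
frames = short_time_features(tone, fs, "gaussian", 75)
print(len(frames), "frames of", int(fs * 0.075), "samples each")
```

The resulting sequence of per-frame feature vectors is the kind of input a sequence model such as a biLSTM would consume; swapping `shape` or `length_ms` reproduces the axis of comparison the paper explores.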

If this is right

  • Gaussian windows should be prioritized over rectangular windows in PCG preprocessing pipelines for improved diagnostic accuracy.
  • A 75 ms length offers strong results and can serve as a default choice when balancing detail and stability in heart sound segmentation.
  • Triangular windows remain competitive at 75 ms and could substitute for Gaussian when computational simplicity matters.
  • Adopting the best window configuration raises overall classification rates above standard baseline methods without changing the classifier architecture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The findings imply that spectral leakage from window side lobes directly degrades feature quality for non-stationary biomedical signals.
  • Similar window optimization could improve short-time analysis in related tasks such as ECG arrhythmia detection or respiratory sound classification.
  • Real-time heart monitoring apps might incorporate Gaussian window defaults to reduce false negatives in automated screening.
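
The side-lobe argument in the first bullet can be checked numerically. The sketch below estimates the peak side-lobe level of each window shape from a densely sampled discrete-time Fourier transform. The window length (65 samples), the frequency grid size, and the Gaussian width are assumptions chosen for illustration, and `peak_sidelobe_db` is a hypothetical helper, not something taken from the paper.

```python
import cmath
import math


def window(shape, n, sigma_ratio=0.4):
    """Symmetric window coefficients; sigma_ratio is an assumed Gaussian width."""
    mid = (n - 1) / 2.0
    if shape == "rectangular":
        return [1.0] * n
    if shape == "triangular":
        return [1.0 - abs(i - mid) / mid for i in range(n)]
    if shape == "gaussian":
        sigma = sigma_ratio * mid
        return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) for i in range(n)]
    raise ValueError(f"unknown window shape: {shape}")


def peak_sidelobe_db(w, grid=1024):
    """Peak side-lobe level in dB relative to the main-lobe peak at f = 0."""
    # Magnitude spectrum on a dense grid of normalized frequencies in [0, 0.5].
    mags = []
    for k in range(grid + 1):
        f = 0.5 * k / grid
        acc = sum(c * cmath.exp(-2j * math.pi * f * i) for i, c in enumerate(w))
        mags.append(abs(acc))
    # Walk down the main lobe until the magnitude first starts rising again.
    k = 1
    while k < grid and mags[k] <= mags[k - 1]:
        k += 1
    # Everything from that point on belongs to the side lobes.
    return 20.0 * math.log10(max(mags[k:]) / mags[0])


levels = {s: peak_sidelobe_db(window(s, 65))
          for s in ("rectangular", "triangular", "gaussian")}
for shape, level in levels.items():
    print(f"{shape:12s} peak side lobe: {level:6.1f} dB")
```

Under these assumptions the rectangular window shows the highest side lobes and the Gaussian the lowest, which is consistent with the ordering the paper reports for classification accuracy. That consistency is suggestive only; it does not show that leakage is the cause of the accuracy gap.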

Load-bearing premise

The performance gaps between window shapes and lengths stem mainly from those parameters themselves rather than from the particular dataset, training variability, or the specific statistical features chosen.

What would settle it

Running the identical feature extraction, window trials, and biLSTM training on a completely separate labeled PCG dataset; if the 75 ms Gaussian window no longer shows clear superiority, the central claim does not hold.

Figures

Figures reproduced from arXiv: 2604.13567 by Abeer FathAllah Brery, Mahmoud Fakhry.

Figure 1. A sound recording of normal heart (PCG signal)
Figure 3. Three different symmetric windows and their corresponding spectral shap…
Figure 4. Normalized sequences of features for two PCG signals obtained with a 75 ms…
Figure 5. Average performance for four numbers of neurons of…
Figure 6. Classification performance of the test dataset in terms of accuracy for…

Original abstract

Heart sound signals, phonocardiography (PCG) signals, allow for the automatic diagnosis of potential cardiovascular pathology. Such classification task can be tackled using the bidirectional long short-term memory (biLSTM) network, trained on features extracted from labeled PCG signals. Regarding the non-stationarity of PCG signals, it is recommended to extract the features from multiple short-length segments of the signals using a sliding window of certain shape and length. However, some window contains unfavorable spectral side lobes, which distort the features. Accordingly, it is preferable to adapt the window shape and length in terms of classification performance. We propose an experimental evaluation for three window shapes, each with three window lengths. The biLSTM network is trained and tested on statistical features extracted, and the performance is reported in terms of the window shapes and lengths. Results show that the best performance is obtained when the Gaussian window is used for splitting the signals, and the triangular window competes with the Gaussian window for a length of 75 ms. Although the rectangular window is a commonly offered option, it is the worst choice for splitting the signals. Moreover, the classification performance obtained with a 75 ms Gaussian window outperforms that of a baseline method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript experimentally compares three window shapes (rectangular, triangular, Gaussian) and three lengths (25 ms, 50 ms, 75 ms) for short-time segmentation of phonocardiogram (PCG) signals prior to extracting statistical features for biLSTM-based heart sound classification. The central claim is that the Gaussian window, especially at 75 ms, yields the highest classification performance, the triangular window is competitive at 75 ms, the rectangular window performs worst, and the 75 ms Gaussian configuration outperforms a baseline method.

Significance. If the reported performance ordering proves robust under repeated trials and statistical testing, the work would offer concrete, practical guidance on a frequently under-examined preprocessing choice (window function and length) for PCG classification pipelines. This could modestly improve feature quality in automated cardiovascular diagnostics without requiring architectural changes to the classifier.

major comments (1)
  1. [Results] Results section (and abstract): The performance rankings and the claim that 'the best performance is obtained when the Gaussian window is used' and that '75 ms Gaussian window outperforms that of a baseline method' rest on single biLSTM training runs per window configuration. Because biLSTM training is stochastic (random weight initialization, data shuffling), accuracy differences of several percent are expected across seeds; the manuscript does not report multiple seeds, standard deviations, or any paired statistical test (e.g., McNemar or Wilcoxon) on per-fold or per-run accuracies. This directly undermines the reliability of the claimed ordering and is load-bearing for the central empirical claim.
minor comments (2)
  1. [Abstract] Abstract: No numerical accuracy values, dataset size, number of recordings, cross-validation procedure, or error bars are supplied, making it impossible for readers to gauge the magnitude or reproducibility of the reported improvements.
  2. [Methods] Methods: The exact set of statistical features, the precise definition of the baseline method, and the PCG dataset (including number of samples per class and any class imbalance handling) should be stated explicitly so that the experiments can be reproduced.
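
As one concrete reading of the paired test named in the major comment, here is a minimal sketch of an exact McNemar test comparing two classifiers on the same test recordings. It uses only the discordant pairs (recordings that one configuration classifies correctly and the other does not). The function name `mcnemar_exact` and the toy labels are hypothetical; nothing here comes from the paper's data.

```python
import math


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Returns (b, c, p) where b counts samples only A gets right,
    c counts samples only B gets right, and p is the binomial p-value
    under the null that discordant pairs split 50/50.
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the two classifiers never disagree on correctness
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)


# Toy example with made-up labels: 20 recordings, all of class 1.
y_true = [1] * 20
pred_gaussian = [1] * 8 + [0] * 1 + [1] * 11
pred_rect = [0] * 8 + [1] * 1 + [1] * 11
b, c, p = mcnemar_exact(y_true, pred_gaussian, pred_rect)
print(f"only A correct: {b}, only B correct: {c}, p = {p:.4f}")
```

McNemar's test addresses disagreement on a fixed test set for one trained model per configuration; it does not account for seed-to-seed training variability, which is why the referee also asks for repeated runs.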

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the single major comment point by point below.

  1. Referee: [Results] Results section (and abstract): The performance rankings and the claim that 'the best performance is obtained when the Gaussian window is used' and that '75 ms Gaussian window outperforms that of a baseline method' rest on single biLSTM training runs per window configuration. Because biLSTM training is stochastic (random weight initialization, data shuffling), accuracy differences of several percent are expected across seeds; the manuscript does not report multiple seeds, standard deviations, or any paired statistical test (e.g., McNemar or Wilcoxon) on per-fold or per-run accuracies. This directly undermines the reliability of the claimed ordering and is load-bearing for the central empirical claim.

    Authors: We agree that the current results rely on single training runs per configuration and that this limits the strength of the performance ordering claims, given the stochastic elements in biLSTM optimization. To address this directly, the revised manuscript will include new experiments with multiple random seeds (at least five per window configuration). We will report mean accuracies and standard deviations across seeds, and we will apply a paired statistical test (Wilcoxon signed-rank test) on the per-fold accuracies to evaluate whether differences between configurations are significant. These additions will be placed in the Results section, with corresponding updates to the abstract to reflect the more robust evidence. We believe this revision will substantiate the reported rankings without altering the core experimental design.

    revision: yes
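
The analysis promised in the rebuttal could look roughly like the sketch below: a per-configuration mean and standard deviation, followed by an exact two-sided Wilcoxon signed-rank test on paired accuracies. The helper names `summarize` and `wilcoxon_exact` are hypothetical, the accuracy values are made up for illustration, and the exact test enumerates all sign patterns, so it is only practical for small numbers of pairs.

```python
import itertools
import statistics


def summarize(accs):
    """Mean and sample standard deviation of accuracies across seeds."""
    return statistics.mean(accs), statistics.stdev(accs)


def wilcoxon_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Zero differences are dropped; tied |differences| get average ranks.
    Ties are detected by exact equality, so prefer integer-valued inputs.
    Returns (statistic, p_value).
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2.0 + 1.0            # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    stat = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely.
    hits = 0
    for signs in itertools.product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= stat + 1e-12:
            hits += 1
    return stat, hits / 2 ** n


# Made-up accuracies in percent for five seeds (illustration only).
gaussian_75 = [92, 94, 96, 98, 100]
rect_75 = [91, 92, 93, 94, 95]
print("mean/std:", summarize(gaussian_75))
stat, p = wilcoxon_exact(gaussian_75, rect_75)
print(f"W = {stat}, p = {p}")
```

One design point worth noting: with only five paired values, the smallest two-sided p-value this exact test can produce is 2/32 = 0.0625, even when one configuration wins every pair. A plan built on "at least five" seeds would therefore need more pairs (more seeds, or seeds crossed with folds) to reach the conventional 0.05 threshold.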

Circularity Check

0 steps flagged

No circularity: purely empirical experimental comparison

The paper performs a direct experimental comparison of three window shapes (Gaussian, triangular, rectangular) at three lengths for short-time statistical feature extraction from PCG signals, followed by biLSTM training and performance reporting. No mathematical derivations, first-principles predictions, fitted parameters renamed as outputs, or self-citation chains are present in the abstract or described methodology. The reported rankings (Gaussian best, rectangular worst, 75 ms Gaussian outperforming baseline) are empirical results from the stated protocol rather than reductions to inputs by construction. The study is self-contained against external benchmarks with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on standard domain assumptions about non-stationary signals and the utility of statistical features for LSTM classification; it introduces no free parameters, new axioms, or invented entities.

axioms (2)
  • domain assumption: PCG signals are non-stationary and therefore benefit from short-time windowed analysis.
    Explicitly stated in the abstract as the reason for using sliding windows.
  • domain assumption: Statistical features extracted from windowed segments are sufficient to train an effective biLSTM classifier.
    Implicit in the choice to report performance on those features.

pith-pipeline@v0.9.0 · 5517 in / 1360 out tokens · 38902 ms · 2026-05-10T12:27:11.531889+00:00 · methodology

