arxiv: 2604.24767 · v1 · submitted 2026-04-15 · 💻 cs.LG · cs.CV

Automated detection of pediatric congenital heart disease from phonocardiograms using deep and handcrafted feature fusion

Pith reviewed 2026-05-10 13:55 UTC · model grok-4.3

keywords: congenital heart disease · phonocardiogram · feature fusion · deep learning · pediatric screening · automated classification · heart sound signals · low-resource diagnosis

The pith

Fusing deep neural features with handcrafted ones from heart sound recordings detects congenital heart disease in children with 92% accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an automated system for identifying congenital heart disease (CHD) from phonocardiogram signals captured by a digital stethoscope. It combines features extracted by a deep network with manually designed signal-processing features to classify recordings as CHD or non-CHD. Evaluated on data from 751 pediatric patients in Bangladesh under a patient-wise split, the approach reports 92% accuracy, 91% sensitivity, 91% specificity, 96% AUROC, and a 92% F1-score. This matters because echocardiography is expensive and often unavailable in low-resource areas, and expert interpretation varies. The method aims to enable earlier, more accessible screening for a condition affecting about 1% of births.
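For readers who want the definitions behind the headline numbers, here is a minimal sketch of how accuracy, sensitivity, specificity, and F1 follow from confusion-matrix counts, with CHD treated as the positive class. The function name `binary_metrics` and the counts are hypothetical; the paper's confusion matrix is not given in the abstract.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute screening metrics from confusion-matrix counts (CHD = positive)."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)  # fraction of CHD cases that are flagged
    specificity = tn / (tn + fp)  # fraction of non-CHD cases that are cleared
    precision = tp / (tp + fp)    # fraction of flagged cases that are truly CHD
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only; they are not taken from the paper.
metrics = binary_metrics(tp=40, fp=3, tn=30, fn=2)
```

The sketch omits guards against empty classes, which a real evaluation script would need.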

Core claim

The central discovery is a deep and handcrafted feature fusion model for CHD detection from PCG signals recorded at mitral, aortic, pulmonary, and tricuspid valve sites. On data from 751 subjects aged 1 month to 16 years, labeled by cardiologists, the model achieves 92% accuracy along with 91% sensitivity and specificity under patient-wise partitioning into 70% train, 20% validation, and 10% test sets. The authors argue this fusion supports efficient real-time remote detection as a cost-effective tool in low-resource settings.
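A patient-wise partition can be sketched as follows. This is an illustration of the general technique, not the authors' code: the helper `patient_wise_split`, the seed, and the assumption of exactly four recordings per patient are all hypothetical.

```python
import random


def patient_wise_split(patient_ids, fractions=(0.7, 0.2, 0.1), seed=0):
    """Assign whole patients (not individual recordings) to train/val/test."""
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    n_train = int(fractions[0] * len(unique_ids))
    n_val = int(fractions[1] * len(unique_ids))
    return {
        "train": set(unique_ids[:n_train]),
        "val": set(unique_ids[n_train:n_train + n_val]),
        "test": set(unique_ids[n_train + n_val:]),  # remainder goes to test
    }


# Hypothetical layout: 751 patients, one recording per auscultation site.
recordings = [(pid, site) for pid in range(751) for site in ("MV", "AV", "PV", "TV")]
split = patient_wise_split([pid for pid, _ in recordings])
test_recordings = [r for r in recordings if r[0] in split["test"]]
```

Because the split is made on patient IDs before recordings are looked up, all recordings from one child land in the same partition, which is what prevents leakage between training and test data.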

What carries the argument

The deep and handcrafted feature fusion process that integrates representations from convolutional neural networks with traditional acoustic and statistical features extracted from phonocardiogram waveforms.
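The abstract does not specify how the two feature sets are combined, so the following is only a sketch of one common option, fusion by concatenation. The three time-domain descriptors, the placeholder embedding values, and the function names are assumptions made for illustration.

```python
import math


def handcrafted_features(signal):
    """Three simple time-domain descriptors of a PCG segment (illustrative choice)."""
    n = len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    zero_crossings = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)
    mean_abs = sum(abs(x) for x in signal) / n
    return [rms, zero_crossings / (n - 1), mean_abs]


def fuse(deep_embedding, handcrafted):
    """Fusion by concatenation: one vector for a downstream classifier."""
    return list(deep_embedding) + list(handcrafted)


# A toy 50 Hz sine sampled at 1 kHz stands in for a heart-sound segment.
signal = [math.sin(2 * math.pi * 50 * t / 1000) for t in range(1000)]
deep_embedding = [0.12, -0.40, 0.88, 0.05]  # placeholder for a network's output
fused = fuse(deep_embedding, handcrafted_features(signal))
```

In practice the two blocks usually need to be put on a comparable scale (for example by standardizing each feature on the training set) before a classifier sees the fused vector.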

If this is right

  • The system functions as a screening tool for CHDs using only accessible digital stethoscopes.
  • It supports remote and real-time detection to address delays caused by limited expert availability.
  • The reported metrics indicate reliable performance across the tested pediatric population.
  • Patient-wise data splitting means the evaluation measures generalization to unseen individuals from the same cohort, rather than to new recordings of patients already seen in training.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the fusion adds unique value, ablating either feature set on new data could quantify the contribution of each.
  • Extension to multi-class detection of specific CHD subtypes would increase clinical utility.
  • Validation on datasets from other countries and with varying noise levels would test broader applicability.
  • Embedding the model in mobile applications could enable community-level screening programs.

Load-bearing premise

The model's high performance observed on this single Bangladesh-collected dataset will transfer to other populations, devices, and clinical environments without major retraining.

What would settle it

Apply the fixed model, without retraining, to phonocardiogram data collected in a different country or with different hardware. Accuracy below 80% or AUROC below 85% on that data would count as evidence against reliable generalization.
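The check described above can be written down directly. The sketch below computes AUROC as the probability that a randomly chosen CHD case scores higher than a randomly chosen non-CHD case, then applies the two thresholds. The labels, scores, the 0.5 decision cutoff, and the function names are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def generalization_refuted(accuracy, auc, min_accuracy=0.80, min_auc=0.85):
    """Apply the falsification thresholds to an external test result."""
    return accuracy < min_accuracy or auc < min_auc


# Hypothetical external-site predictions (1 = CHD), for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
verdict = generalization_refuted(accuracy, auroc(labels, scores))
```

The pairwise AUROC is quadratic in the number of samples, which is fine for a test set of this size; a rank-based implementation would be preferable for large external cohorts.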

Original abstract

Congenital heart disease (CHD) is the most common type of birth defect, impacting about 1% of live births worldwide. Echocardiography, the gold-standard diagnostic method, is costly and inaccessible in low-resource settings. Diagnosis is delayed due to limited skilled experts, whose ability to interpret pathological patterns varies significantly, causing inter- and intra-clinician variability. Therefore, we present a new method for a more accessible diagnostic modality, the digital stethoscope, to detect CHDs. Our method is based on deep feature fusion, integrating deep and handcrafted features for the automated early detection of CHDs. For this work, phonocardiography (PCG) recordings were obtained from 751 pediatric subjects (age: 1 month to 16 years) in Bangladesh, ranging from infants to adults, at four auscultation locations: mitral valve (MV), aortic valve (AV), pulmonary valve (PV), and tricuspid valve (TV). These recordings were labeled based on confirmed diagnoses by cardiologists as either cases of CHD or non-CHD. The results demonstrated that our proposed model achieved an accuracy of 92%, a sensitivity of 91%, and a specificity of 91%, based on a patient-wise split of 70% training, 20% validation, and 10% testing. Furthermore, it achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 96% and an F1-score of 92%. This model promises efficient real-time remote detection of CHDs as a cost-effective screening tool for low-resource settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a deep and handcrafted feature fusion approach for detecting congenital heart disease from phonocardiogram recordings collected from 751 pediatric subjects in Bangladesh. Using a patient-wise 70/20/10 train/validation/test split, the model achieves an accuracy of 92%, sensitivity and specificity of 91%, AUROC of 96%, and F1-score of 92%, and is presented as a cost-effective screening tool for low-resource settings.

Significance. Should the fusion method prove superior to simpler alternatives and generalize beyond this dataset, the work could offer a valuable, accessible diagnostic aid where echocardiography is limited. The patient-wise splitting is a methodological strength that supports the reliability of the performance metrics by minimizing leakage risks.

major comments (2)
  1. [Results] The manuscript provides no ablation studies comparing the proposed deep-handcrafted feature fusion against deep features alone or handcrafted features alone. Without these, it is unclear whether the fusion step contributes to the reported 92% accuracy and 96% AUROC, undermining the central claim of the method's superiority.
  2. [Abstract and Experimental Setup] Details are missing on the class distribution (CHD vs non-CHD), the precise fusion architecture (e.g., how features are combined), comparisons to standard baselines such as MFCC with SVM or wavelet-based methods, and any error bars or p-values for the metrics. These omissions make it difficult to verify the performance claims on the held-out test set of ~75 patients.
minor comments (2)
  1. [Methods] The description of the four auscultation locations (MV, AV, PV, TV) and how recordings from multiple locations per subject are handled could be expanded for reproducibility.
  2. [Discussion] Consider adding a limitations section addressing potential dataset biases and the need for multi-center validation.
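To make the error-bar request in major comment 2 concrete, here is a sketch of a Wilson score interval for accuracy on a test set of about 75 patients. The count of 69 correct predictions is an assumption derived from 92% of 75; the paper does not report the exact test-set size or counts.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumption: 10% of 751 patients is about 75, and 92% accuracy is about 69 correct.
low, high = wilson_interval(successes=69, n=75)
```

Under these assumptions the interval spans roughly 0.84 to 0.96, which shows how much uncertainty a single held-out set of this size leaves around the reported accuracy.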

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments and positive assessment of the patient-wise splitting and potential significance of the work. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of our results.

Point-by-point responses
  1. Referee: [Results] The manuscript provides no ablation studies comparing the proposed deep-handcrafted feature fusion against deep features alone or handcrafted features alone. Without these, it is unclear whether the fusion step contributes to the reported 92% accuracy and 96% AUROC, undermining the central claim of the method's superiority.

    Authors: We agree that ablation studies are necessary to substantiate the contribution of the fusion step. In the revised manuscript, we will include new ablation experiments comparing the full deep-handcrafted feature fusion model against variants using deep features alone and handcrafted features alone. These results will be added to the Results section, with performance metrics reported on the same patient-wise held-out test set to directly address the central claim.
    Revision: yes

  2. Referee: [Abstract and Experimental Setup] Details are missing on the class distribution (CHD vs non-CHD), the precise fusion architecture (e.g., how features are combined), comparisons to standard baselines such as MFCC with SVM or wavelet-based methods, and any error bars or p-values for the metrics. These omissions make it difficult to verify the performance claims on the held-out test set of ~75 patients.

    Authors: We acknowledge that these details are required for full reproducibility and verification. We will expand both the abstract and the Experimental Setup section in the revised manuscript to include the class distribution (CHD vs. non-CHD counts), a precise description of the fusion architecture (including the feature combination method such as concatenation followed by dense layers), direct comparisons to standard baselines including MFCC with SVM and wavelet-based classifiers, and statistical details such as error bars (standard deviations from repeated experiments) and p-values for the reported metrics on the held-out test set.
    Revision: yes
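The ablation promised in response 1 amounts to training the same classifier on each feature subset and scoring all variants on the same test split. The sketch below shows that structure on synthetic data with a nearest-centroid classifier; the data, the column assignment, and the classifier are stand-ins chosen for brevity and say nothing about the results the authors would obtain.

```python
import random


def nearest_centroid_accuracy(train, test, columns):
    """Fit one centroid per class on the selected columns; return test accuracy."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in columns]

    def predict(x):
        # Choose the class whose centroid is closest in squared Euclidean distance.
        return min(centroids, key=lambda lab: sum(
            (x[c] - m) ** 2 for c, m in zip(columns, centroids[lab])))

    return sum(predict(x) == y for x, y in test) / len(test)


# Synthetic stand-in data: columns 0-1 play the "deep" role, 2-3 the "handcrafted" role.
rng = random.Random(0)


def sample(label):
    return [rng.gauss(label, 1.0) for _ in range(4)], label


train = [sample(i % 2) for i in range(200)]
test = [sample(i % 2) for i in range(100)]

ablation = {
    "deep only": nearest_centroid_accuracy(train, test, [0, 1]),
    "handcrafted only": nearest_centroid_accuracy(train, test, [2, 3]),
    "fused": nearest_centroid_accuracy(train, test, [0, 1, 2, 3]),
}
```

Keeping the split and the classifier fixed across the three runs is the point of the design: any difference between rows of `ablation` can then be attributed to the feature set alone.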

Circularity Check

0 steps flagged

No significant circularity; performance metrics derived from held-out patient-wise test split

full rationale

The paper reports accuracy, sensitivity, specificity, AUROC and F1-score on a patient-wise 10% test split after 70/20 train/validation partitioning of the 751-subject dataset. This constitutes standard independent evaluation rather than any reduction of the claimed performance to fitted parameters or self-referential definitions. No equations, derivations, or uniqueness theorems appear in the abstract or summary that would equate outputs to inputs by construction. No load-bearing self-citations or ansatz smuggling are evident. The result remains falsifiable on external data and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard supervised learning assumptions plus domain knowledge that heart sounds encode CHD information; no new physical entities or unproven mathematical axioms are introduced.

free parameters (1)
  • Deep network hyperparameters and fusion parameters
    Learning rate, layer sizes, and any weighting between deep and handcrafted features are selected or optimized on the validation set.
axioms (2)
  • domain assumption Phonocardiogram signals recorded at standard valve locations contain sufficient information to distinguish CHD from non-CHD cases
    This is the foundational premise that justifies using PCG as a diagnostic modality instead of echocardiography.
  • domain assumption Cardiologist-provided labels constitute accurate ground truth for training and evaluation
    The model is trained and scored against these labels without independent verification of label quality.

