Automated detection of pediatric congenital heart disease from phonocardiograms using deep and handcrafted feature fusion
Pith reviewed 2026-05-10 13:55 UTC · model grok-4.3
The pith
Fusing deep neural features with handcrafted ones from heart sound recordings detects congenital heart disease in children with 92% accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central contribution is a model that fuses deep and handcrafted features to detect CHD from PCG signals recorded at the mitral, aortic, pulmonary, and tricuspid valve sites. On cardiologist-labeled data from 751 subjects aged 1 month to 16 years, the model achieves 92% accuracy, 91% sensitivity, and 91% specificity under a patient-wise partition into 70% training, 20% validation, and 10% test sets. The authors argue that this fusion supports efficient real-time remote detection as a cost-effective tool in low-resource settings.
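For readers who want the definitions behind the reported figures, here is a minimal sketch of how accuracy, sensitivity, specificity, and F1 follow from confusion-matrix counts. The function name `binary_metrics` and the counts passed to it are hypothetical illustrations; the paper's actual test-set counts are not given in the material reviewed here.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of CHD cases correctly flagged
    specificity = tn / (tn + fp)   # fraction of non-CHD cases correctly cleared
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only; they are NOT taken from the paper.
metrics = binary_metrics(tp=40, fp=3, tn=30, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity and specificity are computed on different subsets of patients, the class balance of the test set determines how they combine into accuracy, which is one reason the class distribution matters for interpreting the headline number.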
What carries the argument
The argument rests on a feature fusion process that integrates representations from convolutional neural networks with traditional acoustic and statistical features extracted from phonocardiogram waveforms.
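The fusion idea can be sketched as simple concatenation of two feature vectors. This is an illustration under stated assumptions, not the authors' architecture: the three handcrafted descriptors, the function names `handcrafted_features` and `fuse`, the synthetic sine wave, and the placeholder embedding values are all hypothetical.

```python
import math


def handcrafted_features(signal):
    """Three simple waveform descriptors (illustrative, not the paper's feature set)."""
    n = len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    zero_crossing_rate = crossings / (n - 1)
    mean_abs = sum(abs(x) for x in signal) / n
    return [rms, zero_crossing_rate, mean_abs]


def fuse(deep_embedding, handcrafted):
    """Fuse by concatenation; a downstream classifier would consume this vector."""
    return list(deep_embedding) + list(handcrafted)


# A synthetic 50 Hz sine sampled at 1 kHz stands in for a PCG segment.
signal = [math.sin(2 * math.pi * 50 * t / 1000) for t in range(1000)]
# Placeholder values standing in for the output of a trained CNN.
deep_embedding = [0.12, -0.40, 0.88, 0.05]

fused = fuse(deep_embedding, handcrafted_features(signal))
print(len(fused), fused)
```

Concatenation is only one way to combine the two feature sets; in practice the handcrafted values are often standardized first so that their scale does not dominate or vanish next to the learned embedding.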
If this is right
- The system functions as a screening tool for CHDs using only accessible digital stethoscopes.
- It supports remote and real-time detection to address delays caused by limited expert availability.
- The reported metrics indicate reliable performance across the tested pediatric population.
- Patient-wise data splitting means the evaluation measures generalization to unseen individuals rather than to new recordings from already-seen ones.
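The patient-wise splitting mentioned in the last point can be sketched as follows. This is a minimal sketch assuming each recording carries a patient identifier; the function name `patient_wise_split`, the seed, and the synthetic recording list are hypothetical and do not reproduce the authors' code.

```python
import random


def patient_wise_split(patient_ids, fractions=(0.7, 0.2, 0.1), seed=0):
    """Assign whole patients (not individual recordings) to train/val/test."""
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    n_train = int(fractions[0] * len(unique_ids))
    n_val = int(fractions[1] * len(unique_ids))
    train = set(unique_ids[:n_train])
    val = set(unique_ids[n_train:n_train + n_val])
    test = set(unique_ids[n_train + n_val:])  # remainder goes to test
    return train, val, test


# 751 hypothetical patients, each with one recording per auscultation site.
recordings = [(pid, site) for pid in range(751) for site in ("MV", "AV", "PV", "TV")]
train, val, test = patient_wise_split([pid for pid, _ in recordings])

# Every recording follows its patient, so no patient straddles two partitions.
train_recs = [r for r in recordings if r[0] in train]
print(len(train), len(val), len(test), len(train_recs))
```

Splitting at the recording level instead would let the four sites of one child land in different partitions, which tends to inflate test performance.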
Where Pith is reading between the lines
- If the fusion adds unique value, ablating either feature set on new data could quantify the contribution of each.
- Extension to multi-class detection of specific CHD subtypes would increase clinical utility.
- Validation on datasets from other countries and with varying noise levels would test broader applicability.
- Embedding the model in mobile applications could enable community-level screening programs.
Load-bearing premise
The model's high performance observed on this single Bangladesh-collected dataset will transfer to other populations, devices, and clinical environments without major retraining.
What would settle it
Applying the fixed model to phonocardiogram data collected in a different country or with different hardware would settle it: accuracy below 80% or AUROC below 85% on such data would disprove reliable generalization.
Original abstract
Congenital heart disease (CHD) is the most common type of birth defect, impacting about 1% of live births worldwide. Echocardiography, the gold-standard diagnostic method, is costly and inaccessible in low-resource settings. Diagnosis is delayed due to limited skilled experts, whose ability to interpret pathological patterns varies significantly, causing inter- and intra-clinician variability. Therefore, we present a new method for a more accessible diagnostic modality, the digital stethoscope, to detect CHDs. Our method is based on deep feature fusion, integrating deep and handcrafted features for the automated early detection of CHDs. For this work, phonocardiography (PCG) recordings were obtained from 751 pediatric subjects (age: 1 month to 16 years) in Bangladesh, ranging from infants to adults, at four auscultation locations: mitral valve (MV), aortic valve (AV), pulmonary valve (PV), and tricuspid valve (TV). These recordings were labeled based on confirmed diagnoses by cardiologists as either cases of CHD or non-CHD. The results demonstrated that our proposed model achieved an accuracy of 92%, a sensitivity of 91%, and a specificity of 91%, based on a patient-wise split of 70% training, 20% validation, and 10% testing. Furthermore, the model achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 96% and an F1-score of 92%. This model promises efficient real-time remote detection of CHDs as a cost-effective screening tool for low-resource settings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a deep and handcrafted feature fusion approach for detecting congenital heart disease from phonocardiogram recordings collected from 751 pediatric subjects in Bangladesh. Using a patient-wise 70/20/10 train/validation/test split, the model achieves an accuracy of 92%, sensitivity and specificity of 91%, AUROC of 96%, and F1-score of 92%, and is presented as a cost-effective screening tool for low-resource settings.
Significance. Should the fusion method prove superior to simpler alternatives and generalize beyond this dataset, the work could offer a valuable, accessible diagnostic aid where echocardiography is limited. The patient-wise splitting is a methodological strength that supports the reliability of the performance metrics by minimizing leakage risks.
major comments (2)
- [Results] The manuscript provides no ablation studies comparing the proposed deep-handcrafted feature fusion against deep features alone or handcrafted features alone. Without these, it is unclear whether the fusion step contributes to the reported 92% accuracy and 96% AUROC, undermining the central claim of the method's superiority.
- [Abstract and Experimental Setup] Details are missing on the class distribution (CHD vs non-CHD), the precise fusion architecture (e.g., how features are combined), comparisons to standard baselines such as MFCC with SVM or wavelet-based methods, and any error bars or p-values for the metrics. These omissions make it difficult to verify the performance claims on the held-out test set of ~75 patients.
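The concern about missing error bars on a test set of roughly 75 patients can be made concrete with a Wilson score interval. This is a sketch under an explicit assumption: the paper does not report test-set counts, so 69 correct out of 75 (92%) is a hypothetical stand-in, and `wilson_interval` is an illustrative helper rather than anything from the manuscript.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95% coverage)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed counts: 69 of 75 test patients classified correctly (92%).
low, high = wilson_interval(successes=69, n=75)
print(f"95% interval for accuracy: {low:.3f} to {high:.3f}")
```

Under these assumed counts the interval spans roughly 0.84 to 0.96, so a single 92% figure on a test set of this size is compatible with a fairly wide range of true accuracies.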
minor comments (2)
- [Methods] The description of the four auscultation locations (MV, AV, PV, TV) and how recordings from multiple locations per subject are handled could be expanded for reproducibility.
- [Discussion] Consider adding a limitations section addressing potential dataset biases and the need for multi-center validation.
Simulated Author's Rebuttal
We thank the referee for their constructive comments and positive assessment of the patient-wise splitting and potential significance of the work. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of our results.
Point-by-point responses
- Referee: [Results] The manuscript provides no ablation studies comparing the proposed deep-handcrafted feature fusion against deep features alone or handcrafted features alone. Without these, it is unclear whether the fusion step contributes to the reported 92% accuracy and 96% AUROC, undermining the central claim of the method's superiority.
  Authors: We agree that ablation studies are necessary to substantiate the contribution of the fusion step. In the revised manuscript, we will include new ablation experiments comparing the full deep-handcrafted feature fusion model against variants using deep features alone and handcrafted features alone. These results will be added to the Results section, with performance metrics reported on the same patient-wise held-out test set to directly address the central claim.
  Revision: yes
- Referee: [Abstract and Experimental Setup] Details are missing on the class distribution (CHD vs non-CHD), the precise fusion architecture (e.g., how features are combined), comparisons to standard baselines such as MFCC with SVM or wavelet-based methods, and any error bars or p-values for the metrics. These omissions make it difficult to verify the performance claims on the held-out test set of ~75 patients.
  Authors: We acknowledge that these details are required for full reproducibility and verification. We will expand both the abstract and the Experimental Setup section in the revised manuscript to include the class distribution (CHD vs. non-CHD counts), a precise description of the fusion architecture (including the feature combination method such as concatenation followed by dense layers), direct comparisons to standard baselines including MFCC with SVM and wavelet-based classifiers, and statistical details such as error bars (standard deviations from repeated experiments) and p-values for the reported metrics on the held-out test set.
  Revision: yes
Circularity Check
No significant circularity; performance metrics derived from held-out patient-wise test split
Full rationale
The paper reports accuracy, sensitivity, specificity, AUROC and F1-score on a patient-wise 10% test split after 70/20 train/validation partitioning of the 751-subject dataset. This constitutes standard independent evaluation rather than any reduction of the claimed performance to fitted parameters or self-referential definitions. No equations, derivations, or uniqueness theorems appear in the abstract or summary that would equate outputs to inputs by construction. No load-bearing self-citations or ansatz smuggling are evident. The result remains falsifiable on external data and does not exhibit any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
free parameters (1)
- Deep network hyperparameters and fusion parameters
axioms (2)
- domain assumption: Phonocardiogram signals recorded at standard valve locations contain sufficient information to distinguish CHD from non-CHD cases
- domain assumption: Cardiologist-provided labels constitute accurate ground truth for training and evaluation