arxiv: 2606.12252 · v1 · pith:ZGQAIVIEnew · submitted 2026-06-10 · 💻 cs.LG · cs.AI

Using Explainability as a Training-Time Reliability Signal for Efficient ECG Classification

Pith reviewed 2026-06-27 10:12 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords ECG classification · explainability · Grad-CAM · training efficiency · reliability signal · progressive data selection · clinical time-series

The pith

Explanation quality from Grad-CAM can serve as a training signal to select reliable samples and reduce costs in ECG classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to establish that the quality of model explanations offers a practical way to decide which ECG samples deserve continued gradient updates during training. It replaces reliance on model confidence with a focus score computed from Grad-CAM attention maps, keeping samples whose predictions rest on coherent localised patterns and dropping those whose uncertainty stems from noise or ambiguity. This selective approach is meant to lower the number of samples that consume full training effort while preserving or improving classification performance. A reader would care because clinical time-series tasks often run on limited hardware, so any method that trims compute without harming accuracy could make repeated model development feasible in more settings. The reported results on three datasets and several architectures show gains in macro-F1 together with lower effective training cost.

Core claim

ERTS computes a focus score from Grad-CAM attention maps on candidate samples and uses that score to filter samples for gradient updates, retaining only those with coherent and localised attention while excluding low-focus examples. This replaces the confidence-based criterion in progressive data dropout so that samples difficult due to noise are removed earlier. The method produces higher macro-F1 scores with reduced training cost across three ECG datasets and multiple backbone architectures.

What carries the argument

The focus score derived from Grad-CAM attention maps, which measures whether a prediction rests on coherent and localised patterns to decide whether a sample continues to receive gradient updates.
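
The text available here does not give the paper's focus-score formula, so the following is only a sketch of one plausible definition. It assumes a non-negative 1-D attention map over time steps and scores it as one minus its normalised Shannon entropy. The name `focus_score`, the `eps` guard, and the entropy-based definition itself are illustrative assumptions, not the authors' method.

```python
import math
from typing import Sequence


def focus_score(attention: Sequence[float], eps: float = 1e-12) -> float:
    """Hypothetical focus score: 1 - normalised Shannon entropy of the map.

    Returns a value in [0, 1]. 1.0 means all attention sits on one time step;
    0.0 means attention is spread uniformly (or the map is empty / all zero).
    """
    # Grad-CAM maps are non-negative after ReLU; clip defensively anyway.
    weights = [max(a, 0.0) for a in attention]
    total = sum(weights)
    n = len(weights)
    if n < 2 or total <= eps:
        return 0.0
    # Treat the normalised map as a probability distribution over time steps.
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(n)


peaked = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # one sharp activation
diffuse = [1.0] * 8                                  # attention everywhere
print(focus_score(peaked), focus_score(diffuse))
```

An entropy-based score captures how localised the attention is but not whether it is contiguous or clinically meaningful, so it covers only part of what the text calls "coherent and localised".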

If this is right

  • Higher macro-F1 scores are obtained while the number of samples receiving full gradient updates is reduced.
  • The method distinguishes unreliable uncertainty from informative uncertainty more effectively than confidence alone.
  • Performance gains hold across three different ECG datasets and several backbone architectures.
  • The approach can be layered on existing progressive data selection pipelines without changing the underlying loss or architecture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same focus score could be reused at inference time to flag individual predictions whose explanations are diffuse.
  • The technique may transfer to other clinical time-series problems where attention maps can be computed and where noise is common.
  • Combining the explanation-based filter with confidence or loss-based criteria could produce a hybrid selection rule that further reduces training cost.

Load-bearing premise

The focus score from Grad-CAM attention maps reliably separates samples that are difficult because of noise or ambiguity from those that are difficult because the model itself is making errors on informative data.

What would settle it

Training the same architectures on the same ECG datasets with ERTS and with standard progressive dropout, then finding that the ERTS version shows lower macro-F1 or higher effective cost, would indicate the focus score does not provide the claimed advantage.
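
The comparison described above can be sketched as follows. Macro-F1 is the standard unweighted mean of per-class F1. The cost metric is an assumption: the text here does not define "effective cost", so the sketch uses the fraction of sample-level backward passes relative to full-dataset training. The function names are illustrative.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def effective_cost(samples_per_epoch: Sequence[int], dataset_size: int) -> float:
    """Assumed metric: backpropagated samples as a fraction of full training."""
    return sum(samples_per_epoch) / (dataset_size * len(samples_per_epoch))


def contradicts_claim(f1_erts: float, cost_erts: float,
                      f1_pdd: float, cost_pdd: float) -> bool:
    """True if ERTS is worse on either axis than the confidence-only baseline."""
    return f1_erts < f1_pdd or cost_erts > cost_pdd


# Toy numbers only; they are not results from the paper.
print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
print(effective_cost([100, 60, 40], dataset_size=100))
```

A real comparison would also need repeated seeds, since a single run cannot separate a genuine difference from run-to-run variance.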

Figures

Figures reproduced from arXiv: 2606.12252 by Shreyank N Gowda, Veerendhra Kumar Dangeti, Xiao Gu, Ying Weng.

Figure 1: Standard confidence-based data selection treats all uncertain samples as equally informative. However, uncertainty may arise from meaningful but under-learned patterns or from noise and ambiguity. We use explanation quality to distinguish between these cases. Samples with focused and clinically meaningful attention are retained, while those with diffuse or unreliable explanations are filtered out, leading …
Figure 2: Overview of the proposed ERTS framework. At each epoch, candidate samples are first selected using confidence-based filtering (PDD). Explanation quality is then used to refine this subset by retaining samples with focused and reliable attention. The selected subset is used for backpropagation, and the model is updated iteratively. Over training, the effective dataset size decreases, and in the final epoch …
Figure 3: Pareto comparison between predictive perfor…
Figure 4: Absolute number of training samples used for…
Figure 5: Two-stage filtering behaviour of ERTS in the…
Figure 6: Class distribution of samples retained by ERTS, removed by the Grad-CAM filter, and removed by confidence-based DBPD filtering. Results are shown for the PTB-XL EfficientNetV2-S model using τ = 0.3 and φ = 0.7.
Figure 7: Log-scale view of cumulative sample counts retained by ERTS, removed by the Grad-CAM filter, and removed by confidence-based DBPD filtering throughout training. The logarithmic scale highlights large differences in filtering behaviour across classes.
Figure 8: Qualitative examples of samples retained and filtered by ERTS. Each panel contrasts a retained sample with a dropped sample using Grad-CAM attention maps and focus scores. Retained samples generally show more coherent and localised attention around ECG morphology, while dropped samples show more diffuse activation patterns. These examples illustrate how ERTS uses explanation quality to distinguish informat…
V. K. Dangeti, S. N. Gowda: Preprint submitted to Elsevier

Abstract

Training deep neural networks for clinical time-series analysis is computationally demanding, yet many healthcare settings lack the resources required for repeated model development and deployment. This challenge is particularly evident in electrocardiogram classification, where large datasets and long training schedules make efficiency practically important. Progressive Data Dropout reduces training cost by excluding samples from gradient updates once they are learned, but it relies on model confidence and may retain samples that are difficult due to noise or ambiguity rather than useful signal. In this work, we introduce ERTS, an explainability-based reliability training signal for efficient ECG classification. ERTS uses explanation quality during training to distinguish between informative and unreliable uncertainty. Building on progressive data selection, we compute Grad-CAM attention maps for candidate samples and derive a focus score that measures whether model predictions are supported by coherent and localised patterns. Samples with low focus are filtered out, while those with meaningful attention are prioritised for gradient updates. We evaluate ERTS across three ECG datasets and multiple backbone architectures, showing consistent improvements in macro-F1 alongside reduced effective training cost. These results suggest that explanation quality can serve as a practical signal for improving both efficiency and reliability in clinical time-series learning. Code will be released.
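
The two-stage selection described in the abstract can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the authors' implementation. It treats a sample as "not yet learned" when its true-class confidence is below `conf_threshold`, and computes the focus score only for those candidates through a caller-supplied `focus_fn`. Both threshold names and the example values are hypothetical.

```python
from typing import Callable, List, Sequence


def select_for_backprop(
    confidences: Sequence[float],
    focus_fn: Callable[[int], float],
    conf_threshold: float,
    focus_threshold: float,
) -> List[int]:
    """Return indices of samples that should receive gradient updates."""
    # Stage 1 (confidence-based, in the spirit of Progressive Data Dropout):
    # samples the model already predicts confidently are treated as learned.
    candidates = [i for i, c in enumerate(confidences) if c < conf_threshold]
    # Stage 2 (explanation quality): score only the candidates, so the
    # attention maps are never computed for samples dropped in stage 1.
    return [i for i in candidates if focus_fn(i) >= focus_threshold]


# Toy values only. Index 0 is confidently learned, so it has no focus entry.
confidences = [0.95, 0.40, 0.35, 0.10]
focus_lookup = {1: 0.8, 2: 0.2, 3: 0.9}
selected = select_for_backprop(
    confidences, focus_lookup.__getitem__, conf_threshold=0.5, focus_threshold=0.7
)
print(selected)  # indices 1 and 3: uncertain, but with focused attention
```

Index 2 is the case the method targets: the model is uncertain about it, but its attention is diffuse, so it is dropped rather than kept as a hard example.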

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces ERTS, an explainability-based reliability training signal for efficient ECG classification. Building on progressive data selection, it computes Grad-CAM attention maps during training to derive a focus score measuring coherent and localised patterns in model predictions. Low-focus samples are filtered from gradient updates while high-focus samples are prioritised, with the goal of distinguishing informative uncertainty from noise or ambiguity. The method is evaluated on three ECG datasets across multiple backbone architectures, reporting consistent macro-F1 improvements alongside reduced effective training cost.

Significance. If the results hold and the focus-score assumption is validated, the work would demonstrate a practical use of explanation quality as a training-time signal to improve both efficiency and reliability in clinical time-series tasks. This is relevant for resource-limited healthcare settings where repeated full-dataset training is costly. The planned code release would aid reproducibility.

major comments (1)
  1. [Abstract] Abstract (paragraph describing ERTS): the central claim requires that the focus score derived from Grad-CAM reliably identifies samples difficult due to noise/ambiguity rather than model error. No explicit validation (e.g., correlation with expert-labeled noise, comparison to oracle difficulty, or ablation on class-imbalance effects) is described; this assumption is load-bearing because the progressive training loop uses the score to decide which samples receive gradient updates. If low-focus samples instead reflect early-training instability or imbalance, filtering removes useful signal and the reported efficiency/reliability gains become illusory.
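
One way to run the validation the referee asks for, if per-sample noise annotations were available, is to measure how well a low focus score ranks noisy samples ahead of clean ones. The sketch below computes that as an AUROC by pairwise comparison. The annotation list `is_noisy` is hypothetical, and the function name is illustrative.

```python
from typing import Sequence


def auroc_low_focus_vs_noise(focus: Sequence[float], is_noisy: Sequence[bool]) -> float:
    """Probability that a random noisy sample has lower focus than a clean one.

    1.0 means low focus perfectly flags noise, 0.5 means no relationship.
    Ties count as half a win (the Mann-Whitney U convention).
    """
    noisy = [f for f, n in zip(focus, is_noisy) if n]
    clean = [f for f, n in zip(focus, is_noisy) if not n]
    if not noisy or not clean:
        raise ValueError("need at least one noisy and one clean sample")
    wins = 0.0
    for f_noisy in noisy:
        for f_clean in clean:
            if f_noisy < f_clean:
                wins += 1.0
            elif f_noisy == f_clean:
                wins += 0.5
    return wins / (len(noisy) * len(clean))


# Toy values only; these are not measurements from any ECG dataset.
print(auroc_low_focus_vs_noise([0.9, 0.8, 0.2, 0.3], [False, False, True, True]))
```

The pairwise loop is quadratic in the number of samples, which is fine for a small annotated audit subset. A rank-based formulation would be preferable at dataset scale.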

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive feedback and the opportunity to clarify the assumptions underlying ERTS. The major comment concerns the lack of explicit validation for the focus score. We address this point below and indicate planned revisions.

  1. Referee: [Abstract] Abstract (paragraph describing ERTS): the central claim requires that the focus score derived from Grad-CAM reliably identifies samples difficult due to noise/ambiguity rather than model error. No explicit validation (e.g., correlation with expert-labeled noise, comparison to oracle difficulty, or ablation on class-imbalance effects) is described; this assumption is load-bearing because the progressive training loop uses the score to decide which samples receive gradient updates. If low-focus samples instead reflect early-training instability or imbalance, filtering removes useful signal and the reported efficiency/reliability gains become illusory.

    Authors: We agree that direct validation of the focus score would strengthen the central claim. The manuscript does not report correlations with expert-labeled noise or oracle difficulty measures, as these annotations are unavailable in the public ECG datasets used. However, the consistent macro-F1 improvements and reduced training cost across three datasets and multiple architectures provide empirical support that the focus score prioritizes samples with coherent attention patterns. To mitigate concerns about class imbalance and early-training instability, we will add ablations in the revised manuscript that track focus-score distributions over epochs and stratified by class frequency. These analyses will help confirm that low-focus filtering does not primarily remove useful signal due to imbalance or instability.

    Revision: partial
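
The ablation promised in this response, tracking focus-score distributions over epochs and by class, could be logged along the following lines. This is a minimal sketch. The record layout `(epoch, class_label, focus_score)` and the function name are assumptions, and the class labels in the example are placeholders.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, Hashable, Iterable, Tuple

Record = Tuple[int, Hashable, float]  # (epoch, class_label, focus_score)


def summarise_focus(records: Iterable[Record]) -> Dict[tuple, Dict[str, float]]:
    """Group focus scores by (epoch, class) and report simple summaries."""
    groups = defaultdict(list)
    for epoch, label, score in records:
        groups[(epoch, label)].append(score)
    # Sort by (epoch, class) so the output reads as a per-epoch table.
    return {
        key: {"n": len(scores), "mean": mean(scores), "median": median(scores)}
        for key, scores in sorted(groups.items())
    }


# Toy records only; "A" and "B" stand in for frequent and rare classes.
records = [(0, "A", 0.2), (0, "A", 0.4), (0, "B", 0.1), (1, "A", 0.8)]
for key, stats in summarise_focus(records).items():
    print(key, stats)
```

If rare classes show systematically lower focus in early epochs, that would point to the imbalance or instability effect the referee raises rather than to noise.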

standing simulated objections not resolved
  • Direct correlation analysis against expert-labeled noise annotations, which are not present in the evaluated public datasets.

Circularity Check

0 steps flagged

No significant circularity; derivation uses external Grad-CAM without self-referential reduction

The paper defines ERTS by applying standard Grad-CAM (cited from prior external literature) to compute a focus score on attention maps, then uses that score to filter samples during progressive data selection. No equations appear that equate the focus score or selection outcome to fitted parameters by construction. The method does not rename known results, smuggle ansatzes via self-citation, or invoke uniqueness theorems from the authors' prior work. The central claim rests on an empirical assumption about what the focus score measures, but that assumption is not forced by definition or self-reference; the derivation chain remains independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no concrete free parameters, axioms, or invented entities are extractable beyond the high-level description of the ERTS method itself.

pith-pipeline@v0.9.1-grok · 5747 in / 1158 out tokens · 27410 ms · 2026-06-27T10:12:08.247749+00:00 · methodology
