HeartBeatAI: An Interpretable and Robust Deep Learning Framework for Multi-Label ECG Arrhythmia Detection

Nikhil Panwar; Partha Pratim Roy; Shubham Gupta

arxiv: 2605.24588 · v1 · pith:Q5GDS6JVnew · submitted 2026-05-23 · 💻 cs.AI · cs.LG

Pith reviewed 2026-06-30 13:56 UTC · model grok-4.3

keywords ECG arrhythmia detection · deep learning · domain generalization · multi-label classification · 12-lead ECG · MixStyle regularization · SE ResNet · interpretability

The pith

HeartBeatAI reaches 98% Macro F1-score for multi-label ECG arrhythmia detection within datasets but degrades for rare anomalies under domain shift.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents HeartBeatAI as a deep learning framework for classifying multiple arrhythmias from 12-lead ECG recordings. It combines a Squeeze-and-Excitation ResNet to focus on diagnostic leads, a Multi-Layer Concentration Pipeline to aggregate features across scales, and regularization steps including MixStyle and label smoothing to improve generalization. Tests on four large datasets yield strong results (98% Macro F1-score) when training and test data come from the same source, yet Leave-One-Domain-Out (LODO) evaluations expose clear drops, especially for uncommon anomaly classes. The work matters to a sympathetic reader because it shows concrete performance numbers alongside evidence that cross-institution deployment remains difficult despite these techniques.

Core claim

By integrating domain generalization methods with multi-scale feature extraction and explainability components, HeartBeatAI achieves a 98% Macro F1-score in intra-source evaluations on multiple ECG datasets for multi-label arrhythmia classification, yet evaluations using Leave-One-Domain-Out protocols indicate substantial degradation particularly in identifying infrequent anomalies, underscoring ongoing difficulties in achieving robust cross-institutional performance.
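Because the headline number is a Macro F1-score, it may help to see how that metric treats rare classes. The following is a minimal sketch, not the paper's evaluation code: the function name `macro_f1`, the toy labels, and the convention of scoring an empty class as 0 are all assumptions made for illustration.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-class F1 over binary multi-label indicator rows."""
    n_classes = len(y_true[0])
    scores: List[float] = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no positives and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Toy example (hypothetical labels): class 0 is common, classes 1 and 2 are rare.
y_true = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0]]
print(round(macro_f1(y_true, y_pred), 3))  # 0.333
```

In this toy case the common class is predicted perfectly, yet missing the two rare classes pulls the macro average down to one third. This is why a macro-averaged score is sensitive to exactly the rare-anomaly failures the LODO results describe.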

What carries the argument

The Squeeze-and-Excitation ResNet paired with a Multi-Layer Concentration Pipeline that isolates diagnostic leads and captures both macro-rhythm and micro-morphological anomalies.
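A squeeze-and-excitation gate can be sketched in a few lines. The sketch below follows the generic SE recipe (global average pooling, a bottleneck layer with ReLU, a sigmoid gate, channel rescaling) with biases omitted. Whether HeartBeatAI applies the gate to the 12 raw leads or to learned feature channels is not stated in the abstract, so treating each row of `x` as one channel is an assumption, as are the names `se_gate`, `w1`, and `w2`.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def se_gate(x: Matrix, w1: Matrix, w2: Matrix) -> Tuple[Matrix, List[float]]:
    """Apply a squeeze-and-excitation gate to x (channels x time).

    w1 has shape (hidden x channels), w2 has shape (channels x hidden).
    Returns the rescaled signal and the per-channel gates.
    """
    # Squeeze: one summary value per channel (global average over time).
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, h)))) for row in w2]
    # Scale: multiply every sample of a channel by that channel's gate.
    y = [[si * v for v in ch] for si, ch in zip(s, x)]
    return y, s


# Hypothetical two-channel input with hand-picked weights.
x = [[1.0, 3.0], [2.0, 2.0]]
w1 = [[1.0, 0.0]]        # hidden size 1
w2 = [[1.0], [-1.0]]     # pushes channel 0 up and channel 1 down
y, gates = se_gate(x, w1, w2)
print([round(g, 3) for g in gates])
```

The gates lie strictly between 0 and 1, so the block can only attenuate channels relative to each other. Reading the gate values per lead is one plausible route to the lead-level interpretability the paper claims, but that reading is an inference, not something the abstract confirms.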

If this is right

The framework reliably handles simultaneous multi-label arrhythmia classification when data distributions match between training and test sets.
MixStyle regularization and label smoothing reduce but do not eliminate degradation on rare classes during domain-shift tests.
Inclusion of clinical explainability components supports potential use in medical settings.
LODO results indicate that further advances are needed before reliable deployment across different recording sites.
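For readers unfamiliar with MixStyle, the core operation mixes the feature statistics of two training instances while keeping the normalized content of the first. The sketch below shows that operation on a single 1D feature channel in plain Python. It is an illustration under assumptions: the function name `mixstyle_1d`, the default `alpha=0.1`, and the use of the time axis for the statistics are choices made here, and the paper's actual placement of MixStyle inside the network is not described in the abstract.

```python
import math
import random
from typing import List, Optional, Tuple


def _stats(x: List[float], eps: float = 1e-6) -> Tuple[float, float]:
    """Mean and (eps-stabilized) standard deviation over the time axis."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return mu, math.sqrt(var + eps)


def mixstyle_1d(
    x: List[float],
    x_other: List[float],
    alpha: float = 0.1,            # assumed Beta concentration
    lam: Optional[float] = None,   # fix lam for reproducible demos
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Re-style x with a convex mix of its own and x_other's statistics."""
    if lam is None:
        rng = rng or random.Random()
        lam = rng.betavariate(alpha, alpha)  # lam ~ Beta(alpha, alpha)
    mu, sig = _stats(x)
    mu_o, sig_o = _stats(x_other)
    mu_mix = lam * mu + (1.0 - lam) * mu_o
    sig_mix = lam * sig + (1.0 - lam) * sig_o
    # Normalize x, then re-scale and re-shift with the mixed statistics.
    return [sig_mix * (v - mu) / sig + mu_mix for v in x]


# Hypothetical features from two recordings with different offset and scale.
a = [0.0, 1.0, 2.0, 3.0]
b = [10.0, 12.0, 14.0, 16.0]
print(mixstyle_1d(a, b, lam=0.5))
```

With `lam=1.0` the input passes through unchanged, and with `lam=0.0` it takes on the other instance's mean and spread while keeping its own shape. The intuition is that site-specific offset and gain act like "style", so mixing them during training exposes the model to synthetic in-between sites.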

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Training on ECG recordings drawn from a broader set of institutions could narrow the performance gap seen in LODO tests.
The lead-isolation and multi-scale pipeline could transfer to classification tasks on other time-series biosignals that face similar distribution shifts.
Detailed per-anomaly breakdowns from the LODO runs could reveal which specific rare classes drive most of the cross-domain loss.

Load-bearing premise

That the four datasets and the LODO protocol sufficiently capture real-world domain shifts between institutions and that observed performance drops stem mainly from those shifts rather than label noise or other factors.
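The LODO protocol this premise rests on is simple to state in code: each dataset takes one turn as the unseen test domain while the model trains on the rest. The sketch below is a generic illustration; the function name `lodo_splits` and the domain names and record ids are hypothetical placeholders, since the four datasets are not named in the abstract.

```python
from typing import Dict, Iterator, List, Tuple


def lodo_splits(
    domains: Dict[str, List[str]]
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_name, train_records, test_records), one fold per domain."""
    for held_out in sorted(domains):
        train = [
            rec
            for name in sorted(domains)
            if name != held_out
            for rec in domains[name]
        ]
        test = list(domains[held_out])
        yield held_out, train, test


# Hypothetical record ids grouped by source institution.
domains = {
    "site_a": ["a1", "a2"],
    "site_b": ["b1"],
    "site_c": ["c1", "c2", "c3"],
    "site_d": ["d1"],
}
for held_out, train, test in lodo_splits(domains):
    print(held_out, len(train), len(test))
```

Note that with only four domains there are only four folds, so each fold's result reflects one particular pairing of training sites and test site. That is part of why the premise above, that these four datasets represent real-world shift, carries so much weight.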

What would settle it

Running the same framework on new ECG collections from additional institutions and finding no drop in Macro F1-score for rare anomalies would challenge the claim that domain shift creates persistent cross-institutional challenges.

Original abstract

While Deep Learning (DL) enhances automated electrocardiogram (ECG) analysis, clinical deployment is hindered by class imbalance and the generalization gap. This paper presents HeartBeatAI, a deep learning framework combining domain generalization, multi-scale feature aggregation, and clinical explainability for robust 12-lead ECG classification. Moving beyond image-based paradigms, HeartBeatAI integrates a Squeeze-and-Excitation (SE) ResNet to isolate diagnostic leads alongside a Multi-Layer Concentration Pipeline to capture macro-rhythm and micro-morphological anomalies. To mitigate domain shift, the framework employs MixStyle regularization and Label Smoothing. Rigorous benchmarking across four large-scale datasets using intra-source and Leave-One-Domain-Out (LODO) protocols demonstrates high performance (98% Macro F1-score) under intra-source conditions. However, LODO evaluations reveal significant degradation in detecting rare anomalies, highlighting a persistent challenge in cross-institutional deployment.
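Label smoothing in a multi-label setting is commonly applied per class to the binary targets. The sketch below uses that common binary form, moving each target a distance of eps/2 toward 0.5, together with a plain binary cross-entropy. The abstract does not say which variant or which eps the authors use, so the formula choice, `eps=0.1`, and the function names are assumptions.

```python
import math
from typing import List


def smooth_targets(y: List[float], eps: float = 0.1) -> List[float]:
    """Binary label smoothing: 1 -> 1 - eps/2, 0 -> eps/2 (eps is assumed)."""
    return [t * (1.0 - eps) + 0.5 * eps for t in y]


def bce(probs: List[float], targets: List[float]) -> float:
    """Mean binary cross-entropy over classes, clamped for numerical safety."""
    tiny = 1e-12
    total = 0.0
    for p, t in zip(probs, targets):
        total += t * math.log(max(p, tiny)) + (1.0 - t) * math.log(max(1.0 - p, tiny))
    return -total / len(probs)


hard = [1.0, 0.0]
soft = smooth_targets(hard)
confident = [0.999, 0.001]   # near-saturated predictions
moderate = [0.95, 0.05]      # predictions that match the smoothed targets

# Against hard targets, saturation is rewarded; against smoothed targets it is not.
print(bce(confident, hard) < bce(moderate, hard))
print(bce(confident, soft) > bce(moderate, soft))
```

The design point is that smoothed targets stop rewarding ever more extreme logits, which is one reason the technique is often paired with domain generalization methods: over-confident predictions tend to transfer poorly to unseen sites.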

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies standard DL components like SE-ResNet and MixStyle to ECG classification and reports intra-source performance with LODO degradation, but provides no dataset details or baselines so the claims cannot be checked.

The one thing to know is that HeartBeatAI combines an SE-ResNet backbone, multi-scale feature aggregation, MixStyle regularization, and label smoothing for 12-lead multi-label ECG arrhythmia detection, claiming 98% macro F1 inside sources but clear drops in leave-one-domain-out tests on four datasets.

The work does tackle a practical issue: models that work on one hospital's ECG data often fail on another's. Testing across multiple datasets with both same-source and cross-source protocols is better than the usual single-dataset setup, and adding an explainability angle is reasonable for clinical interest.

The soft spots are substantial and central. The abstract states the performance numbers and the LODO degradation without naming the datasets, their sizes, class frequencies, annotation rules, or acquisition differences. No baselines from prior ECG methods appear, no statistical tests, and no error analysis. This leaves open whether the observed drop comes from domain shift or from unmeasured factors like differing label noise or imbalance severity across sources. Without those details the main conclusion about persistent cross-institutional challenges stays unverified.

The methods are described at a high level only, with no implementation specifics or code. The components themselves are established rather than new.

This paper would mainly interest people already working on applied ECG AI who want to see one more regularization stack tried on the problem. A reader looking for rigorous evidence or a new technique will not find it. I would not bring it to a reading group unless the group specifically covers medical signal processing applications, and I would not cite it. It does not deserve peer review in this form because the central empirical claims cannot be assessed from what is provided.

Referee Report

2 major / 1 minor

Summary. The paper introduces HeartBeatAI, a deep learning framework for multi-label 12-lead ECG arrhythmia detection. It combines an SE-ResNet with multi-scale feature aggregation via a Multi-Layer Concentration Pipeline, incorporates MixStyle regularization and Label Smoothing for domain generalization, and aims for clinical explainability. Benchmarking on four large-scale datasets shows 98% Macro F1-score in intra-source settings but significant degradation in Leave-One-Domain-Out (LODO) evaluations, particularly for rare anomalies, underscoring challenges in cross-institutional deployment.

Significance. If the empirical results can be verified with full methodological details, the framework could advance robust and interpretable ECG analysis for clinical use by addressing domain shift and class imbalance. However, the current presentation lacks the necessary details to assess its contribution relative to existing methods.

major comments (2)

Abstract: The reported 98% Macro F1-score under intra-source conditions is presented without any baselines, statistical tests, implementation details, or error analysis, rendering the central performance claim unverifiable.
LODO protocol description: The attribution of performance degradation in LODO to domain shift between institutions lacks supporting information on dataset sizes, per-class frequencies, annotation protocols, or acquisition parameters; without these, alternative explanations such as label noise or varying class imbalance cannot be ruled out.
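The second major comment asks for per-class frequencies by dataset. A minimal sketch of that tabulation follows, assuming labels are stored as binary indicator rows per domain. The function name `class_prevalence`, the domain names, and the toy labels are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def class_prevalence(labels_by_domain: Dict[str, List[List[int]]]) -> Dict[str, List[float]]:
    """Fraction of recordings carrying each label, computed separately per domain."""
    out: Dict[str, List[float]] = {}
    for name, rows in labels_by_domain.items():
        n_records = len(rows)
        n_classes = len(rows[0])
        out[name] = [sum(r[c] for r in rows) / n_records for c in range(n_classes)]
    return out


# Hypothetical two-class labels from two sources.
labels = {
    "site_a": [[1, 0], [1, 0], [1, 1], [0, 0]],
    "site_b": [[1, 0], [0, 0]],
}
for name, prev in class_prevalence(labels).items():
    print(name, prev)
# site_a [0.75, 0.25]
# site_b [0.5, 0.0]
```

In this toy case the second class never appears in `site_b`, so any LODO fold that holds out `site_b` cannot measure recall on that class at all. A table like this is what would let a reader separate domain shift from plain differences in class imbalance.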

minor comments (1)

Abstract: The term 'large-scale datasets' is used without specifying the actual dataset names or sizes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below and indicate the revisions we will make.

Referee: Abstract: The reported 98% Macro F1-score under intra-source conditions is presented without any baselines, statistical tests, implementation details, or error analysis, rendering the central performance claim unverifiable.

Authors: We agree that the abstract's brevity limits inclusion of supporting elements. The full manuscript (Sections 3 and 4) provides the requested details: comparisons to multiple baselines, paired statistical tests, implementation hyperparameters, and per-class error breakdowns. To improve verifiability of the headline claim, we will revise the abstract to briefly note the intra-source benchmarking protocol and to reference the state-of-the-art comparisons.
Revision: yes
Referee: LODO protocol description: The attribution of performance degradation in LODO to domain shift between institutions lacks supporting information on dataset sizes, per-class frequencies, annotation protocols, or acquisition parameters; without these, alternative explanations such as label noise or varying class imbalance cannot be ruled out.

Authors: We concur that expanded dataset characterization is needed to strengthen the domain-shift interpretation. The current manuscript references the four public datasets and their source publications but does not tabulate the requested statistics in the LODO section. We will add an explicit table (or expanded subsection) listing dataset sizes, per-class frequencies across domains, annotation sources, and acquisition parameters to allow readers to evaluate alternative explanations such as label noise or imbalance differences.
Revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical benchmarking with no derivations or self-referential reductions

The paper reports direct experimental outcomes from training and evaluating a DL model (SE-ResNet + Multi-Layer Concentration Pipeline + MixStyle + Label Smoothing) on four datasets under intra-source and LODO protocols, yielding measured metrics such as 98% Macro F1. No equations, parameter-fitting steps presented as predictions, uniqueness theorems, or self-citation chains appear in the abstract or described content. All claims are falsifiable performance statements grounded in external data splits rather than reducing to inputs by construction. The absence of any derivation chain makes circularity analysis inapplicable; the reader's assigned score of 2 reflects this lack of mathematical structure.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper relies on standard deep learning assumptions such as the ability of neural networks to learn discriminative features from labeled ECG data and the validity of LODO as a proxy for domain shift. No new free parameters, axioms, or invented entities are introduced in the abstract.

pith-pipeline@v0.9.1-grok · 5696 in / 1076 out tokens · 36022 ms · 2026-06-30T13:56:01.627923+00:00 · methodology

