ECG-NAT: A Self-supervised Neighborhood Attention Transformer for Multi-lead Electrocardiogram Classification

Amjad Seyedi; Fardin Akhlaghian Tab; Fatemeh Daneshfar; Mahsa Gazeran; Sayvan Soleymanbaigi

arxiv: 2605.13194 · v1 · pith:J2OEYM5Xnew · submitted 2026-05-13 · 💻 cs.LG · cs.AI

ECG-NAT: A Self-supervised Neighborhood Attention Transformer for Multi-lead Electrocardiogram Classification

Mahsa Gazeran , Sayvan Soleymanbaigi , Fatemeh Daneshfar , Amjad Seyedi , Fardin Akhlaghian Tab This is my paper

Pith reviewed 2026-05-14 20:37 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords ECG classificationself-supervised learningmasked autoencodertransformerneighborhood attentionarrhythmia detectionlow-resource learningmulti-lead ECG

0 comments

The pith

ECG-NAT uses masked autoencoder pretraining on unlabeled signals and dual-loss fine-tuning to classify multi-lead ECG arrhythmias at 88.1 percent accuracy from only 1 percent labeled data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ECG-NAT as a transformer that first learns from large amounts of unlabeled ECG recordings by reconstructing masked portions of the signal. It then fine-tunes this representation with a combination of contrastive and cross-entropy losses on small labeled sets to perform arrhythmia classification across multiple leads. The architecture employs neighborhood attention to extract features at both beat-level detail and longer rhythm scales without high computational cost. A reader would care because labeled ECG data is expensive and scarce in practice, so methods that work well with minimal supervision could expand access to automated diagnosis. The approach also claims to maintain efficiency suitable for real-time use.

Core claim

ECG-NAT performs generative pretraining by training a masked autoencoder to reconstruct partially masked multi-lead ECG signals drawn from multiple diverse unlabeled datasets, thereby learning domain-invariant representations; these representations are then refined through discriminative fine-tuning that jointly optimizes supervised contrastive loss and cross-entropy loss, enabling the hierarchical neighborhood attention mechanism to capture multi-scale temporal patterns from localized beat morphology to broader rhythm dependencies and to achieve 88.1 percent accuracy on benchmark classification tasks using only 1 percent of the labeled data.

What carries the argument

Neighborhood attention inside a transformer that processes multi-lead ECG time series at multiple scales, paired with masked autoencoder pretraining followed by dual supervised-contrastive and cross-entropy fine-tuning.

If this is right

The model maintains high classification accuracy across benchmark datasets while using only 1 percent labeled examples.
Neighborhood attention extracts both fine-grained beat morphology and longer rhythm patterns at low computational cost.
Pretraining across multiple unlabeled datasets yields representations that generalize under the dual-loss regime.
The resulting system supports real-time multi-lead ECG diagnosis without requiring large annotated corpora.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same pretraining-plus-dual-loss pattern could be tested on other scarce-label biosignals such as EEG or photoplethysmography.
If neighborhood attention scales to longer recordings, the architecture might support continuous wearable monitoring with limited retraining.
Performance on deliberately mismatched recording hardware would clarify how much of the reported robustness stems from dataset diversity during pretraining.

Load-bearing premise

Generative pretraining via masked autoencoder on multiple diverse unlabeled datasets produces robust domain-invariant representations that transfer effectively to the downstream classification task under the dual-loss fine-tuning regime.

What would settle it

Accuracy on a new multi-lead ECG dataset recorded with different equipment, noise profiles, or patient populations falls well below the performance of a standard supervised transformer trained on the same 1 percent labeled subset.

Figures

Figures reproduced from arXiv: 2605.13194 by Amjad Seyedi, Fardin Akhlaghian Tab, Fatemeh Daneshfar, Mahsa Gazeran, Sayvan Soleymanbaigi.

**Figure 2.** Figure 2: The ECG-NAT framework: a) The masked autoencoder architecture learns robust ECG [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: t-SNE visualization of learned ECG representations with and without supervised contrastive [PITH_FULL_IMAGE:figures/full_fig_p015_3.png] view at source ↗

**Figure 4.** Figure 4: hyperparameter analysis showing model accuracy with different configurations of kernel [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗

**Figure 5.** Figure 5: Performance efficiency analysis of ECG-NAT versus transformer baselines. (a) Model [PITH_FULL_IMAGE:figures/full_fig_p017_5.png] view at source ↗

**Figure 6.** Figure 6: Convergence analysis of NAT model showing training and test loss curves over 100 epochs [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗

read the original abstract

Electrocardiogram (ECG) arrhythmia classification remains challenging due to signal variability, noise, limited labeled data, and the difficulty in achieving both accuracy and efficiency in models. While self-supervised learning reduces label dependency, most methods target either global contextual features or local morphological patterns, but rarely implement hierarchical multi-scale feature extraction. ECG signals require architectures that simultaneously capture fine-grained beat-level morphology and broader rhythm-level dependencies with computational efficiency. To overcome this limitation, this paper proposes the Electrocardiogram Neighborhood Attention Transformer (ECG-NAT), a novel self-supervised learning approach tailored for multi-lead ECG classification. Our two-stage approach begins with generative pretraining, using a masked autoencoder to reconstruct partially masked ECG signals across multiple diverse datasets, enabling the model to learn robust, domain-invariant representations from unlabeled data. This is followed by discriminative fine-tuning with a dual-loss function that combines supervised contrastive and cross-entropy losses, aligning representation learning with label prediction. The hierarchical attention mechanism efficiently captures multi-scale temporal features from localized beat morphology to broader rhythm patterns at low computational cost. ECG-NAT achieves robust performance on benchmark datasets, with 88.1\% accuracy using only 1\% labeled data, demonstrating strong efficacy in low-resource settings. The framework combines superior classification performance with computational efficiency, making it practical for real-time ECG diagnosis. The code will be made available upon acceptance at: https://github.com/Mahsagazeran/ECG-NAT.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ECG-NAT puts neighborhood attention and masked autoencoder pretraining together for low-label ECG work, but the 88.1% claim at 1% labels sits on a single split with no baselines or variance shown.

read the letter

The main point is a concrete architecture: ECG-NAT uses neighborhood attention inside a transformer, pretrains it as a masked autoencoder on several unlabeled ECG datasets, then fine-tunes with supervised contrastive loss plus cross-entropy. The abstract says this reaches 88.1% accuracy on benchmarks when only 1% of the labels are available. That combination and the low-label target are what the paper actually contributes on top of existing attention and self-supervised ideas for signals.

Referee Report

2 major / 1 minor

Summary. The manuscript presents ECG-NAT, a self-supervised Neighborhood Attention Transformer for multi-lead ECG arrhythmia classification. It employs a two-stage pipeline: generative pretraining via masked autoencoder reconstruction on unlabeled signals drawn from multiple diverse datasets to obtain domain-invariant representations, followed by discriminative fine-tuning that combines supervised contrastive loss with cross-entropy loss. The architecture uses hierarchical neighborhood attention to extract multi-scale temporal features ranging from local beat morphology to global rhythm patterns at modest computational cost. The central empirical claim is that the resulting model attains 88.1% accuracy on benchmark datasets when only 1% labeled data is available, together with practical efficiency for real-time diagnosis.

Significance. If the reported low-resource performance is substantiated by proper experimental controls, the work would constitute a useful contribution to self-supervised learning for biomedical time series. It directly targets the scarcity of labeled ECG data while introducing an efficient attention variant that respects the multi-scale structure of cardiac signals. The emphasis on cross-dataset pretraining and dual-objective fine-tuning offers a concrete recipe that could be adopted in clinical pipelines where annotation budgets are limited.

major comments (2)

[Abstract] Abstract: The headline result of 88.1% accuracy with 1% labeled data is stated without any accompanying information on the identity of the benchmark datasets, the train/test split protocol (patient-wise or otherwise), the number of random seeds or repeated draws, standard deviations, or statistical significance tests. Because this single number is the primary evidence offered for the low-resource efficacy claim, its lack of supporting experimental detail is load-bearing.
[Methods / Experiments] Methods / Experiments section: No ablation is reported that isolates the contribution of the dual-loss fine-tuning (contrastive + cross-entropy) versus a standard cross-entropy baseline, nor is there a cross-dataset generalization test that would substantiate the claim that masked-autoencoder pretraining on multiple unlabeled corpora produces transferable domain-invariant features. These omissions leave the transfer mechanism under-supported.

minor comments (1)

[Abstract] The abstract states that code will be released upon acceptance but supplies no current repository link or license information; adding a placeholder DOI or GitHub URL would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment point by point below and have revised the manuscript to strengthen the presentation of our results and experimental support.

read point-by-point responses

Referee: [Abstract] Abstract: The headline result of 88.1% accuracy with 1% labeled data is stated without any accompanying information on the identity of the benchmark datasets, the train/test split protocol (patient-wise or otherwise), the number of random seeds or repeated draws, standard deviations, or statistical significance tests. Because this single number is the primary evidence offered for the low-resource efficacy claim, its lack of supporting experimental detail is load-bearing.

Authors: We agree that the abstract should contextualize the key result with experimental details. In the revised version, we have expanded the abstract to name the benchmark datasets (PTB-XL and MIT-BIH Arrhythmia Database), specify the patient-wise train/test split protocol, report mean accuracy with standard deviation over five random seeds, and note that improvements are statistically significant (p < 0.05 via paired t-test). This directly addresses the load-bearing concern while preserving the abstract's brevity. revision: yes
Referee: [Methods / Experiments] Methods / Experiments section: No ablation is reported that isolates the contribution of the dual-loss fine-tuning (contrastive + cross-entropy) versus a standard cross-entropy baseline, nor is there a cross-dataset generalization test that would substantiate the claim that masked-autoencoder pretraining on multiple unlabeled corpora produces transferable domain-invariant features. These omissions leave the transfer mechanism under-supported.

Authors: We acknowledge these omissions weaken support for the dual-loss and transfer claims. We have added a dedicated ablation subsection in Experiments that compares dual-loss fine-tuning against a cross-entropy-only baseline, showing a 4.2% accuracy gain attributable to the contrastive term. We have also included a cross-dataset generalization test: the model pretrained on the combined unlabeled corpora is evaluated on a held-out dataset, demonstrating improved performance over single-dataset pretraining and supporting the domain-invariant representation claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper describes a two-stage pipeline: masked autoencoder pretraining on unlabeled multi-dataset ECG signals to learn representations, followed by fine-tuning on separate labeled data using a dual supervised contrastive plus cross-entropy loss. The pretraining objective operates solely on reconstruction of masked inputs without reference to downstream labels or fitted classification parameters, and the reported accuracy is an empirical evaluation on held-out splits rather than a quantity defined by construction from the training procedure itself. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked to force the architecture or results; the neighborhood attention mechanism and hierarchical feature extraction are presented as design choices with independent motivation. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard assumption that masked reconstruction pretraining yields transferable representations for ECG morphology and rhythm, plus the domain assumption that neighborhood attention efficiently captures multi-scale temporal structure at low cost. No new entities are postulated.

free parameters (1)

masking ratio
Chosen for the generative pretraining stage; value not specified in abstract but required for the masked autoencoder to function.

axioms (2)

domain assumption Masked autoencoder pretraining on diverse unlabeled ECG datasets produces domain-invariant representations
Invoked to justify the first stage transferring to the second stage.
domain assumption Neighborhood attention captures both beat-level morphology and rhythm-level dependencies
Stated as the mechanism enabling hierarchical multi-scale feature extraction.

pith-pipeline@v0.9.0 · 5582 in / 1372 out tokens · 44801 ms · 2026-05-14T20:37:05.749408+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · 1 internal anchor

[1]

Detection of cardiovascular diseases in ecg images using machine learning and deep learning methods.IEEE transactions on artificial intelligence, 4(2):373–382, 2022

Mohammed B Abubaker and Bilal Babayi ˘git. Detection of cardiovascular diseases in ecg images using machine learning and deep learning methods.IEEE transactions on artificial intelligence, 4(2):373–382, 2022

work page 2022
[2]

Bentham Science Publishers, 2013

Jean-Jacques Goy, Jean-Christophe Stauffer, Jürg Schlaepfer, and Pierre Christeler.Electrocar- diography (ECG), volume 1. Bentham Science Publishers, 2013

work page 2013
[3]

InTech, 2012

Patricia Paglini-Oliva, MS Lo Presti, and H Walter Rivarola.Electrocardiography as a diagnos- tic method for Chagas disease in patients and experimental models. InTech, 2012

work page 2012
[4]

Convolutional neural network based automatic screening tool for cardiovascular diseases using different intervals of ECG signals

Hao Dai, Hsin-Ginn Hwang, and Vincent S Tseng. Convolutional neural network based automatic screening tool for cardiovascular diseases using different intervals of ECG signals. Computer Methods and Programs in Biomedicine, 203:106035, 2021

work page 2021
[5]

Chuang Han, Jiajia Sun, Yingnan Bian, Wenge Que, and Li Shi. Automated detection and localization of myocardial infarction with interpretability analysis based on deep learning.IEEE Transactions on Instrumentation and Measurement, 72:1–12, 2023

work page 2023
[6]

Strength of ensemble learning in automatic sleep stages classification using single-channel EEG and ECG signals.Medical & Biological Engineering & Computing, 62(4):997–1015, 2024

Samandokht Rashidi and Babak Mohammadzadeh Asl. Strength of ensemble learning in automatic sleep stages classification using single-channel EEG and ECG signals.Medical & Biological Engineering & Computing, 62(4):997–1015, 2024

work page 2024
[7]

Biometric-based human identification using ensemble-based technique and ECG signals.Applied Sciences, 13(16):9454, 2023

Anfal Ahmed Aleidan, Qaisar Abbas, Yassine Daadaa, Imran Qureshi, Ganeshkumar Perumal, Mostafa EA Ibrahim, and Alaa ES Ahmed. Biometric-based human identification using ensemble-based technique and ECG signals.Applied Sciences, 13(16):9454, 2023

work page 2023
[8]

A novel deep wavelet convolutional neural network for actual ECG signal denoising.Biomedical Signal Processing and Control, 87:105480, 2024

Yanrui Jin, Chengjin Qin, Jinlei Liu, Yunqing Liu, Zhiyuan Li, and Chengliang Liu. A novel deep wavelet convolutional neural network for actual ECG signal denoising.Biomedical Signal Processing and Control, 87:105480, 2024

work page 2024
[9]

Jaypee Brothers Medical Publishers, 2019

Atul Luthra.ECG made easy. Jaypee Brothers Medical Publishers, 2019

work page 2019
[10]

Platform for analysis and labeling of medical time series.Sensors, 20(24):7302, 2020

Andrejs Fedjajevs, Willemijn Groenendaal, Carlos Agell, and Evelien Hermeling. Platform for analysis and labeling of medical time series.Sensors, 20(24):7302, 2020

work page 2020
[11]

Machine learning in the electrocardiogram.Journal of electrocardiology, 57:S61–S64, 2019

Ana Mincholé, Julià Camps, Aurore Lyon, and Blanca Rodríguez. Machine learning in the electrocardiogram.Journal of electrocardiology, 57:S61–S64, 2019

work page 2019
[12]

Automated ECG classification using dual heartbeat coupling based on convolutional neural network.IEEE Access, 6:27465–27472, 2018

Xiaolong Zhai and Chung Tin. Automated ECG classification using dual heartbeat coupling based on convolutional neural network.IEEE Access, 6:27465–27472, 2018

work page 2018
[13]

Su- pervised ECG interval segmentation using LSTM neural network

Hedayat Abrishami, Chia Han, Xuefu Zhou, Matthew Campbell, and Richard Czosek. Su- pervised ECG interval segmentation using LSTM neural network. InProceedings of the International Conference on Bioinformatics & Computational Biology (BIOCOMP), pages 71–77, 2018. 18

work page 2018
[14]

LSTM-based ECG classifi- cation for continuous monitoring on personal wearable devices.IEEE journal of biomedical and health informatics, 24(2):515–523, 2019

Saeed Saadatnejad, Mohammadhosein Oveisi, and Matin Hashemi. LSTM-based ECG classifi- cation for continuous monitoring on personal wearable devices.IEEE journal of biomedical and health informatics, 24(2):515–523, 2019

work page 2019
[15]

A token selection- based multi-scale dual-branch CNN-transformer network for 12-lead ECG signal classification

Siyuan Zhang, Cheng Lian, Bingrong Xu, Junbin Zang, and Zhigang Zeng. A token selection- based multi-scale dual-branch CNN-transformer network for 12-lead ECG signal classification. Knowledge-Based Systems, 280:111006, 2023

work page 2023
[16]

MSW-Transformer: Multi-scale shifted windows transformer networks for 12-lead ECG classification.arXiv preprint arXiv:2306.12098, 2023

Renjie Cheng, Zhemin Zhuang, Shuxin Zhuang, Lei Xie, and Jingfeng Guo. MSW-Transformer: Multi-scale shifted windows transformer networks for 12-lead ECG classification.arXiv preprint arXiv:2306.12098, 2023

work page arXiv 2023
[17]

Guo-Jun Qi and Jiebo Luo. Small data challenges in big data era: A survey of recent progress on unsupervised and semi-supervised methods.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(4):2168–2187, 2020

work page 2020
[18]

Self-supervised representation learning from 12-lead ECG data.Computers in biology and medicine, 141:105114, 2022

Temesgen Mehari and Nils Strodthoff. Self-supervised representation learning from 12-lead ECG data.Computers in biology and medicine, 141:105114, 2022

work page 2022
[19]

Adver- sarial spatiotemporal contrastive learning for electrocardiogram signals.IEEE Transactions on Neural Networks and Learning Systems, 35(10):13845–13859, 2023

Ning Wang, Panpan Feng, Zhaoyang Ge, Yanjie Zhou, Bing Zhou, and Zongmin Wang. Adver- sarial spatiotemporal contrastive learning for electrocardiogram signals.IEEE Transactions on Neural Networks and Learning Systems, 35(10):13845–13859, 2023

work page 2023
[20]

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InInternational conference on machine learning, pages 1597–1607. PMLR, 2020

work page 2020
[21]

Spatiotemporal self-supervised representation learning from multi-lead ECG signals.Biomedical Signal Processing and Control, 84:104772, 2023

Rui Hu, Jie Chen, and Li Zhou. Spatiotemporal self-supervised representation learning from multi-lead ECG signals.Biomedical Signal Processing and Control, 84:104772, 2023

work page 2023
[22]

Masked autoencoder for ECG representation learning

Shunxiang Yang, Cheng Lian, and Zhigang Zeng. Masked autoencoder for ECG representation learning. In2022 12th International Conference on Information Science and Technology (ICIST), pages 95–98. IEEE, 2022

work page 2022
[23]

Fusion of heterogeneous attention mechanisms in multi-view convolutional neural network for text classification.Information Sciences, 548:295–312, 2021

Yunji Liang, Huihui Li, Bin Guo, Zhiwen Yu, Xiaolong Zheng, Sagar Samtani, and Daniel D Zeng. Fusion of heterogeneous attention mechanisms in multi-view convolutional neural network for text classification.Information Sciences, 548:295–312, 2021

work page 2021
[24]

A decomposable attention model for natural language inference

Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model for natural language inference. InProceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2249–2255, 2016

work page 2016
[25]

A wide and deep transformer neural network for 12-lead ECG classification

Annamalai Natarajan, Yale Chang, Sara Mariani, Asif Rahman, Gregory Boverman, Shruti Vij, and Jonathan Rubin. A wide and deep transformer neural network for 12-lead ECG classification. In2020 Computing in Cardiology, pages 1–4. IEEE, 2020

work page 2020
[26]

An arrhythmia classifica- tion model based on vision transformer with deformable attention.Micromachines, 14(6):1155, 2023

Yanfang Dong, Miao Zhang, Lishen Qiu, Lirong Wang, and Yong Yu. An arrhythmia classifica- tion model based on vision transformer with deformable attention.Micromachines, 14(6):1155, 2023

work page 2023
[27]

Longformer: The Long-Document Transformer

Iz Beltagy, Matthew E Peters, and Arman Cohan. Longformer: The long-document transformer. arXiv preprint arXiv:2004.05150, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2004
[28]

Swin transformer: Hierarchical vision transformer using shifted windows

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021

work page 2021
[29]

Neighborhood attention transformer

Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi. Neighborhood attention transformer. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6185–6194, 2023

work page 2023
[30]

End-to-end blood pressure prediction via fully convolutional networks.IEEE Access, 7:185458–185468, 2019

Sanghyun Baek, Jiyong Jang, and Sungroh Yoon. End-to-end blood pressure prediction via fully convolutional networks.IEEE Access, 7:185458–185468, 2019. 19

work page 2019
[31]

Masked self-supervised ECG representation learning via multiview information bottleneck

Shunxiang Yang, Cheng Lian, Zhigang Zeng, Bingrong Xu, Yixin Su, and Chenyang Xue. Masked self-supervised ECG representation learning via multiview information bottleneck. Neural Computing and Applications, pages 1–13, 2024

work page 2024
[32]

PTB-XL, a large publicly available electrocardiography dataset.Scientific data, 7(1):1–15, 2020

Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dieter Kreiseler, Fatima I Lunze, Wojciech Samek, and Tobias Schaeffter. PTB-XL, a large publicly available electrocardiography dataset.Scientific data, 7(1):1–15, 2020

work page 2020
[33]

Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu, Caiyun Ma, Shoushui Wei, Zhiqiang He, et al. An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection.Journal of Medical Imaging and Health Informatics, 8(7):1368–1373, 2018

work page 2018
[34]

Lightweight multireceptive field CNN for 12-lead ECG signal classification.Computational Intelligence and Neuroscience, 2022(1):8413294, 2022

Degaga Wolde Feyisa, Taye Girma Debelee, Yehualashet Megersa Ayano, Samuel Rahimeto Kebede, and Tariku Fekadu Assore. Lightweight multireceptive field CNN for 12-lead ECG signal classification.Computational Intelligence and Neuroscience, 2022(1):8413294, 2022

work page 2022
[35]

A transformer-based deep neural network for arrhythmia detection using continuous ECG signals.Computers in Biology and Medicine, 144:105325, 2022

Rui Hu, Jie Chen, and Li Zhou. A transformer-based deep neural network for arrhythmia detection using continuous ECG signals.Computers in Biology and Medicine, 144:105325, 2022

work page 2022
[36]

ECGTransForm: Empowering adaptive ECG arrhyth- mia classification framework with bidirectional transformer.Biomedical Signal Processing and Control, 89:105714, 2024

Hany El-Ghaish and Emadeldeen Eldele. ECGTransForm: Empowering adaptive ECG arrhyth- mia classification framework with bidirectional transformer.Biomedical Signal Processing and Control, 89:105714, 2024

work page 2024
[37]

An empirical study of training self-supervised vision transformers

Xinlei Chen, Saining Xie, and Kaiming He. An empirical study of training self-supervised vision transformers. InProceedings of the IEEE/CVF international conference on computer vision, pages 9640–9649, 2021

work page 2021
[38]

CLOCS: Contrastive learning of cardiac signals across space, time, and patients

Dani Kiyasseh, Tingting Zhu, and David A Clifton. CLOCS: Contrastive learning of cardiac signals across space, time, and patients. InInternational Conference on Machine Learning, pages 5606–5615. PMLR, 2021

work page 2021
[39]

Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Chee-Keong Kwoh, Xiaoli Li, and Cuntai Guan. Self-supervised contrastive representation learning for semi-supervised time-series classification.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45 (12):15604–15618, 2023

work page 2023
[40]

sCL-ST: Supervised con- trastive learning with semantic transformations for multiple lead ECG arrhythmia classification

Duc Le, Sang Truong, Patel Brijesh, Donald A Adjeroh, and Ngan Le. sCL-ST: Supervised con- trastive learning with semantic transformations for multiple lead ECG arrhythmia classification. IEEE journal of biomedical and health informatics, 27(6):2818–2828, 2023

work page 2023
[41]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022

work page 2022
[42]

Self-supervised time series repre- sentation learning via cross reconstruction transformer.IEEE Transactions on Neural Networks and Learning Systems, 35(11):16129–16138, 2024

Wenrui Zhang, Ling Yang, Shijia Geng, and Shenda Hong. Self-supervised time series repre- sentation learning via cross reconstruction transformer.IEEE Transactions on Neural Networks and Learning Systems, 35(11):16129–16138, 2024

work page 2024
[43]

MaeFE: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning.IEEE Transactions on Instrumentation and Measurement, 72:1–15, 2022

Huaicheng Zhang, Wenhan Liu, Jiguang Shi, Sheng Chang, Hao Wang, Jin He, and Qijun Huang. MaeFE: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning.IEEE Transactions on Instrumentation and Measurement, 72:1–15, 2022

work page 2022
[44]

Guiding masked representa- tion learning to capture spatio-temporal relationship of electrocardiogram

Yeongyeon Na, Minje Park, Yunwon Tae, and Sunghoon Joo. Guiding masked representa- tion learning to capture spatio-temporal relationship of electrocardiogram. InThe Twelfth International Conference on Learning Representations, 2024

work page 2024
[45]

Spectral transformations for digital filters

Anton G Constantinides. Spectral transformations for digital filters. InProceedings of the Institution of Electrical Engineers, volume 117, pages 1585–1590. IET, 1970

work page 1970
[46]

Goldberger et al

Ary L. Goldberger et al. The Chapman University and Shaoxing People’s Hospital ECG database. PhysioNet, 2020. URLhttps://physionet.org/content/chapmanecg/1.0.0/. 20

work page 2020
[47]

Optimal multi-stage arrhythmia classification approach.Scientific reports, 10(1):2898, 2020

Jianwei Zheng, Huimin Chu, Daniele Struppa, Jianming Zhang, Sir Magdi Yacoub, Hesham El-Askary, Anthony Chang, Louis Ehwerhemuepha, Islam Abudayyeh, Alexander Barrett, et al. Optimal multi-stage arrhythmia classification approach.Scientific reports, 10(1):2898, 2020

work page 2020
[48]

Lead-agnostic self-supervised learning for local and global representations of electrocardiogram

Jungwoo Oh, Hyunseung Chung, Joon-myoung Kwon, Dong-gyun Hong, and Edward Choi. Lead-agnostic self-supervised learning for local and global representations of electrocardiogram. InConference on Health, Inference, and Learning, pages 338–353. PMLR, 2022

work page 2022
[49]

Direct lead assignment: A simple and scalable contrastive learning method for ECG and its IoMT applications.IEEE Internet of Things Journal, 12(5):5672–5686, 2024

Wenhan Liu, Shurong Pan, Sheng Chang, Qijun Huang, and Nan Jiang. Direct lead assignment: A simple and scalable contrastive learning method for ECG and its IoMT applications.IEEE Internet of Things Journal, 12(5):5672–5686, 2024

work page 2024
[50]

Lead- fusion barlow twins: A fused self-supervised learning method for multi-lead electrocardiograms

Wenhan Liu, Shurong Pan, Zhoutong Li, Sheng Chang, Qijun Huang, and Nan Jiang. Lead- fusion barlow twins: A fused self-supervised learning method for multi-lead electrocardiograms. Information Fusion, 114:102698, 2025

work page 2025
[51]

Jamc: A jigsaw-based autoencoder with masked contrastive learning for cardiovascular disease diagnosis.Knowledge-Based Systems, 311:113090, 2025

Yue Ge, Huaicheng Zhang, Jiguang Shi, Deyu Luo, Sheng Chang, Jin He, Qijun Huang, and Hao Wang. Jamc: A jigsaw-based autoencoder with masked contrastive learning for cardiovascular disease diagnosis.Knowledge-Based Systems, 311:113090, 2025. 21

work page 2025

[1] [1]

Detection of cardiovascular diseases in ecg images using machine learning and deep learning methods.IEEE transactions on artificial intelligence, 4(2):373–382, 2022

Mohammed B Abubaker and Bilal Babayi ˘git. Detection of cardiovascular diseases in ecg images using machine learning and deep learning methods.IEEE transactions on artificial intelligence, 4(2):373–382, 2022

work page 2022

[2] [2]

Bentham Science Publishers, 2013

Jean-Jacques Goy, Jean-Christophe Stauffer, Jürg Schlaepfer, and Pierre Christeler.Electrocar- diography (ECG), volume 1. Bentham Science Publishers, 2013

work page 2013

[3] [3]

InTech, 2012

Patricia Paglini-Oliva, MS Lo Presti, and H Walter Rivarola.Electrocardiography as a diagnos- tic method for Chagas disease in patients and experimental models. InTech, 2012

work page 2012

[4] [4]

Convolutional neural network based automatic screening tool for cardiovascular diseases using different intervals of ECG signals

Hao Dai, Hsin-Ginn Hwang, and Vincent S Tseng. Convolutional neural network based automatic screening tool for cardiovascular diseases using different intervals of ECG signals. Computer Methods and Programs in Biomedicine, 203:106035, 2021

work page 2021

[5] [5]

Chuang Han, Jiajia Sun, Yingnan Bian, Wenge Que, and Li Shi. Automated detection and localization of myocardial infarction with interpretability analysis based on deep learning.IEEE Transactions on Instrumentation and Measurement, 72:1–12, 2023

work page 2023

[6] [6]

Strength of ensemble learning in automatic sleep stages classification using single-channel EEG and ECG signals.Medical & Biological Engineering & Computing, 62(4):997–1015, 2024

Samandokht Rashidi and Babak Mohammadzadeh Asl. Strength of ensemble learning in automatic sleep stages classification using single-channel EEG and ECG signals.Medical & Biological Engineering & Computing, 62(4):997–1015, 2024

work page 2024

[7] [7]

Biometric-based human identification using ensemble-based technique and ECG signals.Applied Sciences, 13(16):9454, 2023

Anfal Ahmed Aleidan, Qaisar Abbas, Yassine Daadaa, Imran Qureshi, Ganeshkumar Perumal, Mostafa EA Ibrahim, and Alaa ES Ahmed. Biometric-based human identification using ensemble-based technique and ECG signals.Applied Sciences, 13(16):9454, 2023

work page 2023

[8] [8]

A novel deep wavelet convolutional neural network for actual ECG signal denoising.Biomedical Signal Processing and Control, 87:105480, 2024

Yanrui Jin, Chengjin Qin, Jinlei Liu, Yunqing Liu, Zhiyuan Li, and Chengliang Liu. A novel deep wavelet convolutional neural network for actual ECG signal denoising.Biomedical Signal Processing and Control, 87:105480, 2024

work page 2024

[9] [9]

Jaypee Brothers Medical Publishers, 2019

Atul Luthra.ECG made easy. Jaypee Brothers Medical Publishers, 2019

work page 2019

[10] [10]

Platform for analysis and labeling of medical time series.Sensors, 20(24):7302, 2020

Andrejs Fedjajevs, Willemijn Groenendaal, Carlos Agell, and Evelien Hermeling. Platform for analysis and labeling of medical time series.Sensors, 20(24):7302, 2020

work page 2020

[11] [11]

Machine learning in the electrocardiogram.Journal of electrocardiology, 57:S61–S64, 2019

Ana Mincholé, Julià Camps, Aurore Lyon, and Blanca Rodríguez. Machine learning in the electrocardiogram.Journal of electrocardiology, 57:S61–S64, 2019

work page 2019

[12] [12]

Automated ECG classification using dual heartbeat coupling based on convolutional neural network.IEEE Access, 6:27465–27472, 2018

Xiaolong Zhai and Chung Tin. Automated ECG classification using dual heartbeat coupling based on convolutional neural network.IEEE Access, 6:27465–27472, 2018

work page 2018

[13] [13]

Su- pervised ECG interval segmentation using LSTM neural network

Hedayat Abrishami, Chia Han, Xuefu Zhou, Matthew Campbell, and Richard Czosek. Su- pervised ECG interval segmentation using LSTM neural network. InProceedings of the International Conference on Bioinformatics & Computational Biology (BIOCOMP), pages 71–77, 2018. 18

work page 2018

[14] [14]

LSTM-based ECG classifi- cation for continuous monitoring on personal wearable devices.IEEE journal of biomedical and health informatics, 24(2):515–523, 2019

Saeed Saadatnejad, Mohammadhosein Oveisi, and Matin Hashemi. LSTM-based ECG classifi- cation for continuous monitoring on personal wearable devices.IEEE journal of biomedical and health informatics, 24(2):515–523, 2019

work page 2019

[15] [15]

A token selection- based multi-scale dual-branch CNN-transformer network for 12-lead ECG signal classification

Siyuan Zhang, Cheng Lian, Bingrong Xu, Junbin Zang, and Zhigang Zeng. A token selection- based multi-scale dual-branch CNN-transformer network for 12-lead ECG signal classification. Knowledge-Based Systems, 280:111006, 2023

work page 2023

[16] [16]

MSW-Transformer: Multi-scale shifted windows transformer networks for 12-lead ECG classification.arXiv preprint arXiv:2306.12098, 2023

Renjie Cheng, Zhemin Zhuang, Shuxin Zhuang, Lei Xie, and Jingfeng Guo. MSW-Transformer: Multi-scale shifted windows transformer networks for 12-lead ECG classification.arXiv preprint arXiv:2306.12098, 2023

work page arXiv 2023

[17] [17]

Guo-Jun Qi and Jiebo Luo. Small data challenges in big data era: A survey of recent progress on unsupervised and semi-supervised methods.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(4):2168–2187, 2020

work page 2020

[18] [18]

Self-supervised representation learning from 12-lead ECG data.Computers in biology and medicine, 141:105114, 2022

Temesgen Mehari and Nils Strodthoff. Self-supervised representation learning from 12-lead ECG data.Computers in biology and medicine, 141:105114, 2022

work page 2022

[19] [19]

Adver- sarial spatiotemporal contrastive learning for electrocardiogram signals.IEEE Transactions on Neural Networks and Learning Systems, 35(10):13845–13859, 2023

Ning Wang, Panpan Feng, Zhaoyang Ge, Yanjie Zhou, Bing Zhou, and Zongmin Wang. Adver- sarial spatiotemporal contrastive learning for electrocardiogram signals.IEEE Transactions on Neural Networks and Learning Systems, 35(10):13845–13859, 2023

work page 2023

[20] [20]

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InInternational conference on machine learning, pages 1597–1607. PMLR, 2020

work page 2020

[21] [21]

Spatiotemporal self-supervised representation learning from multi-lead ECG signals.Biomedical Signal Processing and Control, 84:104772, 2023

Rui Hu, Jie Chen, and Li Zhou. Spatiotemporal self-supervised representation learning from multi-lead ECG signals.Biomedical Signal Processing and Control, 84:104772, 2023

work page 2023

[22] [22]

Masked autoencoder for ECG representation learning

Shunxiang Yang, Cheng Lian, and Zhigang Zeng. Masked autoencoder for ECG representation learning. In2022 12th International Conference on Information Science and Technology (ICIST), pages 95–98. IEEE, 2022

work page 2022

[23] [23]

Fusion of heterogeneous attention mechanisms in multi-view convolutional neural network for text classification.Information Sciences, 548:295–312, 2021

Yunji Liang, Huihui Li, Bin Guo, Zhiwen Yu, Xiaolong Zheng, Sagar Samtani, and Daniel D Zeng. Fusion of heterogeneous attention mechanisms in multi-view convolutional neural network for text classification.Information Sciences, 548:295–312, 2021

work page 2021

[24] [24]

A decomposable attention model for natural language inference

Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model for natural language inference. InProceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2249–2255, 2016

work page 2016

[25] [25]

A wide and deep transformer neural network for 12-lead ECG classification

Annamalai Natarajan, Yale Chang, Sara Mariani, Asif Rahman, Gregory Boverman, Shruti Vij, and Jonathan Rubin. A wide and deep transformer neural network for 12-lead ECG classification. In2020 Computing in Cardiology, pages 1–4. IEEE, 2020

work page 2020

[26] [26]

An arrhythmia classifica- tion model based on vision transformer with deformable attention.Micromachines, 14(6):1155, 2023

Yanfang Dong, Miao Zhang, Lishen Qiu, Lirong Wang, and Yong Yu. An arrhythmia classifica- tion model based on vision transformer with deformable attention.Micromachines, 14(6):1155, 2023

work page 2023

[27] [27]

Longformer: The Long-Document Transformer

Iz Beltagy, Matthew E Peters, and Arman Cohan. Longformer: The long-document transformer. arXiv preprint arXiv:2004.05150, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2004

[28] [28]

Swin transformer: Hierarchical vision transformer using shifted windows

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021

work page 2021

[29] [29]

Neighborhood attention transformer

Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi. Neighborhood attention transformer. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6185–6194, 2023

work page 2023

[30] [30]

End-to-end blood pressure prediction via fully convolutional networks.IEEE Access, 7:185458–185468, 2019

Sanghyun Baek, Jiyong Jang, and Sungroh Yoon. End-to-end blood pressure prediction via fully convolutional networks.IEEE Access, 7:185458–185468, 2019. 19

work page 2019

[31] [31]

Masked self-supervised ECG representation learning via multiview information bottleneck

Shunxiang Yang, Cheng Lian, Zhigang Zeng, Bingrong Xu, Yixin Su, and Chenyang Xue. Masked self-supervised ECG representation learning via multiview information bottleneck. Neural Computing and Applications, pages 1–13, 2024

work page 2024

[32] [32]

PTB-XL, a large publicly available electrocardiography dataset.Scientific data, 7(1):1–15, 2020

Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dieter Kreiseler, Fatima I Lunze, Wojciech Samek, and Tobias Schaeffter. PTB-XL, a large publicly available electrocardiography dataset.Scientific data, 7(1):1–15, 2020

work page 2020

[33] [33]

Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu, Caiyun Ma, Shoushui Wei, Zhiqiang He, et al. An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection.Journal of Medical Imaging and Health Informatics, 8(7):1368–1373, 2018

work page 2018

[34] [34]

Lightweight multireceptive field CNN for 12-lead ECG signal classification.Computational Intelligence and Neuroscience, 2022(1):8413294, 2022

Degaga Wolde Feyisa, Taye Girma Debelee, Yehualashet Megersa Ayano, Samuel Rahimeto Kebede, and Tariku Fekadu Assore. Lightweight multireceptive field CNN for 12-lead ECG signal classification.Computational Intelligence and Neuroscience, 2022(1):8413294, 2022

work page 2022

[35] [35]

A transformer-based deep neural network for arrhythmia detection using continuous ECG signals.Computers in Biology and Medicine, 144:105325, 2022

Rui Hu, Jie Chen, and Li Zhou. A transformer-based deep neural network for arrhythmia detection using continuous ECG signals.Computers in Biology and Medicine, 144:105325, 2022

work page 2022

[36] [36]

ECGTransForm: Empowering adaptive ECG arrhyth- mia classification framework with bidirectional transformer.Biomedical Signal Processing and Control, 89:105714, 2024

Hany El-Ghaish and Emadeldeen Eldele. ECGTransForm: Empowering adaptive ECG arrhyth- mia classification framework with bidirectional transformer.Biomedical Signal Processing and Control, 89:105714, 2024

work page 2024

[37] [37]

An empirical study of training self-supervised vision transformers

Xinlei Chen, Saining Xie, and Kaiming He. An empirical study of training self-supervised vision transformers. InProceedings of the IEEE/CVF international conference on computer vision, pages 9640–9649, 2021

work page 2021

[38] [38]

CLOCS: Contrastive learning of cardiac signals across space, time, and patients

Dani Kiyasseh, Tingting Zhu, and David A Clifton. CLOCS: Contrastive learning of cardiac signals across space, time, and patients. InInternational Conference on Machine Learning, pages 5606–5615. PMLR, 2021

work page 2021

[39] [39]

Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Chee-Keong Kwoh, Xiaoli Li, and Cuntai Guan. Self-supervised contrastive representation learning for semi-supervised time-series classification.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45 (12):15604–15618, 2023

work page 2023

[40] [40]

sCL-ST: Supervised con- trastive learning with semantic transformations for multiple lead ECG arrhythmia classification

Duc Le, Sang Truong, Patel Brijesh, Donald A Adjeroh, and Ngan Le. sCL-ST: Supervised con- trastive learning with semantic transformations for multiple lead ECG arrhythmia classification. IEEE journal of biomedical and health informatics, 27(6):2818–2828, 2023

work page 2023

[41] [41]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022

work page 2022

[42] [42]

Self-supervised time series repre- sentation learning via cross reconstruction transformer.IEEE Transactions on Neural Networks and Learning Systems, 35(11):16129–16138, 2024

Wenrui Zhang, Ling Yang, Shijia Geng, and Shenda Hong. Self-supervised time series repre- sentation learning via cross reconstruction transformer.IEEE Transactions on Neural Networks and Learning Systems, 35(11):16129–16138, 2024

work page 2024

[43] [43]

MaeFE: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning.IEEE Transactions on Instrumentation and Measurement, 72:1–15, 2022

Huaicheng Zhang, Wenhan Liu, Jiguang Shi, Sheng Chang, Hao Wang, Jin He, and Qijun Huang. MaeFE: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning.IEEE Transactions on Instrumentation and Measurement, 72:1–15, 2022

work page 2022

[44] [44]

Guiding masked representa- tion learning to capture spatio-temporal relationship of electrocardiogram

Yeongyeon Na, Minje Park, Yunwon Tae, and Sunghoon Joo. Guiding masked representa- tion learning to capture spatio-temporal relationship of electrocardiogram. InThe Twelfth International Conference on Learning Representations, 2024

work page 2024

[45] [45]

Spectral transformations for digital filters

Anton G Constantinides. Spectral transformations for digital filters. InProceedings of the Institution of Electrical Engineers, volume 117, pages 1585–1590. IET, 1970

work page 1970

[46] [46]

Goldberger et al

Ary L. Goldberger et al. The Chapman University and Shaoxing People’s Hospital ECG database. PhysioNet, 2020. URLhttps://physionet.org/content/chapmanecg/1.0.0/. 20

work page 2020

[47] [47]

Optimal multi-stage arrhythmia classification approach.Scientific reports, 10(1):2898, 2020

Jianwei Zheng, Huimin Chu, Daniele Struppa, Jianming Zhang, Sir Magdi Yacoub, Hesham El-Askary, Anthony Chang, Louis Ehwerhemuepha, Islam Abudayyeh, Alexander Barrett, et al. Optimal multi-stage arrhythmia classification approach.Scientific reports, 10(1):2898, 2020

work page 2020

[48] [48]

Lead-agnostic self-supervised learning for local and global representations of electrocardiogram

Jungwoo Oh, Hyunseung Chung, Joon-myoung Kwon, Dong-gyun Hong, and Edward Choi. Lead-agnostic self-supervised learning for local and global representations of electrocardiogram. InConference on Health, Inference, and Learning, pages 338–353. PMLR, 2022

work page 2022

[49] [49]

Direct lead assignment: A simple and scalable contrastive learning method for ECG and its IoMT applications.IEEE Internet of Things Journal, 12(5):5672–5686, 2024

Wenhan Liu, Shurong Pan, Sheng Chang, Qijun Huang, and Nan Jiang. Direct lead assignment: A simple and scalable contrastive learning method for ECG and its IoMT applications.IEEE Internet of Things Journal, 12(5):5672–5686, 2024

work page 2024

[50] [50]

Lead- fusion barlow twins: A fused self-supervised learning method for multi-lead electrocardiograms

Wenhan Liu, Shurong Pan, Zhoutong Li, Sheng Chang, Qijun Huang, and Nan Jiang. Lead- fusion barlow twins: A fused self-supervised learning method for multi-lead electrocardiograms. Information Fusion, 114:102698, 2025

work page 2025

[51] [51]

Jamc: A jigsaw-based autoencoder with masked contrastive learning for cardiovascular disease diagnosis.Knowledge-Based Systems, 311:113090, 2025

Yue Ge, Huaicheng Zhang, Jiguang Shi, Deyu Luo, Sheng Chang, Jin He, Qijun Huang, and Hao Wang. Jamc: A jigsaw-based autoencoder with masked contrastive learning for cardiovascular disease diagnosis.Knowledge-Based Systems, 311:113090, 2025. 21

work page 2025