Learning General Representation of 12-Lead Electrocardiogram with a Joint-Embedding Predictive Architecture

Sehun Kim

arxiv: 2410.08559 · v5 · submitted 2024-10-11 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-23 19:14 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords self-supervised learning · 12-lead ECG · masked modeling · joint embedding predictive architecture · diagnostic classification · feature extraction · segmentation

The pith

ECG-JEPA learns semantic 12-lead ECG representations by predicting masked tokens in latent space rather than reconstructing raw signals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ECG-JEPA, a self-supervised model that performs masked modeling directly in the latent space of 12-lead ECG recordings. It argues that this avoids two drawbacks of reconstructing raw waveforms: reproducing unnecessary detail such as noise, and the limitations of a naive L2 loss between raw signals. The model is pre-trained on roughly 180,000 unlabeled samples drawn from several open ECG datasets. A specialized masked attention mechanism, Cross-Pattern Attention (CroPA), is added to respect the multi-lead structure. The paper reports state-of-the-art results on downstream diagnostic classification, feature extraction, and segmentation tasks.

Core claim

Masked modeling in the latent space can be a powerful alternative to existing self-supervised methods in the ECG domain. ECG-JEPA learns semantic representations of ECG data by predicting in the hidden latent space, bypassing the need to reconstruct raw signals, and achieves state-of-the-art performance in various downstream tasks including diagnostic classification, feature extraction, and segmentation.

What carries the argument

ECG-JEPA, a joint-embedding predictive architecture that predicts masked representations in latent space, together with Cross-Pattern Attention (CroPA), a masked attention mechanism designed for the 12-lead structure.
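
As a rough illustration of what a lead-aware attention mask can look like, the sketch below builds a boolean mask over a grid of (lead, time) patches in which a patch may attend only to patches that share its lead or its time index. That cross-shaped rule is an assumption made for illustration: this page does not spell out the exact CroPA pattern, and `cross_pattern_mask` is a hypothetical helper, not a function from the released code.

```python
def cross_pattern_mask(num_leads, num_patches):
    """Boolean attention mask over a (lead, time) patch grid.

    Assumed rule (not confirmed by the source): patch (l, t) may attend to
    patch (m, s) only if they share a lead (l == m) or a time index (t == s).
    Tokens are flattened lead-major: index = lead * num_patches + time.
    """
    n = num_leads * num_patches
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        q_lead, q_time = divmod(q, num_patches)
        for k in range(n):
            k_lead, k_time = divmod(k, num_patches)
            mask[q][k] = (q_lead == k_lead) or (q_time == k_time)
    return mask


mask = cross_pattern_mask(num_leads=3, num_patches=4)
# Each query sees its own lead (4 patches) plus the same time slot in the
# two other leads, so 6 of the 12 tokens are visible to it.
visible_per_query = [sum(row) for row in mask]
```

In a transformer, a mask like this would typically be applied by setting the disallowed attention logits to negative infinity before the softmax.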

If this is right

Representations learned without labels can be directly transferred to diagnostic classification of cardiac conditions.
The same pre-trained encoder improves performance on ECG feature extraction and segmentation tasks.
Training on the union of open ECG datasets totaling approximately 180,000 samples produces general-purpose 12-lead representations.
Cross-Pattern Attention enables the model to exploit inter-lead relationships during masked prediction.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same latent-prediction approach may transfer to other noisy physiological signals such as EEG or EMG.
Because raw-signal reconstruction is avoided, the method could lower memory and compute costs during pre-training on large ECG archives.
The learned representations might support few-shot adaptation to rare arrhythmia subtypes not seen in the original training union.

Load-bearing premise

Predicting in the latent space rather than reconstructing raw signals addresses the limitations of naive L2 loss and avoids producing unnecessary noise details common in ECG data.
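
To make the premise concrete, here is a minimal sketch of two ingredients this page mentions elsewhere: a smooth L1 loss computed between predicted and target latent vectors, and a student-teacher setup. The exponential-moving-average teacher update, the `beta` threshold, and the momentum value are assumptions borrowed from common JEPA-style training recipes; none of these specifics are stated on this page.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 (Huber-style) loss between two equal-length vectors."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta   # quadratic near zero
        else:
            total += diff - 0.5 * beta          # linear for large errors
    return total / len(pred)


def ema_update(teacher, student, momentum=0.99):
    """Move teacher weights toward the student (assumed EMA rule)."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


# Toy latent vectors: the loss is computed on representations of masked
# patches, not on raw ECG samples.
predicted_latent = [0.2, -1.0, 3.0]
teacher_latent = [0.0, -1.0, 0.5]
loss = smooth_l1(predicted_latent, teacher_latent)

teacher_weights = ema_update([1.0, 2.0], [0.0, 4.0], momentum=0.9)
```

Because the loss grows only linearly for large errors, a single badly predicted coordinate contributes less than it would under a squared-error objective.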

What would settle it

A controlled experiment in which a reconstruction-based self-supervised baseline, trained on the same 180,000-sample union and evaluated on identical downstream splits, matches or exceeds ECG-JEPA accuracy on diagnostic classification and segmentation would falsify the claimed advantage of latent-space prediction.
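
One hedged way to operationalize such a comparison is a paired bootstrap over the shared test set: resample test examples with replacement and look at the distribution of the accuracy difference between the two models. The function `paired_bootstrap_diff`, the toy correctness flags, and the choice of accuracy as the metric are all illustrative assumptions, not the paper's evaluation protocol.

```python
import random


def paired_bootstrap_diff(correct_a, correct_b, num_resamples=1000, seed=0):
    """Bootstrap the accuracy difference (A minus B) on a shared test set.

    correct_a and correct_b are 0/1 lists marking whether each model got
    the same test example right. Returns (mean_diff, lower, upper) with a
    95% percentile interval.
    """
    assert len(correct_a) == len(correct_b)
    n = len(correct_a)
    rng = random.Random(seed)
    diffs = []
    for _ in range(num_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    lower = diffs[int(0.025 * num_resamples)]
    upper = diffs[int(0.975 * num_resamples) - 1]
    return sum(diffs) / num_resamples, lower, upper


# Toy data only: made-up correctness flags, not results from the paper.
latent_model = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
recon_model = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]
mean_diff, lower, upper = paired_bootstrap_diff(latent_model, recon_model)
```

If the interval for the difference comfortably includes zero or lies below it, the reconstruction baseline matches or exceeds the latent-prediction model on that split, which is the outcome the paragraph above describes as falsifying.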

Figures

Figures reproduced from arXiv: 2410.08559 by Sehun Kim.

**Figure 2.** Key ECG Features. (Diagram labels extracted with the caption: Student, Teacher, Predictor, Masked patches, ECG patches, Contextualized representations, Masked representations, Predictions, L1 loss.)

**Figure 3.** ECG-JEPA training overview. For illustration, we use (caption truncated in the source)

**Figure 4.** Cross-Pattern Attention (CroPA). The patch in the middle attends only to the colored patches.

**Figure 5.** Squares following the encoder represent the representations of ECG patches. The representations are (caption truncated in the source)

Original abstract

Electrocardiogram (ECG) captures the heart's electrical signals, offering valuable information for diagnosing cardiac conditions. However, the scarcity of labeled data makes it challenging to fully leverage supervised learning in the medical domain. Self-supervised learning (SSL) offers a promising solution, enabling models to learn from unlabeled data and uncover meaningful patterns. In this paper, we show that masked modeling in the latent space can be a powerful alternative to existing self-supervised methods in the ECG domain. We introduce ECG-JEPA, an SSL model for 12-lead ECG analysis that learns semantic representations of ECG data by predicting in the hidden latent space, bypassing the need to reconstruct raw signals. This approach offers several advantages in the ECG domain: (1) it avoids producing unnecessary details, such as noise, which is common in ECG; and (2) it addresses the limitations of naive L2 loss between raw signals. Another key contribution is the introduction of Cross-Pattern Attention (CroPA), a specialized masked attention mechanism tailored for 12-lead ECG data. ECG-JEPA is trained on the union of several open ECG datasets, totaling approximately 180,000 samples, and achieves state-of-the-art performance in various downstream tasks including diagnostic classification, feature extraction, and segmentation. Our code is openly available at https://github.com/sehunfromdaegu/ECG_JEPA.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ECG-JEPA adapts JEPA latent prediction to 12-lead ECG via CroPA and reports gains on downstream tasks with open code.

ECG-JEPA adapts the joint-embedding predictive architecture to 12-lead ECG, using a new Cross-Pattern Attention module to manage lead-specific masking. The model is trained on roughly 180,000 samples from public datasets and claims better results than prior self-supervised methods on diagnostic classification, feature extraction, and segmentation tasks. The code is released on GitHub, which is a practical plus for anyone wanting to test it directly.

The core idea of predicting in latent space rather than reconstructing raw signals is laid out clearly and matches the stated ECG-specific motivation around avoiding noise and L2-loss issues. CroPA looks like a straightforward domain tweak to handle the multi-lead structure. The methods section stays internally consistent with no obvious contradictions or unstated leakage risks.

The main soft spot is the empirical side. The abstract-level claims of state-of-the-art performance need the full tables, baselines, ablations on mask ratio and CroPA, and evaluation details to land solidly; if those are present and use standard splits, the concern is minor. No circular reasoning appears.

This paper is aimed at researchers working on self-supervised representations for ECG or similar biosignals. A reader in that niche can pull the architecture details and the repo for their own experiments. It has enough new pieces and reproducible elements to warrant a serious referee, even if the gains turn out incremental once the numbers are checked.

Referee Report

0 major / 2 minor

Summary. The paper introduces ECG-JEPA, a self-supervised learning model based on the Joint-Embedding Predictive Architecture (JEPA) for 12-lead ECG signals. It performs masked prediction in the latent space rather than raw-signal reconstruction, introduces a Cross-Pattern Attention (CroPA) mechanism for multi-lead data, pretrains on a union of open datasets totaling ~180k samples, and reports state-of-the-art results on downstream tasks including diagnostic classification, feature extraction, and segmentation. The code is released openly.

Significance. If the empirical results hold under standard evaluation protocols, the work provides a useful alternative SSL approach for ECG representation learning that sidesteps issues with raw-signal L2 reconstruction. The open-source code is a clear strength that supports reproducibility and follow-up work in cardiac signal analysis.

minor comments (2)

[Abstract] The abstract asserts SOTA performance without any quantitative metrics or baseline names; adding one or two key numbers (e.g., AUC or Dice improvements) would strengthen the summary.
[Methods] Notation for the CroPA module and the latent-space predictor could be clarified with an explicit equation or diagram reference in the methods section to aid readers unfamiliar with JEPA variants.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. The report highlights the strengths of ECG-JEPA, including the latent-space prediction approach, CroPA mechanism, pretraining scale, downstream results, and open code release. No major comments are provided in the report.

Circularity Check

0 steps flagged

No significant circularity; derivation is empirical and self-contained

The paper introduces ECG-JEPA as an application of JEPA-style latent-space masked prediction to 12-lead ECG, augmented by the new CroPA attention mechanism. Training occurs on ~180k unlabeled samples with standard SSL objectives; performance is measured on held-out downstream tasks (classification, segmentation). No equations or claims reduce a 'prediction' to a fitted input by construction, no self-citation chain bears the central result, and no ansatz is smuggled via prior author work. The approach follows standard self-supervised principles without internal definitional collapse.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on the effectiveness of latent-space masked prediction for noisy ECG signals and on the utility of the newly introduced CroPA module. Training uses a union of public datasets whose representativeness is assumed but not independently verified in the abstract.

free parameters (1)

Mask ratio and other training hyperparameters
Standard deep-learning choices whose specific values are not reported in the abstract but affect performance.
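
The mask ratio can be read as the fraction of patch positions hidden from the student encoder. The sketch below samples such a mask uniformly at random; the uniform sampling scheme, the function name `sample_mask`, and the example ratio of 0.6 are illustrative assumptions, since this page does not report the values or the masking strategy the paper uses.

```python
import random


def sample_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists.

    Uniform random masking is an assumption here; the actual masking
    strategy and ratio are not given on this page.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked = sorted(rng.sample(range(num_patches), num_masked))
    masked_set = set(masked)
    visible = [i for i in range(num_patches) if i not in masked_set]
    return visible, masked


visible, masked = sample_mask(num_patches=50, mask_ratio=0.6, seed=0)
```

The student would encode only the `visible` indices, while the predictor is asked for representations at the `masked` ones.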

axioms (1)

domain assumption: Predicting representations in latent space is advantageous for ECG because it avoids reconstructing noise and sidesteps limitations of naive L2 loss on raw signals.
Explicitly listed as advantages (1) and (2) in the abstract.

invented entities (1)

Cross-Pattern Attention (CroPA) · no independent evidence
purpose: Specialized masked attention mechanism tailored for 12-lead ECG data
Introduced as a key architectural contribution; no independent evidence of effectiveness outside the paper's claimed results.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: masked modeling in the latent space... avoids producing unnecessary details, such as noise... addresses the limitations of naïve L2 loss between raw signals

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: ECG-JEPA... transformer... CroPA... student-teacher... smooth L1 loss

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Pretraining Strategies and Scaling for ECG Foundation Models: A Systematic Study
eess.SP · 2026-05 · unverdicted · novelty 7.0

Contrastive predictive coding pretraining combined with structured state space models yields the strongest ECG foundation models, with continued gains from scaling data to 11 million samples.

Extending Pretrained 10-Second ECG Foundation Models to Longer Horizons
cs.LG · 2026-05 · unverdicted · novelty 5.0

A parameter-efficient plug-in framework adds structurally compatible long-sequence processing and semantically informed temporal modeling to extend pretrained 10-second ECG foundation models to longer variable-length inputs.

ECG Foundation Models and Medical LLMs for Agentic Cardiovascular Intelligence at the Edge: A Review and Outlook
eess.SP · 2026-04 · unverdicted · novelty 3.0

ECG foundation models for signal interpretation and medical LLMs for reasoning can be integrated into agentic systems for real-time cardiovascular intelligence on edge devices.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · cited by 3 Pith papers · 3 internal anchors

[1]

Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network

Awni Y Hannun, Pranav Rajpurkar, Masoumeh Haghpanahi, Geoffrey H Tison, Codie Bourn, Mintu P Turakhia, and Andrew Y Ng. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nature medicine, 25(1):65–69, 2019

work page 2019
[2]

Automatic diagnosis of the 12-lead ecg using a deep neural network

Antônio H Ribeiro, Manoel Horta Ribeiro, Gabriela MM Paixão, Derick M Oliveira, Paulo R Gomes, Jéssica A Canazart, Milton PS Ferreira, Carl R Andersson, Peter W Macfarlane, Wagner Meira Jr, et al. Automatic diagnosis of the 12-lead ecg using a deep neural network. Nature communications, 11(1):1760, 2020

work page 2020
[3]

Artificial intelligence-enhanced electrocardiography in cardiovascular disease management

Konstantinos C Siontis, Peter A Noseworthy, Zachi I Attia, and Paul A Friedman. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nature Reviews Cardiology, 18(7):465–478, 2021

work page 2021
[4]

Bert: Pre-training of deep bidirectional transformers for language understanding, 2019

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding, 2019.

work page 2019
[5]

Language models are few-shot learners

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020

work page 2020
[6]

Llama: Open and efficient foundation language models, 2023

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. Llama: Open and efficient foundation language models, 2023

work page 2023
[7]

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020

work page 2020
[8]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022

work page 2022
[9]

Self-supervised learning from images with a joint-embedding predictive architecture

Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, and Nicolas Ballas. Self-supervised learning from images with a joint-embedding predictive architecture. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15619–15629, 2023

work page 2023
[10]

Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training

Zhan Tong, Yibing Song, Jue Wang, and Limin Wang. Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training. Advances in neural information processing systems, 35:10078–10093, 2022

work page 2022
[11]

Revisiting feature prediction for learning visual representations from video, 2024

Adrien Bardes, Quentin Garrido, Jean Ponce, Xinlei Chen, Michael Rabbat, Yann LeCun, Mahmoud Assran, and Nicolas Ballas. Revisiting feature prediction for learning visual representations from video, 2024

work page 2024
[12]

Self-supervised learning is more robust to dataset imbalance

Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, and Tengyu Ma. Self-supervised learning is more robust to dataset imbalance, 2022

work page 2022
[13]

The only EKG book you’ll ever need

Malcolm S Thaler. The only EKG book you’ll ever need. Lippincott Williams & Wilkins, 2021

work page 2021
[14]

A cookbook of self-supervised learning

Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, and Micah Goldblum. A cookbook of self-supervised learning, 2023. URL h...

work page arXiv 2023
[15]

Extracting and composing robust features with denoising autoencoders

Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pages 1096–1103, 2008

work page 2008
[16]

Learning by reconstruction produces uninformative features for perception

Randall Balestriero and Yann LeCun. Learning by reconstruction produces uninformative features for perception. URL https://arxiv.org/abs/2402.11337

work page arXiv
[18]

Bootstrap your own latent-a new approach to self-supervised learning

Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems, 33:21271–21284, 2020

work page 2020
[19]

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

Adrien Bardes, Jean Ponce, and Yann LeCun. Vicreg: Variance-invariance-covariance regularization for self-supervised learning, 2022. URL https://arxiv.org/abs/2105.04906

work page internal anchor Pith review Pith/arXiv arXiv 2022
[20]

Exploring simple siamese representation learning, 2020

Xinlei Chen and Kaiming He. Exploring simple siamese representation learning, 2020. URL https://arxiv.org/abs/2011.10566

work page arXiv 2020
[21]

A path towards autonomous machine intelligence version 0.9

Yann LeCun. A path towards autonomous machine intelligence version 0.9. 2, 2022-06-27. https://openreview.net/forum?id=BZ5a1r-kVsf, 2022. Accessed: 2024-06-01

work page 2022
[22]

Clocs: Contrastive learning of cardiac signals across space, time, and patients

Dani Kiyasseh, Tingting Zhu, and David A Clifton. Clocs: Contrastive learning of cardiac signals across space, time, and patients. In International Conference on Machine Learning, pages 5606–5615. PMLR, 2021

work page 2021
[23]

Representation learning with contrastive predictive coding

Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. URL https://arxiv.org/abs/1807.03748

work page internal anchor Pith review Pith/arXiv arXiv
[25]

Self-supervised representation learning from 12-lead ecg data

Temesgen Mehari and Nils Strodthoff. Self-supervised representation learning from 12-lead ecg data. Computers in biology and medicine, 141:105114, 2022

work page 2022
[26]

Maefe: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning

Huaicheng Zhang, Wenhan Liu, Jiguang Shi, Sheng Chang, Hao Wang, Jin He, and Qijun Huang. Maefe: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning. IEEE Transactions on Instrumentation and Measurement, 72:1–15, 2022.

work page 2022
[27]

Guiding masked representation learning to capture spatio-temporal relationship of electrocardiogram, 2024

Yeongyeon Na, Minje Park, Yunwon Tae, and Sunghoon Joo. Guiding masked representation learning to capture spatio-temporal relationship of electrocardiogram, 2024

work page 2024
[28]

A 12-lead electrocardiogram database for arrhythmia research covering more than 10,000 patients

Jianwei Zheng, Jianming Zhang, Sidy Danioko, Hai Yao, Hangyuan Guo, and Cyril Rakovski. A 12-lead electrocardiogram database for arrhythmia research covering more than 10,000 patients. Scientific data, 7(1):48, 2020

work page 2020
[29]

Optimal multi-stage arrhythmia classification approach

Jianwei Zheng, Huimin Chu, Daniele Struppa, Jianming Zhang, Sir Magdi Yacoub, Hesham El-Askary, Anthony Chang, Louis Ehwerhemuepha, Islam Abudayyeh, Alexander Barrett, et al. Optimal multi-stage arrhythmia classification approach. Scientific reports, 10(1):2898, 2020

work page 2020
[30]

Large-scale classification of 12-lead ecg with deep learning

Yu-Jhen Chen, Chien-Liang Liu, Vincent S Tseng, Yu-Feng Hu, and Shih-Ann Chen. Large-scale classification of 12-lead ecg with deep learning. In 2019 IEEE EMBS international conference on biomedical & health informatics (BHI), pages 1–4. IEEE, 2019

work page 2019
[31]

Ptb-xl, a large publicly available electrocardiography dataset

Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dieter Kreiseler, Fatima I Lunze, Wojciech Samek, and Tobias Schaeffter. Ptb-xl, a large publicly available electrocardiography dataset. Scientific data, 7(1):1–15, 2020

work page 2020
[32]

An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection

Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu, Caiyun Ma, Shoushui Wei, Zhiqiang He, et al. An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection. Journal of Medical Imaging and Health Informatics, 8(7):1368–1373, 2018

work page 2018
[33]

Ecg segmentation by neural networks: Errors and correction

Iana Sereda, Sergey Alekseev, Aleksandra Koneva, Roman Kataev, and Grigory Osipov. Ecg segmentation by neural networks: Errors and correction. In 2019 International Joint Conference on Neural Networks (IJCNN), pages 1–7. IEEE, 2019

work page 2019
[34]

Deep learning for ecg segmentation

Viktor Moskalenko, Nikolai Zolotykh, and Grigory Osipov. Deep learning for ecg segmentation. In Advances in Neural Computation, Machine Learning, and Cognitive Research III: Selected Papers from the XXI International Conference on Neuroinformatics, October 7-11, 2019, Dolgoprudny, Moscow Region, Russia, pages 246–254. Springer, 2020

work page 2019
[35]

Post-processing refined ecg delineation based on 1d-unet

Zhenqin Chen, Mengying Wang, Meiyu Zhang, Wei Huang, Hanjie Gu, and Jinshan Xu. Post-processing refined ecg delineation based on 1d-unet. Biomedical Signal Processing and Control, 79:104106, 2023

work page 2023
[36]

Deep learning based ecg segmentation for delineation of diverse arrhythmias

Chankyu Joung, Mijin Kim, Taejin Paik, Seong-Ho Kong, Seung-Young Oh, Won Kyeong Jeon, Jae-hu Jeon, Joong-Sik Hong, Wan-Joong Kim, Woong Kook, et al. Deep learning based ecg segmentation for delineation of diverse arrhythmias. PloS one, 19(6):e0303178, 2024

work page 2024
[37]

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Accurate, large minibatch sgd: Training imagenet in 1 hour, 2018. URL https://arxiv.org/abs/1706.02677

work page internal anchor Pith review Pith/arXiv arXiv 2018
[38]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.

work page 2016

[1] [1]

Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network

Awni Y Hannun, Pranav Rajpurkar, Masoumeh Haghpanahi, Geoffrey H Tison, Codie Bourn, Mintu P Turakhia, and Andrew Y Ng. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nature medicine, 25(1):65–69, 2019

work page 2019

[2] [2]

Automatic diagnosis of the 12-lead ecg using a deep neural network

Antônio H Ribeiro, Manoel Horta Ribeiro, Gabriela MM Paixão, Derick M Oliveira, Paulo R Gomes, Jéssica A Canazart, Milton PS Ferreira, Carl R Andersson, Peter W Macfarlane, Wagner Meira Jr, et al. Automatic diagnosis of the 12-lead ecg using a deep neural network. Nature communications, 11(1):1760, 2020

work page 2020

[3] [3]

Artificial intelligence-enhanced electrocardiography in cardiovascular disease management

Konstantinos C Siontis, Peter A Noseworthy, Zachi I Attia, and Paul A Friedman. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nature Reviews Cardiology, 18(7):465–478, 2021

work page 2021

[4] [4]

Bert: Pre-training of deep bidirectional transformers for language understanding, 2019

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding, 2019. 13

work page 2019

[5] [5]

Language models are few-shot learners

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Nee- lakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020

work page 1901

[6] [6]

Llama: Open and efficient foundation language models, 2023

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. Llama: Open and efficient foundation language models, 2023

work page 2023

[7] [7]

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020

work page 2020

[8] [8]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022

work page 2022

[9] [9]

Self-supervised learning from images with a joint-embedding predictive architecture

Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, and Nicolas Ballas. Self-supervised learning from images with a joint-embedding predictive architecture. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15619–15629, 2023

work page 2023

[10] [10]

Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training

Zhan Tong, Yibing Song, Jue Wang, and Limin Wang. Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training. Advances in neural information processing systems , 35:10078–10093, 2022

work page 2022

[11] [11]

Revisiting feature prediction for learning visual representations from video, 2024

Adrien Bardes, Quentin Garrido, Jean Ponce, Xinlei Chen, Michael Rabbat, Yann LeCun, Mahmoud Assran, and Nicolas Ballas. Revisiting feature prediction for learning visual representations from video, 2024

work page 2024

[12] [12]

HaoChen, Adrien Gaidon, and Tengyu Ma

Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, and Tengyu Ma. Self-supervised learning is more robust to dataset imbalance, 2022

work page 2022

[13] [13]

The only EKG book you’ll ever need

Malcolm S Thaler. The only EKG book you’ll ever need. Lippincott Williams & Wilkins, 2021

work page 2021

[14] [14]

Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred Warmuth

Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, and Micah Goldblum. A cookbook of self-supervised learning, 2023. URL h...

work page arXiv 2023

[15] [15]

Extracting and composing robust features with denoising autoencoders

Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pages 1096–1103, 2008

work page 2008

[16] [16]

Learning by reconstruction produces uninformative features for perception,

Randall Balestriero and Yann LeCun. Learning by reconstruction produces uninformative features for perception,

work page

[17] [17]

URL https://arxiv.org/abs/2402.11337

work page arXiv

[18] [18]

Bootstrap your own latent-a new approach to self-supervised learning

Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems, 33:21271–21284, 2020

work page 2020

[19] [19]

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

Adrien Bardes, Jean Ponce, and Yann LeCun. Vicreg: Variance-invariance-covariance regularization for self- supervised learning, 2022. URL https://arxiv.org/abs/2105.04906

work page internal anchor Pith review Pith/arXiv arXiv 2022

[20] Xinlei Chen and Kaiming He. Exploring simple siamese representation learning, 2020. URL https://arxiv.org/abs/2011.10566

[21] Yann LeCun. A path towards autonomous machine intelligence version 0.9.2, 2022-06-27. https://openreview.net/forum?id=BZ5a1r-kVsf, 2022. Accessed: 2024-06-01.

[22] Dani Kiyasseh, Tingting Zhu, and David A Clifton. CLOCS: Contrastive learning of cardiac signals across space, time, and patients. In International Conference on Machine Learning, pages 5606–5615. PMLR, 2021.

[23] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. URL https://arxiv.org/abs/1807.03748

[25] Temesgen Mehari and Nils Strodthoff. Self-supervised representation learning from 12-lead ECG data. Computers in Biology and Medicine, 141:105114, 2022.

[26] Huaicheng Zhang, Wenhan Liu, Jiguang Shi, Sheng Chang, Hao Wang, Jin He, and Qijun Huang. MaeFE: Masked autoencoders family of electrocardiogram for self-supervised pretraining and transfer learning. IEEE Transactions on Instrumentation and Measurement, 72:1–15, 2022.

[27] Yeongyeon Na, Minje Park, Yunwon Tae, and Sunghoon Joo. Guiding masked representation learning to capture spatio-temporal relationship of electrocardiogram, 2024.

[28] Jianwei Zheng, Jianming Zhang, Sidy Danioko, Hai Yao, Hangyuan Guo, and Cyril Rakovski. A 12-lead electrocardiogram database for arrhythmia research covering more than 10,000 patients. Scientific Data, 7(1):48, 2020.

[29] Jianwei Zheng, Huimin Chu, Daniele Struppa, Jianming Zhang, Sir Magdi Yacoub, Hesham El-Askary, Anthony Chang, Louis Ehwerhemuepha, Islam Abudayyeh, Alexander Barrett, et al. Optimal multi-stage arrhythmia classification approach. Scientific Reports, 10(1):2898, 2020.

[30] Yu-Jhen Chen, Chien-Liang Liu, Vincent S Tseng, Yu-Feng Hu, and Shih-Ann Chen. Large-scale classification of 12-lead ECG with deep learning. In 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), pages 1–4. IEEE, 2019.

[31] Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dieter Kreiseler, Fatima I Lunze, Wojciech Samek, and Tobias Schaeffter. PTB-XL, a large publicly available electrocardiography dataset. Scientific Data, 7(1):1–15, 2020.

[32] Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu, Caiyun Ma, Shoushui Wei, Zhiqiang He, et al. An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection. Journal of Medical Imaging and Health Informatics, 8(7):1368–1373, 2018.

[33] Iana Sereda, Sergey Alekseev, Aleksandra Koneva, Roman Kataev, and Grigory Osipov. ECG segmentation by neural networks: Errors and correction. In 2019 International Joint Conference on Neural Networks (IJCNN), pages 1–7. IEEE, 2019.

[34] Viktor Moskalenko, Nikolai Zolotykh, and Grigory Osipov. Deep learning for ECG segmentation. In Advances in Neural Computation, Machine Learning, and Cognitive Research III: Selected Papers from the XXI International Conference on Neuroinformatics, October 7-11, 2019, Dolgoprudny, Moscow Region, Russia, pages 246–254. Springer, 2020.

[35] Zhenqin Chen, Mengying Wang, Meiyu Zhang, Wei Huang, Hanjie Gu, and Jinshan Xu. Post-processing refined ECG delineation based on 1D-UNet. Biomedical Signal Processing and Control, 79:104106, 2023.

[36] Chankyu Joung, Mijin Kim, Taejin Paik, Seong-Ho Kong, Seung-Young Oh, Won Kyeong Jeon, Jae-hu Jeon, Joong-Sik Hong, Wan-Joong Kim, Woong Kook, et al. Deep learning based ECG segmentation for delineation of diverse arrhythmias. PLoS ONE, 19(6):e0303178, 2024.

[37] Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Accurate, large minibatch SGD: Training ImageNet in 1 hour, 2018. URL https://arxiv.org/abs/1706.02677

[38] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.