pith. machine review for the scientific record.

arxiv: 2604.02501 · v1 · submitted 2026-04-02 · 📡 eess.SP

Recognition: 2 theorem links

ECG Foundation Models and Medical LLMs for Agentic Cardiovascular Intelligence at the Edge: A Review and Outlook

Pith reviewed 2026-05-13 20:19 UTC · model grok-4.3

classification 📡 eess.SP
keywords: ECG foundation models · medical LLMs · agentic AI · edge computing · cardiovascular intelligence · self-supervised learning · model optimization · wearable health monitoring

The pith

Next-generation cardiovascular AI will combine ECG foundation models with medical LLMs to create agentic, on-device systems for real-time heart monitoring and decision support.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that cardiovascular AI must shift from narrow, task-specific tools to inherently agentic systems that can interpret signals and reason about clinical context. ECG foundation models serve as generalist interpreters by learning rich representations from large unlabeled waveform data through self-supervised and multimodal pretraining. Medical LLMs supply the complementary backbone of biomedical knowledge for guideline alignment, inference, and report generation. The review examines how these two classes can be jointly optimized through quantization, pruning, and distillation to run efficiently on resource-limited wearables while preserving privacy and low latency. If the integration holds, it would enable continuous, contextual cardiovascular intelligence embedded directly in consumer devices rather than relying on cloud services.

Core claim

The central thesis is that next-generation cardiovascular AI systems will be inherently agentic, requiring the synergistic integration of ECG foundation models that act as signal-level interpreters learning rich electrophysiological representations via self-supervised and multimodal pretraining, and medical LLMs trained on biomedical text that function as knowledge-based reasoning backbones for contextual inference, guideline alignment, and clinical decision support, all while being optimized for deployment on edge devices such as smartwatches.

What carries the argument

The synergistic integration of ECG foundation models (signal interpreters via self-supervised pretraining and multimodal alignment) and medical LLMs (reasoning backbones for clinical context), jointly optimized with quantization, pruning, and distillation for edge constraints.
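Of the three optimization techniques named above, distillation is the least self-explanatory, so a minimal sketch may help. It assumes the common formulation in which a small student model is trained to match a larger teacher's temperature-softened output distribution. The logits, the temperature value, and the function names are illustrative assumptions, not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits over three rhythm classes (assumed values).
teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]
student_far = [0.5, 1.0, 4.0]

loss_same = distillation_loss(teacher, teacher)
loss_close = distillation_loss(teacher, student_close)
loss_far = distillation_loss(teacher, student_far)
```

A student that reproduces the teacher's distribution incurs zero loss, and the loss grows as the student's ranking of classes diverges from the teacher's. In practice this term is usually combined with a standard supervised loss on labeled data.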

If this is right

  • Enables zero-shot ECG classification, automated clinical report generation, and longitudinal risk modeling directly from waveform data.
  • Supports real-time, guideline-aligned decision support and contextual inference without transmitting raw patient data to the cloud.
  • Allows low-latency, energy-efficient operation on wearables through techniques such as quantization, pruning, and small language model distillation.
  • Outlines pathways for multimodal ECG-language models that combine signal interpretation with textual reasoning for explainable outputs.
  • Promotes privacy-preserving, secure cardiovascular analytics embedded in everyday consumer electronics ecosystems.
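To make the quantization step in the list above concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, one common scheme among several. The weight values are hypothetical, and the paper does not specify which quantization scheme any surveyed model uses.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integer codes."""
    return [v * scale for v in q]

# Hypothetical layer weights (assumed values for illustration).
weights = [0.82, -0.41, 0.05, -1.27, 0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case reconstruction error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most `scale / 2` per weight. Whether that error is clinically acceptable is exactly the trade-off the review leaves open.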

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Successful integration could shift cardiovascular care from episodic clinic visits to continuous, proactive monitoring embedded in daily-worn devices.
  • The approach may extend naturally to fusing ECG with other wearable signals such as photoplethysmography or accelerometer data for broader physiological agents.
  • Regulatory pathways for on-device medical agents will need new evaluation standards focused on combined signal-plus-reasoning performance rather than isolated model accuracy.
  • Edge deployment could reduce healthcare system load by handling routine monitoring locally while escalating only high-uncertainty cases to clinicians.
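The last point above, escalating only high-uncertainty cases, could be sketched as an entropy threshold on the on-device classifier's output. This is one possible rule under stated assumptions: the threshold fraction, the probability vectors, and the function names are all hypothetical, and the paper does not prescribe an escalation mechanism.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_escalate(probs, max_entropy_fraction=0.6):
    """Escalate when entropy exceeds a fraction of its maximum, log(n_classes)."""
    ceiling = math.log(len(probs))
    return predictive_entropy(probs) > max_entropy_fraction * ceiling

# Hypothetical on-device outputs over three classes (assumed values).
confident = [0.96, 0.02, 0.02]
ambiguous = [0.40, 0.35, 0.25]

handled_locally = not should_escalate(confident)
sent_to_clinician = should_escalate(ambiguous)
```

The confident prediction stays on the device, while the ambiguous one is flagged for clinician review. A deployed system would need the threshold calibrated against clinical outcomes rather than chosen by hand.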

Load-bearing premise

ECG foundation models and medical LLMs can be integrated and optimized for resource-constrained edge devices while maintaining high performance, zero-shot capabilities, and clinical reliability without major trade-offs in accuracy or latency.

What would settle it

A side-by-side test on a consumer smartwatch processor showing that any integrated ECG foundation model plus medical LLM system either exceeds 100 ms end-to-end latency or drops diagnostic accuracy by more than 5 percent relative to its cloud counterpart would falsify the feasibility of practical edge deployment.
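The criterion above can be written as a simple check. This sketch reads "5 percent relative to its cloud counterpart" as a relative accuracy drop, which is an interpretive assumption. The measurements passed in are hypothetical placeholders, not results from the paper.

```python
def edge_deployment_falsified(edge_latency_ms, edge_accuracy, cloud_accuracy,
                              latency_budget_ms=100.0, max_relative_drop=0.05):
    """Return True if the edge system breaks either threshold from the criterion."""
    relative_drop = (cloud_accuracy - edge_accuracy) / cloud_accuracy
    too_slow = edge_latency_ms > latency_budget_ms
    too_inaccurate = relative_drop > max_relative_drop
    return too_slow or too_inaccurate

# Hypothetical measurements (assumed values for illustration only).
within_limits = edge_deployment_falsified(80.0, 0.90, 0.93)   # fast, small drop
latency_fail = edge_deployment_falsified(120.0, 0.92, 0.93)   # exceeds 100 ms
accuracy_fail = edge_deployment_falsified(80.0, 0.85, 0.93)   # drop above 5 percent
```

Note that the criterion as stated is falsified only if every candidate system fails the check, so a single passing configuration would be enough to support feasibility.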

Figures

Figures reproduced from arXiv: 2604.02501 by Ahmad Nayfeh, Ali Ahmad Al-Shaikhi, Mudassir Hasan Khan, Mudassir Masood, Muhammad Mahboob Ur Rahman, Tareq Y. Al-Naffouri.

Figure 1. Evolution of AI-based ECG intelligence.
A. Large-Scale Pre-training and ECG Foundation Models. Large-scale pre-training on unlabeled ECG recordings has emerged as the cornerstone of modern ECG intelligence. ECGFounder [42], trained on 10.77 million ECGs from 1.82 million patients in the Harvard–Emory ECG Database, employs a RegNet-based encoder and positive label augmentation to address incomplete clinical…
Figure 2. The current research landscape of the use of Foundation models and LLMs for ECG analysis.
Figure 3. Model optimization, hardware-software co-design, and privacy-aware AI techniques to realize Agentic ECG intelligence.
Figure 4. Outlook for edge-aware Agentic AI ECG intelligence.
Original abstract

Electrocardiogram (ECG) foundation models represent a paradigm shift from task-specific pipelines to generalizable architectures pre-trained on large-scale unlabeled waveform data. This survey presents a unified and deployment-aware review of foundation models and medical large language models (LLMs) for ECG intelligence in cardiovascular disease (CVD) diagnosis, monitoring, and clinical decision support. The central thesis of this survey paper is that next-generation cardiovascular AI systems will be inherently agentic, requiring the synergistic integration of two complementary model classes: (i) ECG foundation models that act as signal-level interpreters, learning rich electrophysiological representations via self-supervised and multimodal pretraining, and (ii) medical LLMs, trained on biomedical text corpora, that function as knowledge-based reasoning backbones for contextual inference, guideline alignment, and clinical decision support. Thus, the survey systematically reviews the existing pool of generalist medical LLMs, as well as ECG foundation models that use techniques such as self-supervised learning, multimodal ECG-language alignment, and vision transformer architectures, and that possess capabilities such as zero-shot classification, automated report generation, and longitudinal risk modeling. Recognizing the constraints of consumer-grade wearable edge devices, we further examine model optimization techniques such as quantization, pruning, and knowledge distillation, as well as the role of small language models in enabling low-latency, energy-efficient, and privacy-preserving ECG intelligence on edge platforms such as smartwatches. Finally, we outline future directions in multimodal ECG foundation models, agent-driven monitoring, and explainable, secure edge intelligence, with particular emphasis on real-time, on-device cardiovascular analytics in consumer electronics ecosystems.
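The zero-shot classification mentioned in the abstract is typically obtained from ECG-language alignment: an ECG embedding is compared against text embeddings of candidate diagnoses in a shared space, and the closest label wins. The sketch below assumes that both encoders already exist and uses tiny hand-written vectors in their place. The embeddings, labels, and function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def zero_shot_classify(ecg_embedding, label_embeddings):
    """Pick the label whose text embedding is most similar to the ECG embedding."""
    scores = {label: cosine(ecg_embedding, emb)
              for label, emb in label_embeddings.items()}
    return max(scores, key=scores.get), scores

# Stand-ins for text-encoder outputs of diagnosis prompts (assumed values).
label_embeddings = {
    "normal sinus rhythm": [0.9, 0.1, 0.0],
    "atrial fibrillation": [0.1, 0.9, 0.2],
    "left bundle branch block": [0.0, 0.2, 0.9],
}
# Stand-in for the ECG encoder output of one recording (assumed values).
ecg_embedding = [0.2, 0.8, 0.3]

predicted, scores = zero_shot_classify(ecg_embedding, label_embeddings)
```

No classifier head is trained for the label set, which is why new diagnoses can be added by writing a new text prompt. The quality of the result depends entirely on how well the two encoders were aligned during pretraining.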

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. This survey synthesizes ECG foundation models pretrained via self-supervised and multimodal methods on large unlabeled waveform datasets, alongside medical LLMs trained on biomedical text. It advances the thesis that next-generation cardiovascular AI must be agentic, achieved through synergistic integration of signal-level interpreters (ECG models) and knowledge-reasoning backbones (medical LLMs), with explicit attention to optimization for resource-constrained edge devices such as wearables via quantization, pruning, and distillation, plus future directions in multimodal alignment, agent-driven monitoring, and explainable on-device analytics.

Significance. If the synthesis is accurate, the paper supplies a coherent roadmap for shifting from task-specific ECG pipelines to generalist, deployable agentic systems. It usefully highlights complementary strengths—rich electrophysiological representations from foundation models and guideline-aligned inference from LLMs—while addressing practical constraints of latency, energy, and privacy on consumer-grade hardware. The review of zero-shot classification, automated reporting, and longitudinal modeling provides a useful organizing frame for ongoing work in edge cardiovascular intelligence.

major comments (2)
  1. [§3] §3 (ECG foundation models): the claim that multimodal ECG-language alignment enables robust zero-shot classification is presented without quantitative cross-study benchmarks or failure-mode analysis; this weakens the load-bearing assertion that such models can serve as reliable signal interpreters in agentic edge pipelines without accuracy trade-offs.
  2. [§5] §5 (edge optimization and integration): the discussion of quantization, pruning, and small language models for low-latency deployment asserts feasibility for consumer wearables but does not address concrete latency-accuracy Pareto fronts or clinical validation requirements, which are central to the paper's outlook on privacy-preserving real-time analytics.
minor comments (3)
  1. [Abstract / Introduction] The abstract and introduction repeat the central thesis almost verbatim; a single crisp statement would improve readability.
  2. [§3] Several citations to recent ECG foundation model papers (e.g., those using vision transformers) appear without explicit comparison tables of pretraining objectives or dataset scales; adding such a table would strengthen the synthesis.
  3. [Throughout] Notation for model classes (e.g., “ECG-FM” vs. “Med-LLM”) is introduced inconsistently across sections; a short nomenclature table would aid clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the recommendation for minor revision. We address each major comment point by point below, with revisions planned to strengthen the synthesis where appropriate.

Point-by-point responses
  1. Referee: [§3] §3 (ECG foundation models): the claim that multimodal ECG-language alignment enables robust zero-shot classification is presented without quantitative cross-study benchmarks or failure-mode analysis; this weakens the load-bearing assertion that such models can serve as reliable signal interpreters in agentic edge pipelines without accuracy trade-offs.

    Authors: We appreciate this observation. As a survey, §3 synthesizes results reported across the cited literature on multimodal ECG-language models rather than presenting new benchmarks. Several referenced works do demonstrate competitive zero-shot performance, but we agree that the section would benefit from greater transparency. In the revised manuscript we will add a consolidated table summarizing key quantitative metrics (e.g., zero-shot accuracy on standard ECG benchmarks), datasets, and any failure modes or limitations explicitly noted in the original studies. This will provide readers with a clearer view of the current evidence base without altering the review’s scope. revision: yes

  2. Referee: [§5] §5 (edge optimization and integration): the discussion of quantization, pruning, and small language models for low-latency deployment asserts feasibility for consumer wearables but does not address concrete latency-accuracy Pareto fronts or clinical validation requirements, which are central to the paper's outlook on privacy-preserving real-time analytics.

    Authors: We thank the referee for highlighting this practical gap. The current text in §5 reviews optimization techniques drawn from the literature but does not aggregate specific latency-accuracy measurements or discuss clinical validation status. We will revise the section to include reported Pareto-front data from relevant edge-deployment studies on quantized ECG foundation models and small medical LLMs. We will also add a short discussion noting the limited availability of prospective clinical validation for on-device cardiovascular analytics and will frame this as a key open challenge for the field. These additions directly support the paper’s emphasis on privacy-preserving real-time deployment. revision: yes
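The latency-accuracy Pareto front requested by the referee can be computed from any set of measured configurations. The following is a minimal sketch: a configuration is kept only if no other configuration is at least as fast and at least as accurate while being strictly better on one axis. The configuration names and numbers are hypothetical placeholders, not measurements from the paper or its references.

```python
def pareto_front(configs):
    """Return configs not dominated on (lower latency, higher accuracy)."""
    front = []
    for name, lat, acc in configs:
        dominated = any(
            (l2 <= lat and a2 >= acc) and (l2 < lat or a2 > acc)
            for _, l2, a2 in configs
        )
        if not dominated:
            front.append((name, lat, acc))
    return sorted(front, key=lambda c: c[1])  # order by latency

# (name, latency in ms, accuracy) for hypothetical model variants.
configs = [
    ("fp32", 310.0, 0.94),
    ("fp16", 180.0, 0.94),
    ("int8", 95.0, 0.92),
    ("int8+pruned", 70.0, 0.89),
    ("int4", 75.0, 0.85),
]

front_names = [name for name, _, _ in pareto_front(configs)]
```

Here `fp32` drops out because `fp16` matches its accuracy at lower latency, and `int4` drops out because `int8+pruned` is both faster and more accurate. The remaining points are the trade-offs a deployment decision would actually choose among.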

Circularity Check

0 steps flagged

No significant circularity; survey paper with no derivations or predictions

Full rationale

This paper is a literature survey synthesizing existing work on ECG foundation models (self-supervised, multimodal) and medical LLMs. It advances no original mathematical derivations, equations, fitted parameters, or falsifiable predictions that could reduce to inputs by construction. The central thesis is explicitly framed as a forward-looking perspective on agentic integration for edge devices, not a claim derived from internal computations or self-citations. No load-bearing self-citation chains, self-definitional steps, or renamed known results appear. The paper is self-contained as a review and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces no free parameters, axioms, or invented entities because it is a review that organizes existing research without original derivations or new postulates.

pith-pipeline@v0.9.0 · 5621 in / 1249 out tokens · 41918 ms · 2026-05-13T20:19:19.891407+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

164 extracted references · 164 canonical work pages · 3 internal anchors

  1. [1]

    The global burden of cardiovascular disease,

    C. Deaton, E. S. Froelicher, L. H. Wu, C. Ho, K. Shishani, and T. Jaarsma, “The global burden of cardiovascular disease,” European Journal of Cardiovascular Nursing, vol. 10, no. 2 suppl, pp. S5–S13, 2011

  2. [2]

    Prognostic value of ecg findings for total, cardiovascular disease, and coronary heart disease death in men and women,

    D. De Bacquer, G. De Backer, M. Kornitzer, and H. Blackburn, “Prognostic value of ecg findings for total, cardiovascular disease, and coronary heart disease death in men and women,” Heart, vol. 80, no. 6, pp. 570–577, 1998

  3. [3]

    Ecg interpretation: clinical relevance, challenges, and advances,

    N. Rafie, A. H. Kashou, and P. A. Noseworthy, “Ecg interpretation: clinical relevance, challenges, and advances,” Hearts, vol. 2, no. 4, pp. 505–513, 2021

  4. [4]

    Detection of cardiovascular diseases in ecg images using machine learning and deep learning methods,

    M. B. Abubaker and B. Babayiğit, “Detection of cardiovascular diseases in ecg images using machine learning and deep learning methods,” IEEE Transactions on Artificial Intelligence, vol. 4, no. 2, pp. 373–382, 2022

  5. [5]

    Advancements in artificial intelligence for ecg signal analysis and arrhythmia detection: A review,

    F. Kazemi Lichaee et al., “Advancements in artificial intelligence for ecg signal analysis and arrhythmia detection: A review,” International Journal of Cardiovascular Practice, 2024

  6. [6]

    An automatic coronary microvascular dysfunction classification method based on hybrid ecg features and expert features,

    M. Jiang, F. Bian, J. Zhang, Z. Pu, H. Li, Y. Zhang, Y. Chu, Y. Fan, and J. Jiang, “An automatic coronary microvascular dysfunction classification method based on hybrid ecg features and expert features,” IEEE Journal of Biomedical and Health Informatics, 2024

  7. [7]

    Prediction of short-term mortality of cardiac care unit patients using image-transformed ecg waveforms,

    T. Kondo, A. Teramoto, E. Watanabe, Y. Sobue, H. Izawa, K. Saito, and H. Fujita, “Prediction of short-term mortality of cardiac care unit patients using image-transformed ecg waveforms,” IEEE Journal of Translational Engineering in Health and Medicine, vol. 11, pp. 191–198, 2023

  8. [8]

    Advances in deep learning for personalized ecg diagnostics: A systematic review addressing inter-patient variability and generalization constraints,

    C. Ding, T. Yao, C. Wu, and J. Ni, “Advances in deep learning for personalized ecg diagnostics: A systematic review addressing inter-patient variability and generalization constraints,” Biosensors and Bioelectronics, vol. 271, p. 117073, 2025

  9. [9]

    Deep learning and electrocardiography: systematic review of current techniques in cardiovascular disease diagnosis and management,

    Z. Wu and C. Guo, “Deep learning and electrocardiography: systematic review of current techniques in cardiovascular disease diagnosis and management,” BioMedical Engineering OnLine, 2025

  10. [10]

    Artificial intelligence in ecg diagnostics: where are we now?

    E. Androulakis and C. Fielder, “Artificial intelligence in ecg diagnostics: where are we now?” European Society of Cardiology, 2024

  11. [11]

    Health-llm: Large language models for health prediction via wearable sensor data,

    Y. Kim, X. Xu, D. McDuff, C. Breazeal, and H. W. Park, “Health-llm: Large language models for health prediction via wearable sensor data,” arXiv preprint arXiv:2401.06866, 2024

  12. [12]

    A comprehensive survey of foundation models in medicine,

    W. Khan, S. Leem, K. B. See, J. K. Wong, S. Zhang, and R. Fang, “A comprehensive survey of foundation models in medicine,” IEEE Reviews in Biomedical Engineering, 2025

  13. [13]

    Large language models encode clinical knowledge,

    K. Singhal, S. Azizi, T. Tu, S. S. Mahdavi, J. Wei, H. W. Chung, N. Scales, A. Tanwani, H. Cole-Lewis, S. Pfohl, et al., “Large language models encode clinical knowledge,” Nature, vol. 620, no. 7972, pp. 172–180, 2023

  14. [14]

    Towards expert-level medical question answering with large language models,

    K. Singhal, T. Tu, J. Gottweis, R. Sayres, E. Wulczyn, L. Hou, K. Clark, S. Pfohl, H. Cole-Lewis, D. Neal, et al., “Towards expert-level medical question answering with large language models,” arXiv preprint arXiv:2305.09617, 2023

  15. [15]

    Meditron-70b: Scaling medical pretraining for large language models,

    Z. Chen, A. H. Cano, A. Romanou, A. Bonnet, K. Matoba, F. Salvi, M. Pagliardini, S. Fan, A. Köpf, A. Mohtashami, et al., “Meditron-70b: Scaling medical pretraining for large language models,” arXiv preprint arXiv:2311.16079, 2023

  16. [16]

    Biomistral: A collection of open-source pretrained large language models for medical domains

    Y. Labrak, A. Bazoge, E. Morin, P.-A. Gourraud, M. Rouvier, and R. Dufour, “Biomistral: A collection of open-source pretrained large language models for medical domains,” arXiv preprint arXiv:2402.10373, 2024

  17. [17]

    Med42-v2: A suite of clinical llms,

    C. Christophe, P. K. Kanithi, T. Raha, S. Khan, and M. A. Pimentel, “Med42-v2: A suite of clinical llms,” arXiv preprint arXiv:2408.06142, 2024

  18. [18]

    Ecg semantic integrator (esi): A foundation ecg model pretrained with llm-enhanced cardiological text,

    H. Yu, P. Guo, and A. Sano, “Ecg semantic integrator (esi): A foundation ecg model pretrained with llm-enhanced cardiological text,” arXiv preprint arXiv:2405.19366, 2024

  19. [19]

    Medtsllm: Leveraging llms for multimodal medical time series analysis,

    N. Chan, F. Parker, W. Bennett, T. Wu, M. Y. Jia, J. Fackler, and K. Ghobadi, “Medtsllm: Leveraging llms for multimodal medical time series analysis,” arXiv preprint arXiv:2408.07773, 2024

  20. [20]

    The potential for large language models to transform cardiovascular medicine,

    G. Quer and E. J. Topol, “The potential for large language models to transform cardiovascular medicine,” The Lancet Digital Health, vol. 6, no. 10, pp. e767–e771, 2024

  21. [21]

    Mobile edge intelligence for large language models: A contemporary survey,

    G. Qu, Q. Chen, W. Wei, Z. Lin, X. Chen, and K. Huang, “Mobile edge intelligence for large language models: A contemporary survey,” IEEE Communications Surveys & Tutorials, vol. 27, no. 6, pp. 3820–3860, 2025

  22. [22]

    Artificial intelligence methods for analysis of electrocardiogram signals for early diagnosis of cardiac diseases,

    S. K. Saini and R. Gupta, “Artificial intelligence methods for analysis of electrocardiogram signals for early diagnosis of cardiac diseases,” SN Comput. Sci., vol. 3, pp. 1522–1565, 2022

  23. [23]

    Machine learning models for automated interpretation of 12-lead electrocardiographic signals: a narrative review of techniques, challenges, achievements and clinical relevance,

    P. Pantelidis, M. Bampa, E. Oikonomou, and P. Papapetrou, “Machine learning models for automated interpretation of 12-lead electrocardiographic signals: a narrative review of techniques, challenges, achievements and clinical relevance,” J Med Artif Intell, vol. 6, 2023

  24. [24]

    Revolutionizing electrocardiography: The role of artificial intelligence in ecg analysis and interpretation,

    S. N. Qayyum, I. Ullah, M. Riaz, M. K. Khan, G. B. Khan, R. Riaz, R. S. Anjum, and S. Noori, “Revolutionizing electrocardiography: The role of artificial intelligence in ecg analysis and interpretation,” Annals of Medicine & Surgery, vol. 87, pp. 161–170, 2025

  25. [25]

    Artificial intelligence-enhanced electrocardiography in cardiovascular disease management,

    K. C. Siontis, P. A. Noseworthy, Z. I. Attia, and P. A. Friedman, “Artificial intelligence-enhanced electrocardiography in cardiovascular disease management,” Nat Rev Cardiol, vol. 18, pp. 465–478, 2021

  26. [26]

    Ai-enhanced ecg applications in cardiology: Comprehensive insights from the current literature with a focus on covid-19 and multiple cardiovascular conditions,

    L. C. Nechita, A. Nechita, A. E. Voipan, D. Voipan, M. Debita, A. Fulga, I. Fulga, and C. L. Musat, “Ai-enhanced ecg applications in cardiology: Comprehensive insights from the current literature with a focus on covid-19 and multiple cardiovascular conditions,” Diagnostics, vol. 14, p. 1839, 2024

  27. [27]

    The pulse of artificial intelligence in cardiology: a comprehensive evaluation of state-of-the-art large language models for potential use in clinical cardiology,

    A. Novak, I. Zeljković, F. Rode, A. Lisičić, I. A. Nola, N. Pavlović, and Š. Manola, “The pulse of artificial intelligence in cardiology: a comprehensive evaluation of state-of-the-art large language models for potential use in clinical cardiology,” medRxiv, pp. 2023–08, 2023

  28. [28]

    Foundation models in electrocardiogram: A review,

    Y. Han, X. Liu, X. Zhang, and C. Ding, “Foundation models in electrocardiogram: A review,” arXiv preprint arXiv:2410.19877, 2024

  29. [29]

    Foundation models for biosignals: A survey,

    X. Gu, “Foundation models for biosignals: A survey,” Techrxiv, Aug. 2025. [Online]. Available: http://dx.doi.org/10.36227/techrxiv.175606236.62808131/v1

  30. [30]

    Deep learning for ECG arrhythmia detection and classification: an overview of progress for period 2017–2023,

    Y. Ansari, O. Mourad, K. Qaraqe, and E. Serpedin, “Deep learning for ECG arrhythmia detection and classification: an overview of progress for period 2017–2023,” Frontiers in Physiology, vol. 14, no. 1246746, 2023

  31. [31]

    Deep learning-based ECG arrhythmia classification: A systematic review,

    X. Li, L. Xu, X. Yao, S. Cheng, X. Yao, C. Jiang, and Z. Tang, “Deep learning-based ECG arrhythmia classification: A systematic review,” Applied Sciences, vol. 13, no. 8, 2023, art. 4964

  32. [32]

    A survey of model compression techniques: past, present, and future,

    A. Zhou, Y. Ma, J. Zhu, J. Liu, Z. Zhang, K. Yuan, and W. Sun, “A survey of model compression techniques: past, present, and future,” Neurocomputing, vol. 589, 2024, art. 127705

  33. [33]

    Empowering edge intelligence: A comprehensive survey on on-device AI models,

    X. Wang et al., “Empowering edge intelligence: A comprehensive survey on on-device AI models,” ACM Computing Surveys, 2025

  34. [34]

    Edge deep learning in computer vision and medical diagnostics: a comprehensive survey,

    A. Kumar et al., “Edge deep learning in computer vision and medical diagnostics: a comprehensive survey,” Artificial Intelligence Review, vol. 58, no. 33, 2025

  35. [35]

    Biogpt: generative pre-trained transformer for biomedical text generation and mining,

    R. Luo, L. Sun, Y. Xia, T. Qin, S. Zhang, H. Poon, and T.-Y. Liu, “Biogpt: generative pre-trained transformer for biomedical text generation and mining,” Briefings in Bioinformatics, vol. 23, no. 6, p. bbac409, 2022

  36. [36]

    Towards generalist biomedical ai,

    T. Tu, S. Azizi, D. Driess, M. Schaekermann, M. Amin, P.-C. Chang, A. Carroll, C. Lau, R. Tanno, I. Ktena, et al., “Towards generalist biomedical ai,” NEJM AI, vol. 1, no. 3, p. AIoa2300138, 2024

  37. [37]

    A survey of transformers and large language models for ecg diagnosis: advances, challenges, and future directions,

    M. Y. Ansari et al., “A survey of transformers and large language models for ecg diagnosis: advances, challenges, and future directions,” Artificial Intelligence Review, 2025

  38. [38]

    ECG-LM: Understanding electrocardiogram with a large language model,

    K. Yang et al., “ECG-LM: Understanding electrocardiogram with a large language model,” Health Data Science, vol. 5, no. 0221, 2025

  39. [39]

    Teach multimodal LLMs to comprehend electrocardiographic images,

    R. Liu, Y. Bai, X. Yue, and P. Zhang, “Teach multimodal LLMs to comprehend electrocardiographic images,” npj Digital Medicine, 2025

  40. [40]

    Medical multimodal foundation models in clinical diagnosis and treatment: Applications, challenges, and future directions,

    K. Sun, S. Xue, F. Sun, H. Sun, Y. Luo, L. Wang, S. Wang, N. Guo, L. Liu, T. Zhao, X. Wang, L. Yang, S. Jin, J. Yan, and J. Dong, “Medical multimodal foundation models in clinical diagnosis and treatment: Applications, challenges, and future directions,” Artificial Intelligence in Medicine, vol. 170, p. 103265, 2025. [Online]. Available: https://www.scien...

  41. [41]

    Gatortron: A large clinical language model to unlock patient information from unstructured electronic health records,

    X. Yang, A. Chen, N. PourNejatian, H. C. Shin, K. E. Smith, C. Parisien, C. Compas, C. Martin, M. G. Flores, Y. Zhang, et al., “Gatortron: A large clinical language model to unlock patient information from unstructured electronic health records,” arXiv preprint arXiv:2203.03540, 2022

  42. [42]

    An electrocardiogram foundation model built on over 10 million recordings,

    J. Li et al., “An electrocardiogram foundation model built on over 10 million recordings,” NEJM AI, vol. 2, no. 2, 2025

  43. [43]

    ECGFM: A foundation model for ECG analysis trained on a multi-center million-ECG dataset,

    H. Liu et al., “ECGFM: A foundation model for ECG analysis trained on a multi-center million-ECG dataset,” Information Fusion, vol. 124, no. 103410, 2025

  44. [44]

    HuBERT-ECG: A self-supervised foundation model for broad and scalable cardiac applications,

    D. Mazzoleni et al., “HuBERT-ECG: A self-supervised foundation model for broad and scalable cardiac applications,” medRxiv, 2024

  45. [45]

    ECG-FM: An open electrocardiogram foundation model,

    K. McKeen et al., “ECG-FM: An open electrocardiogram foundation model,” JAMIA Open, vol. 8, no. 5, 2025, art. ooaf122

  46. [46]

    Electrocardiogram foundation model using temporally-augmented patient-contrastive learning,

    A. Sharma et al., “Electrocardiogram foundation model using temporally-augmented patient-contrastive learning,” OpenReview, 2024

  47. [47]

    CLOCS: Contrastive learning of cardiac signals across space, time, and patients,

    D. Kiyasseh, T. Zhu, and D. A. Clifton, “CLOCS: Contrastive learning of cardiac signals across space, time, and patients,” in Proceedings of the International Conference on Machine Learning (ICML), 2021, pp. 5606–5615

  48. [48]

    Guiding masked representation learning to capture spatio-temporal relationship of electrocardiogram,

    Y. Na, M. Park, Y. Tae, and S. Joo, “Guiding masked representation learning to capture spatio-temporal relationship of electrocardiogram,” in Proceedings of the International Conference on Learning Representations (ICLR), 2024

  49. [49]

    Zero-shot ecg diagnosis with large language models and retrieval-augmented generation,

    H. Yu, P. Guo, and A. Sano, “Zero-shot ecg diagnosis with large language models and retrieval-augmented generation,” in Machine Learning for Health (ML4H). PMLR, 2023, pp. 650–663

  50. [50]

    Zero-shot ECG classification with multimodal learning and test-time clinical knowledge enhancement,

    C. Liu et al., “Zero-shot ECG classification with multimodal learning and test-time clinical knowledge enhancement,” in Proceedings of the International Conference on Machine Learning (ICML), vol. 41, 2024, pp. 31949–31963

  51. [51]

    Etp: Learning transferable ecg representations via ecg-text pre-training,

    C. Liu, Z. Wan, S. Cheng, M. Zhang, and R. Arcucci, “Etp: Learning transferable ecg representations via ecg-text pre-training,” in ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024, pp. 8230–8234

  52. [52]

    Electrocardiogram-language model for few-shot question answering with meta learning,

    J. Tang, T. Xia, Y. Lu, C. Mascolo, and A. Saeed, “Electrocardiogram-language model for few-shot question answering with meta learning,” arXiv preprint arXiv:2410.14464, 2024

  53. [53]

    Transfer knowledge from natural language to electrocardiography: Can we detect cardiovascular disease through language models?

    J. Qiu, W. Han, J. Zhu, M. Xu, M. Rosenberg, E. Liu, D. Weber, and D. Zhao, “Transfer knowledge from natural language to electrocardiography: Can we detect cardiovascular disease through language models?” arXiv preprint arXiv:2301.09017, 2023

  54. [54]

    Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG,

    Z. Wang et al., “Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG,” Cell Reports Medicine, vol. 5, no. 101875, 2024

  55. [55]

    Knowledge-enhanced multimodal ECG representation learning,

    Y. Zhang et al., “Knowledge-enhanced multimodal ECG representation learning,” in Findings of the Association for Computational Linguistics: EMNLP, 2025

  56. [56]

    From token to rhythm: A multi-scale approach for ECG-language pretraining,

    Z. Wang et al., “From token to rhythm: A multi-scale approach for ECG-language pretraining,” arXiv preprint arXiv:2506.21803, 2025

  57. [57]

    Fine-grained ECG-text contrastive learning via waveform understanding enhancement,

    X. Liu et al., “Fine-grained ECG-text contrastive learning via waveform understanding enhancement,” arXiv preprint arXiv:2505.11939, 2025

  58. [58]

    SuPreME: A supervised pre-training framework for multimodal ECG representation learning,

    J. Tang et al., “SuPreME: A supervised pre-training framework for multimodal ECG representation learning,” arXiv preprint arXiv:2502.19668, 2025

  59. [59]

    A foundational vision transformer improves diagnostic performance for electrocardiograms,

    A. Vaid et al., “A foundational vision transformer improves diagnostic performance for electrocardiograms,” npj Digital Medicine, vol. 6, no. 108, 2023

  60. [60]

    Biosignal copilot: Leveraging the power of llms in drafting reports for biomedical signals,

    C. Liu, Y. Ma, K. Kothur, A. Nikpour, and O. Kavehei, “Biosignal copilot: Leveraging the power of llms in drafting reports for biomedical signals,” medRxiv, pp. 2023–06, 2023

  61. [61]

    Ecg-chat: A large ecg-language model for cardiac disease diagnosis,

    Y. Zhao, T. Zhang, X. Wang, P. Han, T. Chen, L. Huang, Y. Jin, and J. Kang, “Ecg-chat: A large ecg-language model for cardiac disease diagnosis,” arXiv preprint arXiv:2408.08849, 2024

  62. [62]

    Electrocardiogram instruction tuning for report generation,

    Z. Wan, C. Liu, X. Wang, C. Tao, H. Shen, Z. Peng, J. Fu, R. Arcucci, H. Yao, and M. Zhang, “Electrocardiogram instruction tuning for report generation,” arXiv preprint arXiv:2403.04945, 2024

  63. [63]

    Q-HEART: ECG question answering via knowledge-informed multimodal LLMs,

    H. Nguyen et al., “Q-HEART: ECG question answering via knowledge-informed multimodal LLMs,” arXiv preprint arXiv:2505.06296, 2025

  64. [64]

    GEM: Empowering MLLM for grounded ECG understanding with time series and images,

    Y. Zhang et al., “GEM: Empowering MLLM for grounded ECG understanding with time series and images,” in Proceedings of the Neural Information Processing Systems (NeurIPS), 2025

  65. [65]

    Large language model-informed ecg dual attention network for heart failure risk prediction,

    C. Chen, L. Li, M. Beetz, A. Banerjee, R. Gupta, and V. Grau, “Large language model-informed ecg dual attention network for heart failure risk prediction,” arXiv preprint arXiv:2403.10581, 2024

  66. [66]

    Large language models for cuffless blood pressure measurement from wearable biosignals,

    Z. Liu, C. Chen, J. Cao, M. Pan, J. Liu, N. Li, F. Miao, and Y. Li, “Large language models for cuffless blood pressure measurement from wearable biosignals,” in Proceedings of the 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, 2024, pp. 1–11

  67. [67]

    In-situ dehydration monitoring via a stable diffusion-aided single-lead ecg iomt: Ml/dl models shine while llms hallucinate,

    L. Perzhilla, S. Siyoucef, R. Al-Aslani, M. M. U. Rahman, and T. Y. Al-Naffouri, “In-situ dehydration monitoring via a stable diffusion-aided single-lead ecg iomt: Ml/dl models shine while llms hallucinate,” IEEE Internet of Things Journal, 2025

  68. [68]

    PEFT QLORA-based fine-tuning of foundational models for vitals estimation using PPG and ECG-based medical IoT data: A feasibility study,

    S. A. Ali, M. W. Nawaz, J. Rashid, A. H. Mahmood, J. Kim, M. M. U. Rahman, and Q. H. Abbasi, “PEFT QLORA-based fine-tuning of foundational models for vitals estimation using PPG and ECG-based medical IoT data: A feasibility study,” in IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 2025

  69. [69]

    TolerantECG: A foundation model for imperfect electrocardiogram,

    M. Nguyen et al., “TolerantECG: A foundation model for imperfect electrocardiogram,” arXiv preprint arXiv:2507.09887, 2025

  70. [70]

    Contrastive multi-modal training with electrocardiography and natural language echocardiography reports for zero-shot prediction of structural heart disease,

    Y. Zhou et al., “Contrastive multi-modal training with electrocardiography and natural language echocardiography reports for zero-shot prediction of structural heart disease,” medRxiv, 2024

  71. [71]

    Assessing the performance of zero-shot visual question answering in multimodal large language models for 12-lead ecg image interpretation,

    T. Seki, Y. Kawazoe, Y. Akagi, T. Takiguchi, and K. Ohe, “Assessing the performance of zero-shot visual question answering in multimodal large language models for 12-lead ecg image interpretation,” medRxiv, pp. 2024–03, 2024

  72. [72]

    The impact of the mit-bih arrhythmia database,

    G. Moody and R. Mark, “The impact of the mit-bih arrhythmia database,” IEEE Engineering in Medicine and Biology Magazine, vol. 20, no. 3, pp. 45–50, 2001

  73. [73]

    A new method for detecting atrial fibrillation using rr intervals,

    G. Moody, “A new method for detecting atrial fibrillation using rr intervals,” Proc. Comput. Cardiol., vol. 10, pp. 227–230, 1983

  74. [74]

    Mit-bih supraventricular arrhythmia database,

    R. Mark, G. Moody, and S. Greenwald, “Mit-bih supraventricular arrhythmia database,” 1990

  75. [75]

    S. D. Greenwald, R. S. Patil, and R. G. Mark, Improved detection and classification of arrhythmias in noise-corrupted electrocardiograms using contextual information. IEEE, 1990

  76. [76]

    Long-term st database: a reference for the development and evaluation of automated ischaemia detectors and for the study of the dynamics of myocardial ischaemia,

    F. Jager, A. Taddei, G. B. Moody, M. Emdin, G. Antolič, R. Dorn, A. Smrdel, C. Marchesi, and R. G. Mark, “Long-term st database: a reference for the development and evaluation of automated ischaemia detectors and for the study of the dynamics of myocardial ischaemia,” Medical and Biological Engineering and Computing, vol. 41, no. 2, pp. 172–182, 2003

  77. [77]

    Biometric human identification based on ecg,

    T. S. Lugovaya, “Biometric human identification based on ecg,” PhysioNet, 2005

  78. [78]

    Nutzung der ekg-signaldatenbank cardiodat der ptb über das internet,

    R. Bousseljot, D. Kreiseler, and A. Schnabel, “Nutzung der ekg-signaldatenbank cardiodat der ptb über das internet,” 1995

  79. [79]

    PTB-XL, a large publicly available electrocardiography dataset,

    P. Wagner, N. Strodthoff, R.-D. Bousseljot, W. Samek, and T. Schaeffter, “PTB-XL, a large publicly available electrocardiography dataset,” PhysioNet, Apr. 2020, version 1.0.1. [Online]. Available: https://doi.org/10.13026/x4td-x982

  80. [80]

    Ptb-xl, a large publicly available electrocardiography dataset,

    P. Wagner, N. Strodthoff, R.-D. Bousseljot, D. Kreiseler, F. I. Lunze, W. Samek, and T. Schaeffter, “Ptb-xl, a large publicly available electrocardiography dataset,” Scientific data, vol. 7, no. 1, p. 154, 2020

Showing first 80 references.