PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning

Aaqib Saeed; Bin Zhu; Dong Ma; Hung Manh Pham; Jinyang Wu; Xiao Ma; Yiming Zhang; Yixin Xu; Zhou Pan

arXiv: 2603.03331 · v2 · submitted 2026-02-10 · cs.CL · cs.AI

Pith reviewed 2026-05-16 02:13 UTC · model grok-4.3

keywords photoplethysmography · PPG · question answering · multimodal learning · physiological monitoring · dataset · language models · biosignals

The pith

PulseLM reformats over a million PPG segments from sixteen sources into nearly 2.5 million natural-language question-answer pairs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PulseLM as a large dataset that connects raw photoplethysmography waveforms directly to text by turning existing numerical annotations into question-answer pairs. It pools recordings from sixteen public sources, standardizes them into more than one million 10-second segments, and produces almost 2.5 million QA pairs spread across twelve tasks. This QA structure lets multimodal large language models perform language-based inference on physiological signals instead of working only with numbers. The authors supply the data, pipelines, training recipes, and evaluation protocols so that different teams can run comparable experiments. A reader would care because the format makes it feasible to build intuitive text interfaces for continuous health monitoring that today depend on separate numerical models.

Core claim

PulseLM aggregates PPG recordings from sixteen publicly available sources and harmonizes heterogeneous annotations into 12 downstream tasks. The resulting dataset contains over 1 million standardized 10-second PPG segments paired with nearly 2.5 million question-answer pairs. The authors define reproducible data pipelines, training procedures, and evaluation protocols, then establish baseline benchmarks with multimodal PPG-aware large language models. This supplies a standardized foundation for language-grounded physiological inference, cross-dataset generalization, and scalable benchmarking of PPG-based multimodal models.

What carries the argument

The unified question-answering formulation that converts heterogeneous PPG numerical labels and measurements into natural-language question-answer pairs across twelve tasks.
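As a rough illustration of what such a conversion might look like, the sketch below turns one numeric heart-rate annotation into a question-answer pair. The `QAPair` structure, the question template, and the 60/100 bpm bins are all assumptions made for this example; the paper's actual templates, task names, and thresholds are not given in the material above and may differ.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    """Hypothetical container for one question-answer pair."""
    segment_id: str
    task: str
    question: str
    answer: str


# Illustrative template only; not taken from the paper.
HR_QUESTION = "What is the heart rate category of this PPG segment?"


def heart_rate_category(bpm: float) -> str:
    """Map a numeric heart rate to a coarse text label (assumed bins)."""
    if bpm < 60:
        return "bradycardia"
    if bpm <= 100:
        return "normal"
    return "tachycardia"


def to_qa_pair(segment_id: str, bpm: float) -> QAPair:
    """Turn one numeric annotation into a natural-language QA pair."""
    return QAPair(
        segment_id=segment_id,
        task="heart_rate_category",
        question=HR_QUESTION,
        answer=heart_rate_category(bpm),
    )


pair = to_qa_pair("seg-0001", 72.0)
print(pair.question, "->", pair.answer)
```

A deterministic mapping of this kind is easy to audit, but binning discards part of the numeric label, which is the information loss the load-bearing premise below is concerned with.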

If this is right

Multimodal models can be trained end-to-end to answer natural-language questions about PPG waveforms.
The single dataset format supports direct measurement of how well models generalize across different PPG collection devices and settings.
Reproducible training and evaluation protocols allow consistent comparison of new PPG-text methods against the provided baselines.
The 2.5 million QA pairs supply sufficient scale for fine-tuning or instruction-tuning large language models on physiological signals.
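One common way to measure cross-source generalization is a leave-one-source-out split, sketched below. This is a generic technique, not necessarily the protocol the paper uses; the `source` and `segment_id` field names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Record = Dict[str, str]


def leave_one_source_out(
    records: List[Record],
) -> Iterator[Tuple[str, List[Record], List[Record]]]:
    """Yield (held_out_source, train, test) with one source held out per fold."""
    by_source = defaultdict(list)
    for record in records:
        by_source[record["source"]].append(record)
    for held_out in sorted(by_source):
        # Train on every other source, test only on the held-out one.
        train = [r for r in records if r["source"] != held_out]
        test = by_source[held_out]
        yield held_out, train, test


example = [
    {"segment_id": "a1", "source": "A"},
    {"segment_id": "b1", "source": "B"},
    {"segment_id": "b2", "source": "B"},
]
for name, train, test in leave_one_source_out(example):
    print(name, len(train), len(test))
```

Holding out a whole source, rather than random segments, keeps device and cohort effects from leaking between training and test data.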

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Voice assistants or chat interfaces could eventually query wearable devices about real-time cardiovascular state using the same QA format.
The harmonized data may surface signal features that remain stable across clinical, lab, and consumer-grade PPG sensors.
Extending the QA pairs to include forward-looking questions could support predictive tasks such as estimating future blood-pressure trends from current waveforms.

Load-bearing premise

Reformatting numerical PPG labels from many different sources into a single question-answer format preserves enough clinical meaning for language models to perform accurate physiological inference.

What would settle it

If a multimodal model trained on PulseLM achieves no higher accuracy on the original numerical tasks than task-specific models when both are tested on held-out segments from the source datasets, the QA conversion would have lost critical information.
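To run that comparison on a regression-style task, the language model's free-text answers first have to be mapped back to numbers so both systems are scored with the same metric. The sketch below shows one simple way to do this, assuming the first number in the answer is the prediction; the function names are hypothetical and the parsing rule is an assumption, not the paper's evaluation protocol.

```python
import re
from typing import List, Optional

# Matches the first integer or decimal number in a string.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_numeric_answer(text: str) -> Optional[float]:
    """Return the first number found in a text answer, or None if absent."""
    match = _NUMBER.search(text)
    return float(match.group()) if match else None


def mean_absolute_error(predictions: List[str], targets: List[float]) -> float:
    """MAE between numbers parsed from text answers and numeric ground truth."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and aligned")
    errors = []
    for text, target in zip(predictions, targets):
        value = parse_numeric_answer(text)
        if value is None:
            raise ValueError(f"no numeric value in answer: {text!r}")
        errors.append(abs(value - target))
    return sum(errors) / len(errors)


print(mean_absolute_error(["about 70 bpm", "82"], [72.0, 80.0]))
```

Taking the first number is fragile when an answer mentions other quantities, so a real evaluation would need a stricter answer format or parser.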

Figures

Figures reproduced from arXiv: 2603.03331 by Aaqib Saeed, Bin Zhu, Dong Ma, Hung Manh Pham, Jinyang Wu, Xiao Ma, Yiming Zhang, Yixin Xu, Zhou Pan.

**Figure 1.** Overview of our dataset study.

**Figure 2.** Illustration of benchmarking PPG language modeling.

**Figure 3.** Demonstration of label distributions in PulseLM dataset.

Original abstract

Photoplethysmography (PPG) is a widely used non-invasive sensing modality for continuous cardiovascular and physiological monitoring across clinical, laboratory, and wearable settings. While existing PPG datasets support a broad range of downstream tasks, they typically provide supervision in the form of numerical measurements or task-specific labels, limiting their compatibility with language-based interfaces and multimodal foundation models. In this work, we introduce PulseLM, a large-scale PPG-text question-answering dataset that bridges raw PPG waveforms and natural language through a unified question-answering (QA) formulation. PulseLM aggregates PPG recordings from sixteen publicly available sources and harmonizes heterogeneous annotations into 12 downstream tasks. The dataset comprises over 1 million standardized 10-second PPG segments, associated with nearly 2.5 million question-answer pairs. We further define reproducible data pipeline, training, and evaluation protocols and establish baseline benchmarks using multimodal PPG-aware large language models. PulseLM provides a standardized foundation for studying language-grounded physiological inference, cross-dataset generalization, and scalable benchmarking of PPG-based multimodal models. We publicly release the dataset and code at https://huggingface.co/datasets/Manhph2211/PulseLM and https://github.com/manhph2211/PULSE-LM, respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PulseLM aggregates sixteen PPG sources into a 2.5M-pair QA dataset with released code, which fills a practical gap but leaves the harmonization step unvalidated.

PulseLM turns existing PPG recordings from sixteen public datasets into a single question-answering collection. The authors standardize more than a million 10-second segments and produce nearly 2.5 million QA pairs across twelve tasks, then release both the data on Hugging Face and the full pipeline on GitHub. That scale and the explicit training and evaluation protocols are the concrete advance. Earlier PPG work stayed in raw numbers or narrow labels, so a shared text interface lets people test language models on physiological signals without starting from scratch each time. The baselines with multimodal LLMs are a reasonable starting point for others to compare against. The soft spot is the conversion step itself. Different source datasets come from varied devices, cohorts, and original tasks, and the paper gives no quantitative checks on whether the resulting questions and answers keep the original clinical meaning or simply carry forward device-specific artifacts. Without inter-source consistency tests or expert review of the generated pairs, it is unclear how much noise the harmonization introduces. This paper is for researchers who need a ready benchmark for PPG-text or multimodal health models. Anyone already working on wearable sensing or language-grounded physiological inference will get immediate use from the released resource. It deserves peer review because the artifact is concrete and reproducible; reviewers will mainly press for evidence that the QA pairs are faithful to the source labels.

Referee Report

2 major / 2 minor

Summary. The paper introduces PulseLM, a large-scale PPG-text QA dataset that aggregates recordings from sixteen public sources into over 1 million standardized 10-second segments paired with nearly 2.5 million question-answer pairs spanning 12 downstream tasks. It defines reproducible data pipelines, training protocols, and evaluation benchmarks using multimodal PPG-aware LLMs, with public release of the dataset and code.

Significance. If the harmonization of heterogeneous annotations into QA format is shown to preserve clinical fidelity, PulseLM would provide a valuable standardized foundation for language-grounded physiological inference, cross-dataset generalization, and benchmarking of multimodal models. The public release of data and code plus the emphasis on reproducible pipelines are concrete strengths that would facilitate community adoption.

major comments (2)

[Methods (harmonization pipeline)] The harmonization process that converts numerical labels from heterogeneous sources (varying devices, sampling rates, and cohorts) into unified natural-language QA pairs lacks any reported quantitative checks on label fidelity, inter-source consistency, or expert validation of the generated pairs; this is load-bearing for the claim that the 2.5M pairs support reliable language-grounded inference.
[Experiments and baselines] Baseline results for the 12 tasks are presented without ablation studies isolating the effect of harmonization choices or metrics quantifying noise introduced by label conversion; without these, it is unclear whether downstream performance reflects true physiological signal or source-specific artifacts.

minor comments (2)

[Abstract] The abstract states that the dataset 'bridges raw PPG waveforms and natural language' but the precise mapping from 10-second segments to QA pairs should be illustrated with concrete examples in the main text.
[Dataset description] Table or figure captions describing the 12 tasks should explicitly list the original source labels that were mapped to each task to improve traceability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and will revise the paper accordingly to strengthen the validation of the harmonization pipeline and the experimental analyses.

Referee: The harmonization process that converts numerical labels from heterogeneous sources (varying devices, sampling rates, and cohorts) into unified natural-language QA pairs lacks any reported quantitative checks on label fidelity, inter-source consistency, or expert validation of the generated pairs; this is load-bearing for the claim that the 2.5M pairs support reliable language-grounded inference.

Authors: We acknowledge that the current manuscript does not report quantitative validation of the harmonization process. Section 3 details the deterministic rule-based mappings from source annotations to QA pairs, but we agree these lack explicit fidelity checks. In the revision, we will add: (i) inter-source consistency metrics computed on overlapping cohorts (e.g., agreement rates between original labels and QA-derived values), (ii) fidelity scores comparing numerical ground truth to QA interpretations on a 10k-segment held-out set, and (iii) results from expert clinician review of a 500-pair random sample assessing clinical accuracy and natural language quality. These additions will directly support the reliability of the 2.5M pairs. revision: yes
Referee: Baseline results for the 12 tasks are presented without ablation studies isolating the effect of harmonization choices or metrics quantifying noise introduced by label conversion; without these, it is unclear whether downstream performance reflects true physiological signal or source-specific artifacts.

Authors: We agree that the absence of targeted ablations limits interpretability of the baseline results. In the revised version, we will incorporate: (i) ablation experiments comparing multimodal LLM performance on harmonized QA pairs versus direct numerical supervision (where source labels permit), and (ii) noise quantification metrics including cross-source performance variance and label-perturbation sensitivity analysis. These will isolate the impact of harmonization choices and demonstrate that reported performance primarily reflects physiological signal rather than conversion artifacts. revision: yes
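The cross-source performance variance mentioned in the rebuttal could be computed along the following lines. This is a minimal sketch under stated assumptions: results arrive as `(source, correct)` pairs, accuracy is the per-source metric, and population variance is the spread measure. None of these choices is confirmed by the paper.

```python
from collections import defaultdict
from statistics import pvariance
from typing import Dict, List, Tuple


def per_source_accuracy(results: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Accuracy per source from (source, was_correct) pairs."""
    counts = defaultdict(lambda: [0, 0])  # source -> [correct, total]
    for source, correct in results:
        counts[source][0] += int(correct)
        counts[source][1] += 1
    return {source: c / t for source, (c, t) in counts.items()}


def cross_source_variance(results: List[Tuple[str, bool]]) -> float:
    """Population variance of per-source accuracy; higher means less uniform."""
    return pvariance(list(per_source_accuracy(results).values()))


demo = [("A", True), ("A", False), ("B", True), ("B", True)]
print(per_source_accuracy(demo), cross_source_variance(demo))
```

A large variance would suggest that performance depends on the source dataset, which is one symptom of the device-specific artifacts the referee raises.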

Circularity Check

0 steps flagged

No significant circularity in dataset aggregation and release

The paper's central contribution is the construction and public release of PulseLM, formed by aggregating 16 existing public PPG sources and converting their heterogeneous numerical annotations into a unified QA format across 12 tasks. No derivations, equations, fitted parameters, or model predictions are present that could reduce to inputs by construction. The work contains no self-citation chains, uniqueness theorems, or ansatzes that bear load on the claims; the harmonization process is described as a reproducible pipeline without invoking prior author results as external justification. This is a standard data-release paper whose validity rests on the transparency of the aggregation steps rather than any self-referential logic.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No mathematical derivations or fitted parameters; the work rests on the domain assumption that public PPG sources can be harmonized into QA without loss of utility.

axioms (1)

domain assumption: Heterogeneous PPG annotations from 16 sources can be reliably mapped to 12 unified downstream tasks via QA formulation.
Invoked in the abstract when describing harmonization of annotations.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

PulseLM aggregates PPG recordings from fifteen publicly available sources and harmonizes heterogeneous annotations into twelve common physiologically QA tasks... all PPG recordings are standardized through a unified preprocessing pipeline comprising four stages: Resampling... Filtering... Segmentation... Normalization.
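The four stages named in the quoted passage (resampling, filtering, segmentation, normalization) can be sketched in plain Python as below. Everything specific here is an assumption for illustration: the 125 Hz target rate, linear interpolation for resampling, a moving-average baseline removal standing in for whatever filter the paper applies, non-overlapping windows, and per-segment z-scoring. Only the 10-second segment length comes from the paper.

```python
import math
from typing import List


def resample_linear(signal: List[float], src_hz: float, dst_hz: float) -> List[float]:
    """Stage 1: resample onto a dst_hz grid by linear interpolation."""
    n_out = int(len(signal) * dst_hz / src_hz)
    out = []
    for i in range(n_out):
        pos = i * src_hz / dst_hz
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def remove_baseline(signal: List[float], window: int) -> List[float]:
    """Stage 2 (stand-in filter): subtract a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        start, end = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - sum(signal[start:end]) / (end - start))
    return out


def segment(signal: List[float], hz: float, seconds: float = 10.0) -> List[List[float]]:
    """Stage 3: cut into non-overlapping fixed-length windows, dropping the tail."""
    size = int(hz * seconds)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def zscore(values: List[float]) -> List[float]:
    """Stage 4: normalize one segment to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    if std == 0:
        return [0.0] * len(values)
    return [(x - mean) / std for x in values]


def preprocess(signal: List[float], src_hz: float, dst_hz: float = 125.0) -> List[List[float]]:
    """Run the four stages in order; dst_hz is a hypothetical target rate."""
    resampled = resample_linear(signal, src_hz, dst_hz)
    filtered = remove_baseline(resampled, window=int(dst_hz))
    return [zscore(s) for s in segment(filtered, dst_hz)]


# 25 s of a synthetic 1.2 Hz pulse-like wave sampled at 64 Hz.
raw = [math.sin(2 * math.pi * 1.2 * i / 64.0) for i in range(1600)]
segments = preprocess(raw, src_hz=64.0)
print(len(segments), len(segments[0]))
```

In practice a band-pass filter from a signal-processing library would usually replace the moving-average step; it is kept dependency-free here so the sketch runs as written.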

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Apple Watch

Online. Apple Watch. https://www.apple.com/sg/watch/

work page

[2] [2]

EmbracePlus | The world’s most advanced smartwatch for continuous health monitoring

Online. EmbracePlus | The world’s most advanced smartwatch for continuous health monitoring. https://www.empatica.com/en-int/embraceplus/

work page

[3] [3]

polarvantagev3

Online. polarvantagev3. https://www.polar.com/sg-en/vantage/v3

work page

[4] [4]

Sennheiser Momentum Sport

Online. Sennheiser Momentum Sport. https://newsroom.sennheiser.com/the- thrill-of-performance-mltzvt

work page

[5] [5]

Salar Abbaspourazad, Oussama Elachqar, Andrew Miller, Saba Emrani, Udhyaku- mar Nallasamy, and Ian Shapiro. 2024. Large-scale Training of Foundation Models for Wearable Biosignals. InThe Twelfth International Conference on Learning Rep- resentations

work page 2024

[6] [6]

Nicolas Aguirre, Edith Grall-Maës, Leandro J Cymberknop, and Ricardo L Armen- tano. 2021. Blood pressure morphology assessment from photoplethysmogram and demographic information using deep learning with attention mechanism. Sensors21, 6 (2021), 2167

work page 2021

[7] [7]

J Bacevičius, Z Abramikas, I Badaras, M Butkuvien˙e, S Daukantas, E Dvinelis, M Gudauskas, E Jukna, M Kiseli¯ute, R Kundelis, et al. 2024. Long-term electrocar- diogram and wrist-based photoplethysmogram recordings with annotated atrial fibrillation episodes.Dataset on Zenodo(2024)

work page 2024

[8] [8]

Peter H Charlton, Kevin Kotzen, Elisa Mejía-Mejía, Philip J Aston, Karthik Bu- didha, Jonathan Mant, Callum Pettit, Joachim A Behar, and Panicos A Kyriacou

work page

[9] [9]

Detecting beats in the photoplethysmogram: benchmarking open-source algorithms.Physiological Measurement43, 8 (2022), 085007

work page 2022

[10] [10]

S. K. Deric Tang, Y. Y. S. Goh, M. L. D. Wong, and Y. L. E. Lew. 2016. PPG signal reconstruction using a combination of discrete wavelet transform and empirical mode decomposition. IEEE, 1–4

work page 2016

[11] [11]

Ainara Garde, Parastoo Dehkordi, Walter Karlen, David Wensley, J Mark Anser- mino, and Guy A Dumont. 2014. Development of a screening tool for sleep disordered breathing in children using the phone Oximeter™.PloS one9, 11 (2014), e112959

work page 2014

[12] [12]

Sergio González, Wan-Ting Hsieh, and Trista Pei-Chun Chen. 2023. A bench- mark for machine-learning based non-invasive blood pressure estimation using photoplethysmogram.Scientific Data10, 1 (2023), 149

work page 2023

[13] [13]

Matthew Yiwen Ho, Hung Manh Pham, Aaqib Saeed, and Dong Ma. 2025. WF- PPG: A wrist-finger dual-channel dataset for studying the impact of contact pressure on PPG morphology.Scientific Data12, 1 (2025), 200

work page 2025

[14] [14]

Changshuo Hu, Hung Manh Pham, and Dong Ma. 2025. Morphology-Aware HRV Estimation from Wrist PPG in Sedentary Scenarios. InCompanion of the 2025 ACM International Joint Conference on Pervasive and Ubiquitous Computing. 745–750

work page 2025

[15] [15]

Edward J Hu, yelong shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. InInternational Conference on Learning Representations. https: //openreview.net/forum?id=nZeVKeeFYf9

work page 2022

[16] [16]

Mohamad Kachuee, Mohammad Kiani, Hoda Mohammadzade, and Mahdi Sha- bany. 2015. Cuff-Less Blood Pressure Estimation. UCI Machine Learning Reposi- tory. doi:10.24432/C5B602

work page doi:10.24432/c5b602 2015

[17] [17]

Mohamad Kachuee, Mohammad Mahdi Kiani, Hoda Mohammadzade, and Mahdi Shabany. 2015. Cuff-less high-accuracy calibration-free blood pressure estimation using pulse transit time. In2015 IEEE international symposium on circuits and systems (ISCAS). IEEE, 1006–1009

work page 2015

[18] [18]

Kianoosh Kazemi, Iman Azimi, Pasi Liljeberg, and Amir M Rahmani. 2025. Respi- ration Rate Estimation via Smartwatch-based Photoplethysmography and Ac- celerometer Data: A Transfer Learning Approach.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies9, 1 (2025), 1–24

work page 2025

[19] [19]

Hyung-Chul Lee, Yoonsang Park, Soo Bin Yoon, Seong Mi Yang, Dongnyeok Park, and Chul-Woo Jung. 2022. VitalDB, a high-fidelity multi-parameter vital signs database in surgical patients.Scientific Data9, 1 (2022), 279

work page 2022

[20] [20]

Yong-Xian Li, Jiong-Ling Huang, Xin-Yu Yao, Si-Qi Mu, Shou-Xin Zong, and Yan-Fei Shen. 2024. A ballistocardiogram dataset with reference sensor signals in long-term natural sleep environments.Scientific Data11, 1 (2024), 1091

[21]

Yongbo Liang, Zhencheng Chen, Guiyong Liu, and Mohamed Elgendi. 2018. A new, short-recorded photoplethysmogram dataset for blood pressure monitoring in China. Scientific Data 5, 1 (2018), 1–7

[22]

David Liu, Matthias Görges, and Simon A Jenkins. 2012. University of Queensland vital signs dataset: Development of an accessible repository of anesthesia patient monitoring data for research. Anesthesia & Analgesia 114, 3 (2012), 584–589

[23]

Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. 2023. Visual instruction tuning. Advances in Neural Information Processing Systems 36 (2023), 34892–34916

[24]

Zengding Liu, Bin Zhou, Zhiming Jiang, Xi Chen, Ye Li, Min Tang, and Fen Miao. 2022. Multiclass Arrhythmia Detection and Classification From Photoplethysmography Signals Using a Deep Convolutional Neural Network. Journal of the American Heart Association 11, 7 (2022), e023555

[26]

Dominique Makowski, Tam Pham, Zen J. Lau, Jan C. Brammer, François Lespinasse, Hung Pham, Christopher Schölzel, and S. H. Annabel Chen. 2021. NeuroKit2: A Python toolbox for neurophysiological signal processing. Behavior Research Methods 53, 4 (Feb. 2021), 1689–1696. doi:10.3758/s13428-020-01516-y

[27]

Manuel Meier, Berken Utku Demirel, and Christian Holz. 2024. WildPPG: A Real-World PPG Dataset of Long Continuous Recordings. Advances in Neural Information Processing Systems 37 (2024), 2246–2266

[28]

Alessandro Montanari, Andrea Ferlini, Ananta Narayanan Balaji, Cecilia Mascolo, and Fahim Kawsar. 2023. Earset: A multi-modal dataset for studying the impact of head and facial movements on in-ear PPG signals. Scientific Data 10, 1 (2023), 850

[29]

Jungwoo Oh, Gyubok Lee, Seongsu Bae, Joon-myoung Kwon, and Edward Choi. 2023. ECG-QA: A comprehensive question answering dataset combined with electrocardiogram. Advances in Neural Information Processing Systems 36 (2023), 66277–66288

[31]

Jiating Pan, Lishi Liang, Yongbo Liang, Qunfeng Tang, Zhencheng Chen, and Jianming Zhu. 2024. Robust modelling of arterial blood pressure reconstruction from photoplethysmography. Scientific Reports 14, 1 (2024), 30333

[32]

Fulai Peng, Zhengbo Zhang, Xiaoming Gou, Hongyun Liu, and Weidong Wang. 2014. Motion artifact removal from photoplethysmographic signals by combining temporally constrained independent component analysis and adaptive filter. BioMedical Engineering Online 13, 1 (April 2014). doi:10.1186/1475-925x-13-50

[34]

Hung Manh Pham, Matthew Yiwen Ho, Yiming Zhang, Dimitris Spathis, Aaqib Saeed, and Dong Ma. 2025. Reliable wrist PPG monitoring by mitigating poor skin sensor contact. Scientific Reports (2025)

[35]

Hung Manh Pham, Jialu Tang, Aaqib Saeed, and Dong Ma. 2025. Q-HEART: ECG Question Answering via Knowledge-Informed Multimodal LLMs. In Proceedings of the European Conference on Artificial Intelligence (ECAI) (Frontiers in Artificial Intelligence and Applications, Vol. 413). IOS Press, 4545–4552. doi:10.3233/FAIA251356

[36]

Arvind Pillai, Dimitris Spathis, Fahim Kawsar, and Mohammad Malekzadeh. 2025. PaPaGei: Open Foundation Models for Optical Physiological Signals. In The Thirteenth International Conference on Learning Representations, ICLR 2025. Singapore. https://arxiv.org/abs/2410.20542 arXiv preprint arXiv:2410.20542

[38]

Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki Van Stein, and Thomas Bäck. 2025. Multi-step reasoning with large language models, a survey. Comput. Surveys 58, 6 (2025), 1–35

[39]

Attila Reiss, Ina Indlekofer, and Philip Schmidt. 2019. PPG-DaLiA. UCI Machine Learning Repository. doi:10.24432/C53890

[40]

Attila Reiss, Ina Indlekofer, Philip Schmidt, and Kristof Van Laerhoven. 2019. Deep PPG: Large-scale heart rate estimation with convolutional neural networks. Sensors 19, 14 (2019), 3079

[41]

Ruoqi Liu, Yuelin Bai, Xiang Yue, and Ping Zhang. 2024. Teach Multimodal LLMs to Comprehend Electrocardiographic Images. arXiv preprint arXiv:2410.19008 (2024)

[42]

Mithun Saha, Maxwell A. Xu, Wanting Mao, Sameer Neupane, James M. Rehg, and Santosh Kumar. 2025. Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications across Lab and Field Settings. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 9, 3, Article 126 (Sept. 2025), 35 pages. doi:10.1145/3749494

[43]

Philip Schmidt, Attila Reiss, Robert Duerichen, Claus Marberger, and Kristof Van Laerhoven. 2018. Introducing WESAD, a multimodal dataset for wearable stress and affect detection. In Proceedings of the 20th ACM international conference on multimodal interaction. 400–408

[44]

Andrew Sellergren, Sahar Kazemzadeh, Tiam Jaroensri, Atilla Kiraly, Madeleine Traverse, Timo Kohlberger, Shawn Xu, Fayaz Jamil, Cían Hughes, Charles Lau, et al. 2025. MedGemma Technical Report. arXiv preprint arXiv:2507.05201 (2025)

[45]

Qwen Team. 2025. Qwen3 Technical Report. arXiv:2505.09388 [cs.CL] https://arxiv.org/abs/2505.09388

[46]

Min Wang, Zhe Li, Qirui Zhang, and Guoxing Wang. 2019. Removal of Motion Artifacts in Photoplethysmograph Sensors during Intensive Exercise for Accurate Heart Rate Calculation Based on Frequency Estimation and Notch Filtering. Sensors 19, 15 (July 2019), 3312. doi:10.3390/s19153312

[47]

Jingye Xu, Yuntong Zhang, Wei Wang, Mimi Xie, and Dakai Zhu. 2025. A Comprehensive PPG-based Dataset for HR/HRV Studies. arXiv preprint arXiv:2505.18165 (2025)

[48]

Amir Hosein Afandizadeh Zargari, Seyed Amir Hossein Aqajari, Hadi Khodabandeh, Amir Rahmani, and Fadi Kurdahi. 2023. An Accurate Non-accelerometer-based PPG Motion Artifact Removal Technique using CycleGAN. ACM Transactions on Computing for Healthcare 4, 1 (Jan. 2023), 1–14. doi:10.1145/3563949

[49]

Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A Ali Heydari, Girish Narayanswamy, Maxwell A Xu, Ahmed A Metwally, Shawn Xu, Jake Garrison, Xuhai Xu, et al. 2025. SensorLM: Learning the Language of Wearable Sensors. arXiv preprint arXiv:2506.09108 (2025).

A Appendix

A.1 Source Dataset Details

In our study, we utilize various public PPG datasets as sources to construct the QA dataset. In this section, we will intr...
