BCG-FM: A Foundation Model for Ambient Cardiac Health Sensing

Ashish Neupane; David Q. Sun; Haejun Han; Magnus Ruud Kjaer

arxiv: 2606.07692 · v1 · pith:2NSZHWUYnew · submitted 2026-06-05 · 💻 cs.LG · cs.AI · cs.ET

Pith reviewed 2026-06-27 22:43 UTC · model grok-4.3

classification 💻 cs.LG cs.AI cs.ET

keywords: foundation model, ballistocardiography, BCG, ambient sensing, biological age, contrastive learning, biosignals, contactless

The pith

A foundation model pretrained on ambient bed-sensor recordings achieves 3.26-year mean absolute error on biological age estimation with frozen embeddings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes BCG-FM as the first foundation model for ambient mechanical biosignals captured passively via bed-embedded piezoelectric sensors. It pretrains this model using participant-level contrastive learning on 2.75 million hours of data from 145,985 individuals. The resulting frozen embeddings deliver a 3.26-year MAE for biological age, the lowest reported for contactless ambient methods, and enable discrimination of 15 health conditions across external cohorts. This approach outperforms a fully supervised model trained on more than six times the labeled data. Such passive sensing could support continuous health monitoring without user effort or specialized equipment.
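The frozen-embedding evaluation described above can be pictured as a small regression probe fitted on top of fixed vectors. The sketch below is an illustration only: the page does not state which probe the authors used, so ridge regression, the 4-dimensional embeddings, and all data are assumptions, and the synthetic numbers have no relation to the reported 3.26-year MAE.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ridge_probe(embeddings, ages, l2=1e-3):
    """Fit ridge regression on fixed embeddings; no encoder weights are updated."""
    rows = [e + [1.0] for e in embeddings]  # append a bias feature
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    for i in range(d):
        gram[i][i] += l2  # small penalty (also applied to the bias, for simplicity)
    rhs = [sum(r[i] * y for r, y in zip(rows, ages)) for i in range(d)]
    return solve_linear_system(gram, rhs)

def predict(weights, embeddings):
    return [sum(w * x for w, x in zip(weights, e + [1.0])) for e in embeddings]

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-ins for frozen embeddings and ages (hypothetical data).
rng = random.Random(0)

def make_split(n):
    emb = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    age = [50 + 10 * e[0] - 5 * e[1] + rng.gauss(0, 2) for e in emb]
    return emb, age

train_emb, train_age = make_split(200)
test_emb, test_age = make_split(100)
weights = fit_ridge_probe(train_emb, train_age)
probe_mae = mean_absolute_error(test_age, predict(weights, test_emb))
print(f"probe MAE on synthetic data: {probe_mae:.2f} years")
```

The point of the design is that only `weights` is learned from labels; whatever produced the embeddings stays untouched, which is what makes a low probe error evidence about the representation rather than about the probe.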

Core claim

BCG-FM demonstrates that participant-level contrastive pretraining on the largest raw BCG waveform corpus produces embeddings that support accurate downstream tasks including biological age estimation at 3.26 years MAE and clinically relevant classification of multiple self-reported health conditions, while generalizing to independent cohorts and scaling with pretraining batch size.

What carries the argument

Participant-level contrastive learning on raw nightly BCG waveforms recorded by piezoelectric sensors embedded in the bed surface, which generates general-purpose embeddings without task-specific labels during pretraining.
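As a rough picture of what a participant-level contrastive objective looks like, here is a minimal InfoNCE-style sketch. It assumes that two recordings from the same participant form the positive pair and that every other participant in the batch is a negative; the page does not give the paper's actual loss, temperature, or encoder, so `participant_info_nce`, the temperature of 0.1, and the toy vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def participant_info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE loss where view_a[i] and view_b[i] come from the same participant.

    Each row is the embedding of one recording. Row i of view_b is the positive
    for row i of view_a; all other rows of view_b act as negatives.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n

# Toy batch: three hypothetical participants with 2-D embeddings.
aligned_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
shuffled_b = [aligned_b[1], aligned_b[2], aligned_b[0]]  # positives deliberately broken

loss_aligned = participant_info_nce(aligned_a, aligned_b)
loss_shuffled = participant_info_nce(aligned_a, shuffled_b)
print(loss_aligned, loss_shuffled)
```

The loss is low when recordings from the same person sit close together and far from everyone else, which is also why the objective can reward any stable per-person cue, physiological or not.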

If this is right

Representation quality improves log-linearly as the contrastive batch size increases.
Pretrained representations used with only 500 labeled participants outperform a fully supervised baseline trained on 3,372 labeled participants.
The embeddings maintain performance across three independent external cohorts.
Discrimination is achieved for 15 different self-reported health conditions.
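The log-linear scaling claim in the list above amounts to saying that a downstream metric is roughly a straight line in the logarithm of the contrastive batch size. A minimal sketch of how one might check that is below; the batch sizes and error values are synthetic placeholders generated from an exact line, not results from the paper, and `fit_log_linear` is a hypothetical helper.

```python
import math

def fit_log_linear(batch_sizes, metric_values):
    """Least-squares fit of metric = intercept + slope * log2(batch_size)."""
    xs = [math.log2(b) for b in batch_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(metric_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, metric_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic placeholder values lying exactly on a log-linear curve.
batch_sizes = [256, 512, 1024, 2048, 4096]
probe_error = [10.0 - 0.5 * math.log2(b) for b in batch_sizes]

intercept, slope = fit_log_linear(batch_sizes, probe_error)
print(f"error changes by {slope:.2f} per doubling of batch size")
```

With real measurements, the residuals around the fitted line matter as much as the slope: a log-linear description is only convincing if they stay small across the whole range of batch sizes.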

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the contrastive embeddings truly capture physiological information, they could enable longitudinal tracking of health changes in everyday home settings.
The success with ambient data suggests similar pretraining strategies might apply to other passive biosignal modalities.
Log-linear scaling indicates potential benefits from even larger unlabeled datasets in future work.

Load-bearing premise

The downstream performance on age estimation and health condition discrimination arises from genuine physiological information in the BCG signals rather than from artifacts in the dataset or noise in the self-reported labels.

What would settle it

Training and evaluating the model on a completely new, independent dataset where the mean absolute error for biological age estimation exceeds 5 years or where discrimination of the health conditions falls to chance levels would falsify the claim that the pretrained embeddings carry useful physiological signal.
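The test described above can be written down as a simple decision rule. The sketch below assumes AUROC as the discrimination metric and a fixed margin above 0.5 as the definition of "better than chance"; both are illustrative choices rather than anything stated on this page, and a real analysis would use confidence intervals instead of a hard margin.

```python
def mean_absolute_error(true_ages, predicted_ages):
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def claim_survives(true_ages, predicted_ages, labels, scores,
                   mae_limit=5.0, chance_margin=0.05):
    """True unless MAE exceeds the limit or discrimination is at chance.

    mae_limit follows the 5-year threshold in the text; chance_margin is an
    assumed tolerance around AUROC = 0.5.
    """
    mae = mean_absolute_error(true_ages, predicted_ages)
    auc = auroc(labels, scores)
    return mae <= mae_limit and auc > 0.5 + chance_margin

# Toy numbers for illustration only.
ages = [40, 50, 60]
labels = [1, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.2]
ok = claim_survives(ages, [42, 47, 61], labels, scores)
broken = claim_survives(ages, [50, 40, 70], labels, scores)
print(ok, broken)
```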

Figures

Figures reproduced from arXiv: 2606.07692 by Ashish Neupane, David Q. Sun, Haejun Han, Magnus Ruud Kjaer.

**Figure 1.** Overview of BCG-FM. (a) A piezoelectric bed sensor captures nightly BCG recordings, decomposed into breathing and ...

**Figure 2.** BCG-FM age probe overview: (a) Batch-size scaling ...

Original abstract

Foundation models for wearable biosignals have matched or exceeded supervised specialists across a range of clinical tasks, yet all rely on modalities that require deliberate user action--wearing a device or visiting a sleep lab. We introduce BCG-FM, the first foundation model for ambient mechanical biosignals. A piezoelectric sensor embedded in the bed surface records ballistocardiography (BCG) each night without user effort; we pretrain BCG-FM with participant-level contrastive learning and using a total of 2.75 million hours of nightly recordings from 145,985 individuals, the largest raw-waveform biosignal pretraining corpus to date. Frozen BCG-FM embeddings achieve 3.26-year MAE on biological-age estimation (the lowest reported for any ambient, contactless modality) and yield clinically relevant discrimination across 15 self-reported health conditions and three independent external cohorts. Pretrained representations from only 500 labeled participants outperform a fully supervised baseline trained on 3,372, and representation quality scales log-linearly with contrastive batch size. These results establish ambient, longitudinal mechanical biosignals as a viable modality for health foundation models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

BCG-FM scales pretraining on bed BCG to a new size and gets label-efficient downstream results, but the contrastive setup needs checks against identity leakage before the physiology claims land.

The new piece is the scale: 2.75 million hours of raw nightly BCG from nearly 146k people, pretrained with participant-level contrastive learning, then frozen for age regression and 15-condition discrimination. The numbers (3.26-year MAE on age, beating a supervised baseline trained on about 6.7 times as many labels, 3,372 versus 500, plus transfer to external cohorts) are the first reported at this size for ambient mechanical signals. That part is straightforward and worth noting for anyone tracking contactless biosignals.

The soft spot is the one flagged in the stress test. Participant-level positives can easily encode stable per-person factors (sensor coupling, bed mechanics, nightly habits) instead of cross-person cardiac features. The abstract gives no ablation separating same-subject similarity from physiology, no subject-ID removal test, and no breakdown of how much the external-cohort gains depend on those cues. Without those, the label-efficiency story rests on an untested assumption. Methods details on splits, noise handling in self-reports, and error bars are also missing, which keeps the soundness low until the full paper is checked.

This is for groups already working on biosignal foundation models or ambient sensing. It is coherent on its own terms and shows clear thinking about scaling, so it deserves referee time even if the central assumption needs direct testing. I would send it to review.

Referee Report

2 major / 1 minor

Summary. The paper introduces BCG-FM, the first foundation model for ambient ballistocardiography (BCG) signals captured contactlessly via bed-embedded piezoelectric sensors. It pretrains on 2.75 million hours of raw nightly BCG waveforms from 145,985 participants using participant-level contrastive learning, then shows that frozen embeddings achieve 3.26-year MAE on biological-age regression (lowest reported for ambient modalities), clinically relevant discrimination on 15 self-reported conditions, and strong performance on three external cohorts. Pretrained representations from 500 labels outperform a supervised baseline trained on 3,372 labels, with log-linear scaling in contrastive batch size.

Significance. If the performance claims and generalization hold after proper validation, the work would establish ambient mechanical biosignals as a viable modality for health foundation models, highlighting label efficiency and scaling properties that could enable passive, longitudinal monitoring without user action. The scale of the pretraining corpus and the reported outperformance with fewer labels are notable strengths.

major comments (2)

[Abstract/Methods] Abstract and Methods: The reported MAE of 3.26 years, discrimination results, and outperformance over the supervised baseline (trained on 3,372 vs. 500 labels) are presented without details on validation splits, error bars, how the supervised baseline was trained, or handling of self-report label noise. These omissions are load-bearing for assessing whether the central performance claims are supported.
[Pretraining/Evaluation] Pretraining and Evaluation sections: The participant-level contrastive learning on 2.75M hours is claimed to produce transferable physiological embeddings that generalize to external cohorts. However, no ablation is provided to separate identity leakage (e.g., sensor coupling or nightly routine artifacts) from cardiac mechanics, such as same-participant vs. different-participant similarity breakdowns or adversarial subject-ID removal. This assumption is load-bearing for the generalization and scaling claims.
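The same-participant versus different-participant breakdown requested in the second comment could start from something as simple as the sketch below. It assumes cosine similarity over per-recording embeddings; the helper name, participant labels, and vectors are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def similarity_breakdown(embeddings, participant_ids):
    """Mean cosine similarity over same-participant and different-participant pairs."""
    same, different = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine(embeddings[i], embeddings[j])
        if participant_ids[i] == participant_ids[j]:
            same.append(sim)
        else:
            different.append(sim)
    return sum(same) / len(same), sum(different) / len(different)

# Toy embeddings: two recordings each from two hypothetical participants.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
participant_ids = ["p1", "p1", "p2", "p2"]

same_mean, different_mean = similarity_breakdown(embeddings, participant_ids)
print(f"same: {same_mean:.3f}  different: {different_mean:.3f}")
```

A large gap between the two means is expected from the training objective itself, so on its own it does not separate identity cues from physiology. The informative comparison is how the gap changes when nuisance factors are varied, for example the same person recorded on different hardware.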

minor comments (1)

[Methods] Notation for the contrastive loss and embedding dimensions should be defined explicitly in the methods to improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive review. We address each of the major comments below and have updated the manuscript to incorporate additional methodological details as requested.

Referee: [Abstract/Methods] Abstract and Methods: The reported MAE of 3.26 years, discrimination results, and outperformance over the supervised baseline (trained on 3,372 vs. 500 labels) are presented without details on validation splits, error bars, how the supervised baseline was trained, or handling of self-report label noise. These omissions are load-bearing for assessing whether the central performance claims are supported.

Authors: We agree that these details were insufficiently described. In the revised version, we have added a dedicated subsection in Methods detailing the participant-level cross-validation procedure (ensuring no data leakage across participants), reporting of mean and standard deviation across folds for all metrics, the exact training procedure for the supervised baseline (identical encoder trained with full supervision on the 3,372 labels), and a note on the limitations of self-reported labels including potential noise. These changes directly address the concerns and improve the clarity of the performance claims.

Revision: yes

Referee: [Pretraining/Evaluation] Pretraining and Evaluation sections: The participant-level contrastive learning on 2.75M hours is claimed to produce transferable physiological embeddings that generalize to external cohorts. However, no ablation is provided to separate identity leakage (e.g., sensor coupling or nightly routine artifacts) from cardiac mechanics, such as same-participant vs. different-participant similarity breakdowns or adversarial subject-ID removal. This assumption is load-bearing for the generalization and scaling claims.

Authors: We acknowledge the value of such ablations but maintain that the current evidence supports the claims. The external cohorts involve entirely new participants and hardware, providing a natural test against identity-specific leakage; the maintained performance there indicates that the embeddings capture generalizable cardiac signals. We have added text in the Pretraining and Evaluation sections to explicitly discuss this. While we agree an adversarial removal experiment would be informative, it is not feasible within the current computational budget and we do not believe it is required to support the generalization claims given the external validation results.

Revision: partial
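The participant-level cross-validation mentioned in the first response comes down to splitting by person rather than by recording. A minimal sketch follows, assuming a simple round-robin assignment of shuffled participant IDs to folds; the function name, fold count, and IDs are hypothetical and not taken from the paper.

```python
import random

def participant_kfold(participant_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no participant is on both sides.

    participant_ids[i] is the owner of recording i. Participants, not
    recordings, are assigned to folds.
    """
    unique_ids = sorted(set(participant_ids))
    random.Random(seed).shuffle(unique_ids)
    fold_of = {pid: k % n_folds for k, pid in enumerate(unique_ids)}
    for fold in range(n_folds):
        test_idx = [i for i, pid in enumerate(participant_ids) if fold_of[pid] == fold]
        train_idx = [i for i, pid in enumerate(participant_ids) if fold_of[pid] != fold]
        yield train_idx, test_idx

# Nine recordings from six hypothetical participants.
recording_owner = ["a", "a", "b", "c", "c", "c", "d", "e", "f"]
splits = list(participant_kfold(recording_owner, n_folds=3))

for train_idx, test_idx in splits:
    train_people = {recording_owner[i] for i in train_idx}
    test_people = {recording_owner[i] for i in test_idx}
    print(sorted(test_people), "held out;", len(train_idx), "training recordings")
```

Reporting the mean and standard deviation of a metric then means computing it once per fold and summarizing across the folds.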

Circularity Check

0 steps flagged

No circularity in derivation chain

The paper reports empirical results from participant-level contrastive pretraining on 2.75M hours of BCG waveforms followed by frozen-embedding evaluation on biological-age regression (3.26 yr MAE) and health-condition discrimination across external cohorts. No equations, first-principles derivations, or predictions appear that reduce these metrics to parameters fitted on the same data by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are invoked. The central claims rest on standard ML scaling and independent external validation rather than definitional equivalence, qualifying for score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; the central claims rest on the unstated assumption that contrastive pretraining on participant-level BCG produces generalizable representations and that self-reported conditions serve as valid evaluation targets. No free parameters or invented entities are described.

axioms (1)

domain assumption: Participant-level contrastive learning on raw BCG waveforms yields embeddings useful for biological age and health condition discrimination.
Invoked as the pretraining strategy that enables the reported downstream performance.

Reference graph

Works this paper leans on

28 extracted references · 5 canonical work pages

[1] K. He, et al. Momentum contrast for unsupervised visual representation learning. In CVPR, pages 9729–9738, 2020.
[2] M. Tan and Q. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In ICML, pages 6105–6114, 2019.
[3] H. Abbaspourazad, et al. Large-scale training of foundation models for wearable biosignals. In ICLR, 2024.
[4] A. C. Miller, J. Futoma, S. Abbaspourazad, C. Heinze-Deml, S. Emrani, I. Shapiro, and G. Sapiro. A wearable-based aging clock associates with disease and behavior. Nature Communications, 2025.
[5] R. Thapa, et al. A multimodal sleep foundation model for disease prediction. Nature Medicine, 32:752–762, 2026.
[6] J. Li, et al. An electrocardiogram foundation model built on over 10 million recordings with external evaluation across multiple domains. NEJM AI, 2(7), 2025.
[7] E. Coppola, M. Savardi, M. Massussi, M. Adamo, M. Metra, and A. Signoroni. HuBERT-ECG: A self-supervised foundation model for broad and scalable cardiac applications. medRxiv preprint, 2024.
[8] A. Pillai, D. Spathis, F. Kawsar, and M. Malekzadeh. PaPaGei: Open foundation models for optical physiological signals. In ICLR, 2025.
[9] S.-H. Liu, et al. Unobtrusive mattress-based identification of hypertension by integrating classification and association rule mining. Sensors, 19(7):1489, 2019. DOI: 10.3390/s19071489. Dataset: Figshare, DOI: 10.6084/m9.figshare.7594433.
[10] G. Nie, Q. Zhao, G. Tang, Y. Li, and S. Hong. Artificial intelligence-derived photoplethysmography age as a digital biomarker for cardiovascular health. Communications Medicine, 2025.
[11] S. Abbaspourazad, et al. Wearable accelerometer foundation models for health via knowledge distillation. arXiv preprint arXiv:2412.11276, 2025.
[12] G. Narayanswamy, et al. Scaling wearable foundation models. arXiv preprint arXiv:2410.13638, 2024.
[13] Y. Zhang, et al. SensorLM: Learning the language of wearable sensors. arXiv preprint arXiv:2506.09108, 2025.
[14] X. Gu, et al. Cardiac health assessment across scenarios and devices using a multimodal foundation model pretrained on data from 1.7 million individuals. Nature Machine Intelligence, 8:220–233, 2026.
[15] A. Sablayrolles, M. Douze, C. Schmid, and H. Jégou. Spreading vectors for similarity search. In ICLR, 2019.
[16] M. Oquab, et al. DINOv2: Learning robust visual features without supervision. TMLR, 2024.
[17] O. T. Inan, et al. Ballistocardiography and seismocardiography: A review of recent advances. IEEE J. Biomed. Health Inform., 19(4):1414–1427, 2015.
[18] C. Brüser, S. Winter, and S. Leonhardt. Robust inter-beat interval estimation in cardiac vibration signals. Physiological Measurement, 34(2):123–138, 2013.
[19] R. Yi, M. Enayati, J. M. Keller, M. Popescu, and M. Skubic. Non-invasive in-home sleep stage classification using a ballistocardiography bed sensor. In Proc. IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM), 2019.
[20] Ł. Dziuda, F. W. Skibniewski, M. Krej, and P. M. Baran. Bed-embedded heart and respiration rates detection by longitudinal ballistocardiography and pattern recognition. Sensors, 19(6):1451, 2019.
[21] J. W. Gordon. On certain molar movements of the human body produced by the circulation of the blood. J. Anat. Physiol., 11:533–536, 1877.
[22] A. Brink-Kjær, et al. Age estimation from sleep studies using deep learning predicts life expectancy. npj Digital Medicine, 5:103, 2022.
[23] J. Zhan, Z. Li, X. Wu, et al. A multi-pathology ballistocardiogram dataset for cardiac function monitoring and arrhythmia assessment. Scientific Data, 12:963, 2025. DOI: 10.1038/s41597-025-05287-z.
[24] J. Qiu, T. Lyu, L. Liu, et al. A ballistocardiogram dataset with reference ECG signals for bed-based heart rhythm assessment. Scientific Data, 12:1651, 2025. DOI: 10.1038/s41597-025-05936-3.
[25] E. M. Lima, et al. Deep neural network-estimated electrocardiographic age as a mortality predictor. Nature Communications, 12:5117, 2021. DOI: 10.1038/s41467-021-25351-7.
[26] D. Zhang, Y. She, J. Sun, Y. Cui, X. Yang, X. Zeng, and W. Qin. Brain age estimation from overnight sleep electroencephalography with multi-flow sequence learning. Nature and Science of Sleep, 16:879–896, 2024. DOI: 10.2147/NSS.S463495.
[27] G. Nie, X. Fang, G. Tang, Y. Xiao, J. Li, B. Liu, H. Li, and S. Hong. AnyPPG: An ECG-guided PPG foundation model trained on over 100,000 hours of recordings for holistic health profiling. arXiv preprint arXiv:2511.01747, 2025.
[28] A. van den Oord, Y. Li, and O. Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.

A BCG-specific augmentation analysis

We evaluate four domain-plausible augmentations for BCG: clipping, sensor cross-talk, Gaussian noise, and temporal dropout. These augmentations are designed to mimic realistic sig...
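The four augmentations named in the appendix fragment (clipping, sensor cross-talk, Gaussian noise, temporal dropout) are only listed, not specified, in the text available here. The sketch below shows one plausible reading of each on a 1-D signal; every function name, parameter, and the interpretation of cross-talk as leakage from a second channel is an assumption, not the paper's definition.

```python
import random

def clip(signal, limit):
    """Saturate samples at +/- limit, as an overloaded sensor front end might."""
    return [max(-limit, min(limit, s)) for s in signal]

def add_gaussian_noise(signal, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each sample."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def temporal_dropout(signal, start, length):
    """Zero out a contiguous window, mimicking a brief loss of signal."""
    return [0.0 if start <= i < start + length else s for i, s in enumerate(signal)]

def cross_talk(signal, other, leak):
    """Mix in a fraction `leak` of a second channel (assumed meaning of cross-talk)."""
    return [s + leak * o for s, o in zip(signal, other)]

# Toy waveform for illustration only.
rng = random.Random(0)
wave = [-3.0, 0.5, 2.0, 1.0]
clipped = clip(wave, 1.0)
dropped = temporal_dropout(wave, start=1, length=2)
mixed = cross_talk([1.0, 2.0], [10.0, 20.0], leak=0.1)
noisy = add_gaussian_noise(wave, sigma=0.05, rng=rng)
print(clipped, dropped, mixed)
```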
