BCG-FM: A Foundation Model for Ambient Cardiac Health Sensing
Pith reviewed 2026-06-27 22:43 UTC · model grok-4.3
The pith
A foundation model pretrained on ambient bed-sensor recordings achieves 3.26-year mean absolute error on biological age estimation with frozen embeddings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BCG-FM demonstrates that participant-level contrastive pretraining on the largest raw BCG waveform corpus produces embeddings that support accurate downstream tasks including biological age estimation at 3.26 years MAE and clinically relevant classification of multiple self-reported health conditions, while generalizing to independent cohorts and scaling with pretraining batch size.
What carries the argument
Participant-level contrastive learning on raw nightly BCG waveforms recorded by piezoelectric sensors embedded in the bed surface, which generates general-purpose embeddings without task-specific labels during pretraining.
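The page does not state the exact loss, so the following is only a minimal sketch of one common participant-level contrastive objective (InfoNCE). It assumes two segments from the same participant form a positive pair and that the other participants in the batch act as negatives. The names `info_nce` and `cosine` and the `temperature` value are hypothetical, not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are embeddings of two different segments
    recorded from participant i (an assumption about how pairs are built).
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n

# Toy batch of two participants with 2-D embeddings.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is near zero when each anchor matches its own participant's positive and large when the positives are swapped. Under this formulation a larger batch supplies more negatives per anchor.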
If this is right
- Representation quality improves log-linearly as the contrastive batch size increases.
- Pretrained representations, used with labels from only 500 participants, outperform a fully supervised baseline trained on 3,372 labeled participants.
- The embeddings maintain performance across three independent external cohorts.
- Discrimination is achieved for 15 different self-reported health conditions.
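"Log-linear" scaling in the first bullet can be read as score ≈ a + b·log(batch size). The sketch below fits that form by ordinary least squares, under that reading. The batch sizes and scores are synthetic placeholders chosen for illustration, not results from the paper, and `fit_log_linear` is a hypothetical helper name.

```python
import math

def fit_log_linear(batch_sizes, scores):
    """Least-squares fit of score = intercept + slope * log2(batch_size)."""
    xs = [math.log2(b) for b in batch_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic placeholder values, NOT results from the paper.
batch_sizes = [256, 512, 1024, 2048]
scores = [0.70, 0.72, 0.74, 0.76]
intercept, slope = fit_log_linear(batch_sizes, scores)
# slope is the gain in score per doubling of the batch size.
```

With a base-2 logarithm the slope reads directly as the improvement per doubling of the contrastive batch size, which is a convenient way to compare scaling curves.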
Where Pith is reading between the lines
- If the contrastive embeddings truly capture physiological information, they could enable longitudinal tracking of health changes in everyday home settings.
- The success with ambient data suggests similar pretraining strategies might apply to other passive biosignal modalities.
- Log-linear scaling indicates potential benefits from even larger unlabeled datasets in future work.
Load-bearing premise
The downstream performance on age estimation and health condition discrimination arises from genuine physiological information in the BCG signals rather than from artifacts in the dataset or noise in the self-reported labels.
What would settle it
Training and evaluating the model on a completely new, independent dataset where the mean absolute error for biological age estimation exceeds 5 years or where discrimination of the health conditions falls to chance levels would falsify the claim that the pretrained embeddings carry useful physiological signal.
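The falsification test above can be written down as a small check. This is a sketch only. The 5-year MAE limit comes from the text, whereas the `chance_margin` around an AUROC of 0.5 is an assumed stand-in for "chance levels", which the page does not quantify. All function names are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between true and predicted ages, in years."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def claim_is_falsified(age_true, age_pred, cond_labels, cond_scores,
                       mae_limit=5.0, chance_margin=0.05):
    """True if either failure condition from the text holds on a new cohort.

    chance_margin is an assumption: AUROC within this distance of 0.5
    is treated as chance-level discrimination.
    """
    mae = mean_absolute_error(age_true, age_pred)
    auc = auroc(cond_labels, cond_scores)
    return mae > mae_limit or abs(auc - 0.5) <= chance_margin
```

In practice the check would be run per condition and per cohort, with confidence intervals, rather than on a single point estimate.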
Original abstract
Foundation models for wearable biosignals have matched or exceeded supervised specialists across a range of clinical tasks, yet all rely on modalities that require deliberate user action: wearing a device or visiting a sleep lab. We introduce BCG-FM, the first foundation model for ambient mechanical biosignals. A piezoelectric sensor embedded in the bed surface records ballistocardiography (BCG) each night without user effort; we pretrain BCG-FM with participant-level contrastive learning on a total of 2.75 million hours of nightly recordings from 145,985 individuals, the largest raw-waveform biosignal pretraining corpus to date. Frozen BCG-FM embeddings achieve 3.26-year MAE on biological-age estimation (the lowest reported for any ambient, contactless modality) and yield clinically relevant discrimination across 15 self-reported health conditions and three independent external cohorts. Pretrained representations from only 500 labeled participants outperform a fully supervised baseline trained on 3,372, and representation quality scales log-linearly with contrastive batch size. These results establish ambient, longitudinal mechanical biosignals as a viable modality for health foundation models.
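"Frozen embeddings" means the pretrained encoder is not updated and only a lightweight model is fit on its outputs. The abstract does not say which downstream model is used, so the sketch below assumes a ridge-regression probe on precomputed embedding vectors. `ridge_fit` and `ridge_predict` are hypothetical helpers, and the toy data are invented for illustration.

```python
def ridge_fit(X, y, lam=1e-3):
    """Closed-form ridge regression; X is a list of frozen embedding vectors.

    A constant 1.0 is appended to each vector as a bias term. For simplicity
    the penalty lam is applied to the bias weight as well.
    """
    rows = [list(x) + [1.0] for x in X]
    d = len(rows[0])
    # Normal equations: (A^T A + lam * I) w = A^T y
    M = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        piv = max(range(col, d), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = M[r][col] / M[col][col]
            for c in range(col, d):
                M[r][c] -= f * M[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(M[i][j] * w[j] for j in range(i + 1, d))) / M[i][i]
    return w

def ridge_predict(w, X):
    """Predict targets (e.g. age in years) from embeddings with fitted weights w."""
    return [sum(wi * xi for wi, xi in zip(w, list(x) + [1.0])) for x in X]

# Toy embeddings and ages, invented for illustration only.
toy_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_ages = [40.0, 50.0, 45.0, 55.0]
weights = ridge_fit(toy_embeddings, toy_ages, lam=1e-9)
predicted = ridge_predict(weights, toy_embeddings)
```

Because only the probe weights are learned, downstream error under this setup reflects what the embeddings already encode rather than what a fine-tuned encoder could learn.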
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BCG-FM, the first foundation model for ambient ballistocardiography (BCG) signals captured contactlessly via bed-embedded piezoelectric sensors. It pretrains on 2.75 million hours of raw nightly BCG waveforms from 145,985 participants using participant-level contrastive learning, then shows that frozen embeddings achieve 3.26-year MAE on biological-age regression (lowest reported for ambient modalities), clinically relevant discrimination on 15 self-reported conditions, and strong performance on three external cohorts. Pretrained representations from 500 labels outperform a supervised baseline trained on 3,372 labels, with log-linear scaling in contrastive batch size.
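Two ratios implied by the figures in the summary are worth making explicit. The arithmetic below uses only numbers stated above. The 2.75 million hours figure is rounded in the source, so the per-participant average is approximate.

```python
total_hours = 2.75e6          # pretraining corpus size, as reported (rounded)
participants = 145_985        # individuals in the pretraining corpus
hours_per_participant = total_hours / participants

probe_labels = 500            # labeled participants used with pretrained representations
baseline_labels = 3_372       # labeled participants used by the supervised baseline
label_fraction = probe_labels / baseline_labels

print(f"{hours_per_participant:.1f} hours per participant on average")  # 18.8
print(f"{label_fraction:.1%} of the baseline's label count")            # 14.8%
```

So the corpus averages roughly 19 hours of recording per participant, and the label-efficiency claim compares about 15% of the labels against the full supervised set.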
Significance. If the performance claims and generalization hold after proper validation, the work would establish ambient mechanical biosignals as a viable modality for health foundation models, highlighting label efficiency and scaling properties that could enable passive, longitudinal monitoring without user action. The scale of the pretraining corpus and the reported outperformance with fewer labels are notable strengths.
major comments (2)
- [Abstract/Methods] Abstract and Methods: The reported MAE of 3.26 years, discrimination results, and outperformance over the supervised baseline (trained on 3,372 vs. 500 labels) are presented without details on validation splits, error bars, how the supervised baseline was trained, or handling of self-report label noise. These omissions are load-bearing for assessing whether the central performance claims are supported.
- [Pretraining/Evaluation] Pretraining and Evaluation sections: The participant-level contrastive learning on 2.75M hours is claimed to produce transferable physiological embeddings that generalize to external cohorts. However, no ablation is provided to separate identity leakage (e.g., sensor coupling or nightly routine artifacts) from cardiac mechanics, such as same-participant vs. different-participant similarity breakdowns or adversarial subject-ID removal. This assumption is load-bearing for the generalization and scaling claims.
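The same-participant versus different-participant similarity breakdown requested in the second major comment could start from something like the following sketch. It assumes per-recording embeddings and participant IDs are available as plain lists. The function names are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def similarity_breakdown(embeddings, participant_ids):
    """Mean cosine similarity for same-participant and different-participant pairs.

    Requires at least one pair of each kind.
    """
    same, different = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine(embeddings[i], embeddings[j])
        if participant_ids[i] == participant_ids[j]:
            same.append(sim)
        else:
            different.append(sim)
    return sum(same) / len(same), sum(different) / len(different)

# Toy example: two recordings each from participants "a" and "b".
toy_embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
toy_ids = ["a", "a", "b", "b"]
same_mean, different_mean = similarity_breakdown(toy_embeddings, toy_ids)
```

A large gap between the two means is expected from the contrastive objective itself, so on its own it does not separate cardiac mechanics from sensor or routine artifacts. It is only a starting diagnostic for the ablation the referee describes.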
minor comments (1)
- [Methods] Notation for the contrastive loss and embedding dimensions should be defined explicitly in the methods to improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive review. We address each of the major comments below and have updated the manuscript to incorporate additional methodological details as requested.
- Referee: [Abstract/Methods] Abstract and Methods: The reported MAE of 3.26 years, discrimination results, and outperformance over the supervised baseline (trained on 3,372 vs. 500 labels) are presented without details on validation splits, error bars, how the supervised baseline was trained, or handling of self-report label noise. These omissions are load-bearing for assessing whether the central performance claims are supported.
  Authors: We agree that these details were insufficiently described. In the revised version, we have added a dedicated subsection in Methods detailing the participant-level cross-validation procedure (ensuring no data leakage across participants), reporting of mean and standard deviation across folds for all metrics, the exact training procedure for the supervised baseline (identical encoder trained with full supervision on the 3,372 labels), and a note on the limitations of self-reported labels including potential noise. These changes directly address the concerns and improve the clarity of the performance claims.
  Revision: yes
- Referee: [Pretraining/Evaluation] Pretraining and Evaluation sections: The participant-level contrastive learning on 2.75M hours is claimed to produce transferable physiological embeddings that generalize to external cohorts. However, no ablation is provided to separate identity leakage (e.g., sensor coupling or nightly routine artifacts) from cardiac mechanics, such as same-participant vs. different-participant similarity breakdowns or adversarial subject-ID removal. This assumption is load-bearing for the generalization and scaling claims.
  Authors: We acknowledge the value of such ablations but maintain that the current evidence supports the claims. The external cohorts involve entirely new participants and hardware, providing a natural test against identity-specific leakage; the maintained performance there indicates that the embeddings capture generalizable cardiac signals. We have added text in the Pretraining and Evaluation sections to explicitly discuss this. While we agree an adversarial removal experiment would be informative, it is not feasible within the current computational budget and we do not believe it is required to support the generalization claims given the external validation results.
  Revision: partial
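The participant-level cross-validation mentioned in the first response, with mean and standard deviation reported across folds, might be organized as in this sketch. The round-robin fold assignment and the helper names `participant_folds` and `summarize` are assumptions. The rebuttal does not describe the actual procedure in this detail.

```python
import statistics

def participant_folds(participant_ids, k=5):
    """Split recording indices into k folds by participant.

    Every recording from a given participant lands in the same fold, so no
    participant appears in both the training and test side of any split.
    """
    unique = sorted(set(participant_ids))
    fold_of = {pid: idx % k for idx, pid in enumerate(unique)}  # round-robin (assumed)
    folds = []
    for f in range(k):
        test_idx = [i for i, pid in enumerate(participant_ids) if fold_of[pid] == f]
        train_idx = [i for i, pid in enumerate(participant_ids) if fold_of[pid] != f]
        folds.append((train_idx, test_idx))
    return folds

def summarize(fold_metrics):
    """Mean and sample standard deviation of a metric across folds."""
    return statistics.mean(fold_metrics), statistics.stdev(fold_metrics)
```

Splitting by recording instead of by participant would let the same person's nights appear on both sides of a split, which is the leakage the referee's first comment is concerned with.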
Circularity Check
No circularity in derivation chain
The paper reports empirical results from participant-level contrastive pretraining on 2.75M hours of BCG waveforms followed by frozen-embedding evaluation on biological-age regression (3.26 yr MAE) and health-condition discrimination across external cohorts. No equations, first-principles derivations, or predictions appear that reduce these metrics to parameters fitted on the same data by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are invoked. The central claims rest on standard ML scaling and independent external validation rather than definitional equivalence, qualifying for score 0.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Participant-level contrastive learning on raw BCG waveforms yields embeddings useful for biological age and health condition discrimination.
A BCG-specific augmentation analysis
We evaluate four domain-plausible augmentations for BCG: clipping, sensor cross-talk, Gaussian noise, and temporal dropout. These augmentations are designed to mimic realistic sig...
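The four augmentations named in the appendix excerpt (clipping, sensor cross-talk, Gaussian noise, temporal dropout) could take a form like the sketch below. Only the augmentation names come from the source. The function signatures, parameters, and the additive mixing model for cross-talk are illustrative assumptions.

```python
import random

def clip(signal, limit):
    """Saturate samples at +/- limit, imitating sensor clipping."""
    return [max(-limit, min(limit, x)) for x in signal]

def cross_talk(signal, other, mix):
    """Add a scaled copy of another channel (assumed additive leakage model)."""
    return [x + mix * o for x, o in zip(signal, other)]

def add_gaussian_noise(signal, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [x + rng.gauss(0.0, sigma) for x in signal]

def temporal_dropout(signal, start, length):
    """Zero out a contiguous window of samples."""
    return [0.0 if start <= i < start + length else x for i, x in enumerate(signal)]

# Example on a short toy waveform with a seeded generator for repeatability.
rng = random.Random(0)
toy = [0.2, 1.5, -2.0, 0.4, 0.1]
augmented = temporal_dropout(add_gaussian_noise(clip(toy, 1.0), 0.01, rng), 1, 2)
```

Each function returns a new list and leaves its input unchanged, so the augmentations can be composed in any order.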