arxiv: 2605.30865 · v1 · pith:4J7VWB65new · submitted 2026-05-29 · 💻 cs.LG

GlucoFM: A Dual-Stream Foundation Model for Continuous Glucose Monitoring

Pith reviewed 2026-06-28 23:22 UTC · model grok-4.3

classification 💻 cs.LG
keywords continuous glucose monitoring · foundation model · dual-stream architecture · glycemic dynamics · clinical prediction · transfer learning · diabetes risk · linear probing

The pith

GlucoFM separates CGM traces into slow physiological state and transient event streams to achieve the best subject-disjoint performance on seven clinical prediction tasks across four cohorts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GlucoFM as a foundation model that aligns irregular glucose recordings to a daily grid and explicitly decomposes each trace into a slow baseline stream and a transient deviation stream. It pretrains this architecture on 109,066 hours of unlabeled data from 477 subjects using two objectives: one operates on fused daily representations, the other on the separated state and event streams. The central claim is that this decomposition supplies an inductive bias that produces more transferable representations than single-stream alternatives, which is tested through linear probing on downstream diabetes-related tasks. A sympathetic reader would care because better transferable CGM representations could support more accurate risk assessment and treatment decisions without requiring large labeled datasets for every new cohort or task.

Core claim

GlucoFM aligns CGM recordings to a 24-hour chronological grid while preserving observation masks, and decomposes glucose dynamics into a slow physiological state stream and a transient event stream. It is pretrained with two objectives: masked contextual latent prediction over fused daily representations, and temporal dynamics prediction over the state and event streams. Across four cohorts and seven tasks this yields the strongest subject-disjoint linear-probing results among evaluated baselines, improving average PR-AUC by 4.1 points over the best prior CGM-specific foundation model and leading on all diabetes-risk and β-cell dysfunction tasks plus three of four insulin-resistance tasks.
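
To make the grid-alignment step concrete, here is a minimal sketch of placing irregular readings on a fixed 24-hour grid while keeping an observation mask. The function name, the 5-minute step, and the choice to average readings that share a slot are illustrative assumptions; the abstract does not state the grid resolution or how collisions are resolved.

```python
from typing import List, Optional, Tuple


def align_to_daily_grid(
    readings: List[Tuple[float, float]],
    step_minutes: int = 5,  # assumed resolution, not taken from the paper
) -> Tuple[List[Optional[float]], List[int]]:
    """Place (minute_of_day, glucose) readings on a fixed 24-hour grid.

    Returns the grid values (None where nothing was observed) and a 0/1
    observation mask of the same length.
    """
    n_slots = 24 * 60 // step_minutes
    sums = [0.0] * n_slots
    counts = [0] * n_slots
    for minute, glucose in readings:
        # Map the timestamp to its slot; readings in the same slot are averaged.
        slot = int(minute // step_minutes) % n_slots
        sums[slot] += glucose
        counts[slot] += 1
    values = [sums[i] / counts[i] if counts[i] else None for i in range(n_slots)]
    mask = [1 if c else 0 for c in counts]
    return values, mask
```

Keeping the mask separate from the values lets a downstream model distinguish a missing slot from a measured one instead of relying on imputed numbers.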

What carries the argument

The dual-stream decomposition of glucose dynamics into a slow physiological state stream and a transient event stream, together with the two pretraining objectives that operate on fused and separated representations.
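
The idea of a slow stream plus a transient stream can be illustrated with a simple stand-in: a mask-aware centered moving average as the slow baseline, and the residual as the event stream. This is only a sketch of the general idea. The abstract does not say how GlucoFM computes its streams, and the function name and window size below are hypothetical.

```python
from typing import List, Optional, Tuple


def split_state_and_event(
    values: List[Optional[float]],
    half_window: int = 12,  # assumed smoothing width, in grid slots
) -> Tuple[List[Optional[float]], List[Optional[float]]]:
    """Split a gridded trace into a slow baseline and a residual deviation.

    Missing slots (None) are skipped when averaging, so gaps do not drag the
    baseline toward zero.
    """
    n = len(values)
    state: List[Optional[float]] = []
    event: List[Optional[float]] = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        observed = [v for v in values[lo:hi] if v is not None]
        baseline = sum(observed) / len(observed) if observed else None
        state.append(baseline)
        if values[i] is None or baseline is None:
            event.append(None)  # no deviation can be defined for a missing slot
        else:
            event.append(values[i] - baseline)
    return state, event
```

By construction, the two streams sum back to the observed value wherever a reading exists, so nothing is lost by the split.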

If this is right

  • GlucoFM leads on every diabetes-risk and β-cell dysfunction task and on three of four insulin-resistance tasks.
  • It achieves the best overall cross-dataset transfer performance among evaluated methods.
  • It shows strong few-shot adaptation and consistent gains when multiple days are aggregated for subject-level prediction.
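
One simple way to aggregate multiple days into a subject-level prediction is to average the per-day scores for each subject. The sketch below assumes mean pooling of day-level probabilities; the abstract does not specify the aggregation rule, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_by_subject(day_scores: List[Tuple[str, float]]) -> Dict[str, float]:
    """Average day-level predicted probabilities into one score per subject."""
    grouped = defaultdict(list)
    for subject_id, prob in day_scores:
        grouped[subject_id].append(prob)
    return {subject: sum(probs) / len(probs) for subject, probs in grouped.items()}
```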

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same decomposition could be tested on other physiological time series that contain both slow baselines and acute events, such as heart-rate or activity data.
  • If the transient stream primarily isolates sensor artifacts, downstream models might use it to flag unreliable segments without additional supervision.
  • Aggregating predictions across days as described could be extended to produce subject-level risk scores suitable for longitudinal monitoring.

Load-bearing premise

Explicitly splitting glucose dynamics into slow physiological state and transient event streams supplies a useful inductive bias for transferable representations.

What would settle it

On the same four cohorts and seven tasks, a single-stream model that matches or exceeds GlucoFM's average PR-AUC under identical subject-disjoint linear-probing evaluation would falsify the claimed advantage of the decomposition.
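
Since the deciding comparison is stated in PR-AUC, it helps to see what the metric computes. The sketch below uses the average-precision estimate of PR-AUC: the mean of the precision values at each correctly ranked positive. Whether the paper uses this estimator or a trapezoidal one is not stated, and ties in scores are broken by input order here.

```python
from typing import List


def average_precision(labels: List[int], scores: List[float]) -> float:
    """Average precision over a ranking: mean precision at each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos
```

A gain of "4.1 points" then corresponds to a difference of 0.041 between two such values averaged over tasks.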

Original abstract

Continuous glucose monitoring (CGM) provides a dense view of daily metabolic physiology, yet existing generic time-series and CGM-specific foundation models often encode glucose traces as entangled single-stream sequences, leaving the distinct temporal structure of glycemic dynamics only implicitly modeled. We present GlucoFM, a lightweight CGM foundation model that aligns irregular recordings to a 24-hour chronological grid, preserves observation masks, and decomposes glucose dynamics into slow physiological state and transient event streams, capturing low-frequency glycemic baselines and short-term deviations that may reflect acute physiological responses or sensor artifacts. GlucoFM is pretrained on 109,066 hours of unlabeled CGM recordings from 477 subjects with two complementary objectives: masked contextual latent prediction over fused daily representations and temporal dynamics prediction over state and event streams. Across four diverse cohorts and seven clinical prediction tasks, GlucoFM achieves the strongest subject-disjoint linear-probing performance among evaluated baselines, improving average PR-AUC by 4.1 points over the best CGM-specific foundation model. Its gains are most pronounced on core metabolic outcomes, leading PR-AUC on all diabetes-risk and $\beta$-cell dysfunction tasks and on 3 of 4 insulin-resistance tasks. GlucoFM also achieves the best overall cross-dataset transfer performance and strong few-shot adaptation among evaluated methods, and consistent gains when aggregating multiple days for subject-level prediction, highlighting physiology-aware decomposition as an effective inductive bias for transferable CGM representation learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces GlucoFM, a lightweight foundation model for continuous glucose monitoring (CGM) that aligns irregular recordings to a 24-hour grid, preserves masks, and explicitly decomposes glucose dynamics into slow physiological state and transient event streams. It is pretrained on 109,066 hours from 477 subjects using masked contextual latent prediction and temporal dynamics prediction objectives. Across four cohorts and seven clinical tasks, it reports the strongest subject-disjoint linear-probing performance, with a 4.1-point average PR-AUC gain over the best CGM-specific baseline, plus strong cross-dataset transfer and few-shot results.

Significance. If the reported gains hold under controlled ablations, the explicit state/event decomposition would supply a physiologically motivated inductive bias that improves transferable representations for metabolic prediction tasks, particularly diabetes-risk and insulin-resistance outcomes. The scale of pretraining data, subject-disjoint evaluation, and multi-cohort testing are strengths that would support broader adoption of physiology-aware CGM models if the architectural contribution is isolated.
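
Subject-disjoint evaluation means that no subject contributes days to both the training and test sides of a probe. A minimal sketch of such a split follows; the function name, the 20% default test fraction, and the seeding scheme are illustrative assumptions rather than the paper's protocol.

```python
import random
from typing import List, Tuple


def subject_disjoint_split(
    day_subjects: List[str], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split day-level sample indices so train and test share no subject."""
    subjects = sorted(set(day_subjects))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    # Assign every day of a subject to the same side of the split.
    train_idx = [i for i, s in enumerate(day_subjects) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(day_subjects) if s in test_subjects]
    return train_idx, test_idx
```

Splitting by subject rather than by day prevents a probe from scoring well simply by recognizing an individual it has already seen.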

major comments (2)
  1. [Abstract / architecture section] Abstract and architecture description: the central claim attributes the 4.1-point PR-AUC improvement and superior transfer to the explicit decomposition into slow physiological state and transient event streams (with matching pretraining objectives). However, the reported comparisons are only against external baselines; no internal single-stream control with identical grid alignment, masks, and losses is described, leaving open whether gains arise from the split itself or from data scale and other design choices.
  2. [Evaluation / results] Evaluation section: performance numbers are reported without error bars, ablation details on hyperparameter sensitivity, or verification that gains survive alternative subject-disjoint splits; this weakens confidence that the dual-stream advantage is robust rather than tied to specific choices.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'physiology-aware decomposition as an effective inductive bias' is asserted but would benefit from a brief forward reference to the specific pretraining objectives that enforce the state/event separation.
  2. [Methods] Notation: the distinction between 'fused daily representations' and the separate state/event streams should be clarified with a short diagram or equation in the methods to avoid ambiguity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight opportunities to strengthen the isolation of the dual-stream contribution and the robustness of the reported results. We agree with both major points and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract / architecture section] Abstract and architecture description: the central claim attributes the 4.1-point PR-AUC improvement and superior transfer to the explicit decomposition into slow physiological state and transient event streams (with matching pretraining objectives). However, the reported comparisons are only against external baselines; no internal single-stream control with identical grid alignment, masks, and losses is described, leaving open whether gains arise from the split itself or from data scale and other design choices.

    Authors: We acknowledge that an internal single-stream control is necessary to isolate the contribution of the state-event decomposition. In the revised manuscript we will add results from a single-stream variant that retains identical 24-hour grid alignment, observation masks, pretraining objectives, and model capacity but omits the explicit state/event split. This ablation will directly test whether the performance advantage is attributable to the decomposition itself.
    Revision: yes

  2. Referee: [Evaluation / results] Evaluation section: performance numbers are reported without error bars, ablation details on hyperparameter sensitivity, or verification that gains survive alternative subject-disjoint splits; this weakens confidence that the dual-stream advantage is robust rather than tied to specific choices.

    Authors: We agree that error bars, hyperparameter sensitivity, and split robustness checks would increase confidence in the results. The revision will include standard error bars computed across multiple random seeds for all linear-probing experiments, a brief hyperparameter sensitivity analysis for the state-event weighting and pretraining loss coefficients, and evaluation on at least one additional subject-disjoint split to confirm consistency of the reported gains.
    Revision: yes
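
The error bars promised here are simple to compute: the mean of a metric across seeds, plus the sample standard deviation divided by the square root of the number of seeds. The sketch below is generic; the function name is hypothetical and any numbers passed to it would be the reader's own, not results from the paper.

```python
import math
import statistics
from typing import List, Tuple


def mean_and_standard_error(values: List[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) over repeated runs."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(values)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem
```

With only a handful of seeds the standard error is itself noisy, so a gain should be judged against it with some caution.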

Circularity Check

0 steps flagged

No circularity in derivation or evaluation chain

full rationale

The paper defines an architecture (dual-stream decomposition aligned to 24-hour grid) and two pretraining objectives on unlabeled CGM data, then reports empirical linear-probing results on subject-disjoint external cohorts. No equation or claim reduces a downstream metric to a fitted parameter by construction, no self-citation is invoked as a uniqueness theorem or load-bearing premise, and the evaluation protocol is independent of the pretraining fit. The central inductive-bias claim is supported only by comparative performance numbers, not by definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available, so the ledger is limited to the modeling choice explicitly highlighted; no free parameters or invented entities are quantifiable from the given text.

axioms (1)
  • domain assumption: Decomposing glucose dynamics into slow physiological state and transient event streams captures distinct and useful structure for representation learning.
    This premise is invoked to justify the architecture and the two pretraining objectives.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. RubricsTree: Scalable and Evolving Open-Ended Evaluation of Personal Health Agents across Health Memory and Medical Skills

    cs.CL · 2026-06 · unverdicted · novelty 5.0

    RubricsTree is a scalable, evolving rubric-based evaluation system with adaptive routing that improves physician alignment for health AI responses and enables performance gains when used as feedback or rewards.


    The masked-context predictor is a lightweight 1-layer Transformer, and the temporal dynamics objective uses two lightweight transition heads. During downstream evaluation, only the frozen online branch is retained. GlucoFM has 0.72M trainable parameters and 1.18M total parameters during pretraining, mainly due to the additional EMA target branch. C.5. Pretra...