Learning Normal Representations for Blood Biomarkers

Aashna P. Shah; Arjun K. Manrai; Ben Y. Reis; Chirag J. Patel; James A. Diao; Liat F. Antwarg; Michelle M. Li; Morgan Sanchez; Noa Dagan; Ran D. Balicer

arxiv: 2605.18701 · v1 · pith:WTZHVCIInew · submitted 2026-05-18 · 💻 cs.LG · q-bio.QM

Aashna P. Shah, Michelle M. Li, Yash Lal, Seffi Cohen, Liat F. Antwarg, Morgan Sanchez, James A. Diao, Chirag J. Patel, Ben Y. Reis, Ran D. Balicer, Noa Dagan, Arjun K. Manrai

Pith reviewed 2026-05-20 12:49 UTC · model grok-4.3

classification 💻 cs.LG q-bio.QM

keywords blood biomarkers · reference intervals · personalized medicine · transformer model · longitudinal laboratory data · clinical outcomes · machine learning · population priors

The pith

A conditional transformer model improves blood biomarker reference intervals by blending individual patient history with population-level normal variation, leading to better prediction of clinical outcomes than purely personalized or fixed population intervals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Blood tests use reference intervals to flag abnormal results, but standard population ranges ignore personal baselines while recent personalization efforts using only past tests can overfit and incorrectly label many results as abnormal. Analysis of nearly two billion measurements shows these personalized intervals classify up to 68 percent as abnormal yet lack ties to real outcomes such as mortality or acute kidney injury. The paper presents NORMA, a framework that conditions interval generation on both the patient's own history and broader population patterns of normal values. This hybrid method delivers reference intervals with higher precision for forecasting adverse events. The results indicate that effective interpretation requires anchoring personal data to stable population information rather than relying on either extreme.

Core claim

Laboratory values exhibit substantial individual variation, yet purely personalized reference intervals routinely overfit to sparse data, classifying up to 68% of measurements as abnormal without corresponding associations with adverse clinical outcomes. NORMA addresses this by using a conditional transformer to generate reference intervals conditioned on both a patient's testing history and population-level data about normal variation, resulting in intervals that achieve higher precision for predicting outcomes including mortality, acute kidney injury, and chronic disease. These findings suggest that population-level priors enhance individual trajectory analysis and outperform either approach alone.
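To make the overfitting mechanism concrete, here is a small simulation sketch. It is not the paper's analysis and does not attempt to reproduce the 68% figure. It assumes a simple mean ± 2 SD personalized interval and made-up Gaussian setpoints and noise (the function names and all numbers are hypothetical), and asks how often a healthy follow-up value from the same patient falls outside an interval built from only a few prior results.

```python
import random
import statistics


def personalized_interval(history, k=2.0):
    """Return (low, high) = mean ± k * sample SD of a patient's prior values."""
    mean = statistics.fmean(history)
    sd = statistics.stdev(history)  # requires at least two prior values
    return mean - k * sd, mean + k * sd


def flag_rate(n_history, n_patients=5000, seed=0):
    """Share of healthy follow-up values that fall outside the patient's own interval."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(n_patients):
        # Hypothetical numbers: setpoints vary across patients (SD 10),
        # repeat measurements vary around each setpoint (SD 5).
        setpoint = rng.gauss(100.0, 10.0)
        history = [rng.gauss(setpoint, 5.0) for _ in range(n_history)]
        low, high = personalized_interval(history)
        follow_up = rng.gauss(setpoint, 5.0)  # drawn from the same healthy state
        if not low <= follow_up <= high:
            flagged += 1
    return flagged / n_patients


if __name__ == "__main__":
    for n in (2, 5, 30):
        print(n, round(flag_rate(n), 3))
```

Under these assumptions, intervals built from two prior values flag far more healthy follow-ups than intervals built from thirty, because the standard deviation estimated from a sparse history is itself very noisy.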

What carries the argument

NORMA, a conditional transformer-based framework that produces reference intervals by conditioning on both patient history and population-level normal variation.
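NORMA itself is a transformer and its internals are not described here, so the following is only an analogy: a classical normal-normal shrinkage estimate, swapped in to illustrate what anchoring an individual history to a population prior can mean. The function name, the Gaussian assumptions, and the parameters `pop_mean`, `pop_sd`, and `within_sd` are all hypothetical.

```python
import math
import statistics


def anchored_interval(history, pop_mean, pop_sd, within_sd, z=1.96):
    """Predictive interval for the next value under a normal-normal model.

    The patient's setpoint has prior N(pop_mean, pop_sd^2); each measurement
    is the setpoint plus N(0, within_sd^2) noise. This is a textbook
    shrinkage model, not the NORMA architecture.
    """
    n = len(history)
    prior_precision = 1.0 / pop_sd ** 2
    data_precision = n / within_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = statistics.fmean(history) if n else 0.0
    # Precision-weighted blend of population mean and patient mean.
    post_mean = post_var * (prior_precision * pop_mean + data_precision * sample_mean)
    # Uncertainty about the setpoint plus measurement noise for the next value.
    pred_sd = math.sqrt(post_var + within_sd ** 2)
    return post_mean - z * pred_sd, post_mean + z * pred_sd
```

With no history the interval is centered on the population mean; as results accumulate, the center moves toward the patient's own average and the width approaches the within-patient spread. A sparse history therefore cannot pull the interval far from the population anchor.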

If this is right

Personalized intervals without population conditioning lead to inflated abnormal classifications lacking clinical relevance.
Hybrid conditioning improves precision in outcome prediction for mortality, acute kidney injury, and chronic disease.
Laboratory medicine should moderate the use of purely individual reference intervals.
Anchoring individual data to population priors provides superior performance compared to standalone methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Applying similar hybrid conditioning could benefit interpretation of other longitudinal clinical measurements such as imaging or vital signs.
Expanding the approach to include additional patient context like age or comorbidities might further refine interval accuracy.
Deployment in clinical systems could help decrease unnecessary follow-up testing triggered by over-flagged results.
The multi-regional scope of the data suggests potential for more generalizable normality definitions across populations.

Load-bearing premise

The multi-regional dataset of laboratory measurements largely captures stable normal biological variation rather than including substantial unrecognized or subclinical disease that contaminates the population priors.
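One way to probe this premise is the kind of sensitivity analysis proposed in the simulated rebuttal further down: rebuild the population prior from a restricted cohort and see whether downstream results move. The sketch below shows only the cohort filter. The record layout `(patient_id, timestamp, value)`, the function name, and the `diagnosed_patients` set are assumptions for illustration, not the paper's data model.

```python
from collections import defaultdict


def clean_prior_cohort(measurements, diagnosed_patients, max_per_patient=2):
    """Filter measurements for a contamination sensitivity check.

    measurements: iterable of (patient_id, timestamp, value) tuples.
    diagnosed_patients: set of patient ids with any relevant recorded diagnosis.
    Keeps only undiagnosed patients and, for each, their earliest
    `max_per_patient` measurements.
    """
    by_patient = defaultdict(list)
    for patient_id, timestamp, value in measurements:
        if patient_id not in diagnosed_patients:
            by_patient[patient_id].append((timestamp, value))

    kept = []
    for patient_id, rows in by_patient.items():
        rows.sort()  # earliest measurements first
        kept.extend((patient_id, ts, v) for ts, v in rows[:max_per_patient])
    return kept
```

A filter like this cannot remove disease that was never recorded, so it bounds the contamination problem rather than eliminating it.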

What would settle it

A prospective validation study that applies NORMA intervals, personalized intervals, and population intervals to new patients and tracks which method most accurately associates flagged abnormalities with subsequent clinical events while minimizing unnecessary alerts.
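The comparison described above comes down to two numbers per method: how often a flag is followed by an event, and how many results get flagged at all. A minimal sketch of both, assuming binary flags and binary event indicators aligned by index (the function names are hypothetical and the paper's exact metric definitions may differ):

```python
def flag_precision(flags, events):
    """Fraction of flagged results that were followed by a clinical event."""
    outcomes_of_flagged = [event for flag, event in zip(flags, events) if flag]
    if not outcomes_of_flagged:
        return float("nan")  # precision is undefined when nothing is flagged
    return sum(outcomes_of_flagged) / len(outcomes_of_flagged)


def alert_rate(flags):
    """Fraction of all results that were flagged as abnormal."""
    return sum(flags) / len(flags)
```

Reporting the two together matters: a method can raise precision simply by flagging almost nothing, and a method that flags most results will have low precision regardless of how its intervals were built.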

Original abstract

Blood-based biomarkers underpin clinical diagnosis and management, yet their interpretation relies largely on fixed population reference intervals that ignore stable, intra-patient variability. As such, population-based interpretation can mask meaningful deviation from an individual's baseline, risking delayed disease detection. To remedy this, there have been increasing efforts to personalize blood biomarker interpretation using individual testing histories. However, these methods may overfit to sparse data, inflating false-positive rates and unnecessary follow-up, and can also unwittingly include unrecognized or subclinical disease. Here, we leverage nearly 2 billion longitudinal laboratory measurements from over 1.6 million individuals across North America, the Middle East, and East Asia, to show that while laboratory values are highly individual, purely personalized intervals routinely overfit, classifying up to 68% of measurements as abnormal, without corresponding associations with adverse clinical outcomes. We then introduce NORMA, a conditional transformer-based framework that generates reference intervals by conditioning on both a patient's history and population-level data about "normal" variation. NORMA-derived intervals achieve higher precision for predicting outcomes, including mortality, acute kidney injury, and chronic disease. These findings caution against over-personalization in laboratory medicine and demonstrate that anchoring individual trajectories to population-level priors outperforms either approach alone. To promote transparency, we publicly release the model, code, and an interactive user interface for accessible, individualized laboratory interpretation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague.

The paper uses a massive multi-region dataset to show that purely personal lab reference intervals overflag abnormalities without outcome links, while their hybrid NORMA model blending history and population priors improves prediction of mortality and kidney injury.

The main thing to know is that this work leverages nearly 2 billion lab measurements from over 1.6 million people to demonstrate a practical limit to personalization in blood biomarker interpretation. Purely individual intervals flag up to 68% of results as abnormal with no corresponding association with adverse outcomes such as death or acute kidney injury. Their NORMA conditional transformer, which anchors personal trajectories to population-level normal variation, produces intervals that predict mortality, acute kidney injury, and chronic disease with higher precision. Releasing the model, code, and an interactive interface adds real value for anyone wanting to test or extend the approach. The scale across North America, the Middle East, and East Asia is a genuine strength that prior personalized reference work rarely matches. The caution against over-personalization is grounded in empirical observation rather than theory alone.

The soft spots center on validation and assumptions. The abstract leaves architecture, training details, and exact statistical comparisons thin, which makes it hard to judge robustness without the full methods. More importantly, the population priors used for conditioning could embed subclinical disease from the large cohort, which would blunt the claimed advantage of the hybrid setup over either extreme. The paper flags overfitting risks for pure personalization but needs clearer checks that the population component stays clean.

This is for readers in clinical machine learning, laboratory medicine, or health data modeling who care about reference intervals and conditional generation. Anyone working on personalized diagnostics or outcome prediction from longitudinal labs would get concrete takeaways. It deserves serious peer review given the dataset size, the public release, and the direct clinical stakes, even if revisions will likely focus on method transparency and prior contamination tests.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces NORMA, a conditional transformer framework that generates blood biomarker reference intervals by conditioning on both individual patient testing histories and population-level priors learned from nearly 2 billion longitudinal measurements across 1.6 million individuals in multiple regions. It shows that purely personalized intervals overfit (classifying up to 68% of values as abnormal without corresponding adverse outcomes) and claims that NORMA-derived intervals yield higher precision for predicting mortality, acute kidney injury, and chronic disease, outperforming either pure personalization or fixed population intervals alone. The work publicly releases the model, code, and an interactive UI.

Significance. If the central results hold after addressing validation gaps, the paper would have substantial clinical significance by providing evidence-based guidance against over-personalization in laboratory medicine and demonstrating the value of hybrid conditioning on stable population priors. The scale of the dataset and the public release of code and UI are clear strengths that support reproducibility and potential adoption.

major comments (2)

[Methods (population prior construction)] The manuscript provides no explicit validation or sensitivity analysis showing that the learned population-level priors are uncontaminated by subclinical or unrecognized disease; this assumption is load-bearing for the claim that NORMA's hybrid conditioning outperforms personalization by avoiding the overfitting and contamination issues acknowledged for individual histories.
[Results (outcome prediction experiments)] Outcome prediction results lack reported details on the exact statistical tests, baseline comparisons (e.g., standard reference intervals or simple history-based models), cross-validation strategy, and effect sizes for the claimed precision gains on mortality, AKI, and chronic disease endpoints.

minor comments (2)

[Abstract] The abstract states headline precision improvements without defining the precise metrics (e.g., AUC, precision-recall, or calibration) or providing quantitative tables comparing NORMA to the two baselines.
[Model description] Notation for the conditional transformer inputs (history embedding vs. population prior embedding) is introduced without a clear diagram or equation showing the fusion mechanism.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript. The comments highlight important areas for improving methodological transparency and rigor, and we address each point below with plans for revision.

Referee: [Methods (population prior construction)] The manuscript provides no explicit validation or sensitivity analysis showing that the learned population-level priors are uncontaminated by subclinical or unrecognized disease; this assumption is load-bearing for the claim that NORMA's hybrid conditioning outperforms personalization by avoiding the overfitting and contamination issues acknowledged for individual histories.

Authors: We agree this is a substantive concern and that the population prior's robustness to subclinical disease is central to interpreting the hybrid conditioning advantage. While the dataset's scale and geographic diversity provide some inherent protection against systematic contamination, we acknowledge the need for explicit checks. In the revised manuscript we will add a sensitivity analysis subsection in Methods that retrains the population prior after (a) excluding patients with any recorded ICD codes for relevant conditions and (b) restricting to the first two measurements per individual. We will report the resulting changes in downstream precision metrics and discuss residual limitations. revision: yes
Referee: [Results (outcome prediction experiments)] Outcome prediction results lack reported details on the exact statistical tests, baseline comparisons (e.g., standard reference intervals or simple history-based models), cross-validation strategy, and effect sizes for the claimed precision gains on mortality, AKI, and chronic disease endpoints.

Authors: We thank the referee for noting these omissions, which reduce reproducibility. In the revised Results section we will explicitly state: the statistical tests (log-rank for time-to-event, logistic regression with Wald tests and 95% CIs), all baseline comparators (fixed population intervals, per-patient mean±2SD, and a simple autoregressive history model), the cross-validation procedure (patient-stratified 5-fold CV with temporal hold-out), and effect sizes (precision-recall AUC deltas and hazard ratios with confidence intervals). These additions will be accompanied by updated tables and supplementary figures. revision: yes
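The rebuttal names a patient-stratified 5-fold cross-validation with a temporal hold-out but does not spell it out. The sketch below is one plausible reading, not the authors' procedure: each patient is assigned to exactly one fold so that no patient appears on both sides of a split, and test data is further restricted to measurements at or after a cutoff time. The function names, the record layout `(patient_id, timestamp, value)`, and the hash-based fold assignment are assumptions.

```python
import hashlib


def patient_fold(patient_id, n_folds=5):
    """Deterministically map a patient id to a fold index in [0, n_folds)."""
    digest = hashlib.sha256(str(patient_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_folds


def temporal_patient_split(records, test_fold, cutoff, n_folds=5):
    """Split records into train and test lists.

    Train: patients outside `test_fold`, measurements before `cutoff`.
    Test: patients in `test_fold`, measurements at or after `cutoff`.
    Records matching neither rule are dropped to avoid leakage.
    """
    train, test = [], []
    for patient_id, timestamp, value in records:
        in_test_fold = patient_fold(patient_id, n_folds) == test_fold
        if in_test_fold and timestamp >= cutoff:
            test.append((patient_id, timestamp, value))
        elif not in_test_fold and timestamp < cutoff:
            train.append((patient_id, timestamp, value))
    return train, test
```

Hashing the patient id keeps the assignment stable across runs and machines. Dropping the records that match neither rule trades some data for a cleaner separation in both patient identity and time.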

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained and empirical

The paper introduces NORMA as a conditional transformer framework that generates reference intervals by conditioning on both individual history and population-level priors derived from nearly 2 billion measurements. Claims of superior precision for outcome prediction (mortality, AKI, chronic disease) rest on empirical comparisons showing that purely personalized intervals overfit (classifying up to 68% as abnormal without outcome associations) while the hybrid approach outperforms. No equations, self-citations, or steps reduce outputs by construction to fitted inputs or prior definitions; the model is presented as data-driven with public release of code and interface for external verification. The derivation chain is independent of the target results and does not invoke uniqueness theorems or ansatzes from self-citations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the dataset representing clean normal variation and on the transformer successfully learning useful population priors; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption: The multi-regional dataset of nearly 2 billion measurements primarily reflects stable normal biological variation without substantial unrecognized or subclinical disease contamination.
This premise is required to justify using population-level data as reliable priors for conditioning the reference intervals.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem.

NORMA is a conditional, decoder-only transformer that models the distribution of a patient’s next laboratory value given their longitudinal measurement history and a query specifying a future health state... conditioning on a future normal laboratory state.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem.

anchoring individual trajectories to population-level priors outperforms either approach alone

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages

[1]

Towards better test utilization - strategies to improve physician ordering and their impact on patient outcomes.EJIFCC, 26(1):15–30, January 2015

Danielle B Freedman. Towards better test utilization - strategies to improve physician ordering and their impact on patient outcomes.EJIFCC, 26(1):15–30, January 2015

work page 2015
[2]

Laboratory diagnosis of iron- deficiency anemia: an overview.J

G H Guyatt, A D Oxman, M Ali, A Willan, W McIlroy, and C Patterson. Laboratory diagnosis of iron- deficiency anemia: an overview.J. Gen. Intern. Med., 7(2):145–153, March 1992

work page 1992
[3]

Guidelines on the management of abnormal liver blood tests.Gut, 67:6–19, November 2017

P Newsome, R Cramb, S Davison, J Dillon, M Foulerton, E Godfrey, Richard Hall, Ulrike Harrower, M Hudson, A Langford, A Mackie, R Mitchell-Thain, K Sennett, N Sheron, J Verne, Martine Walmsley, and A Y eoman. Guidelines on the management of abnormal liver blood tests.Gut, 67:6–19, November 2017

work page 2017
[4]

Clinical practice

Silvio E Inzucchi. Clinical practice. diagnosis of diabetes.N. Engl. J. Med., 367(6):542–550, August 2012

work page 2012
[5]

Enhancing the clinical value of medical laboratory testing.Clin

Kenneth A Sikaris. Enhancing the clinical value of medical laboratory testing.Clin. Biochem. Rev., 38(3): 107–114, November 2017

work page 2017
[6]

Reference intervals: the way forward.Ann

Ferruccio Ceriotti, Rolf Hinzmann, and Mauro Panteghini. Reference intervals: the way forward.Ann. Clin. Biochem., 46(Pt 1):8–17, January 2009

work page 2009
[7]

normal ranges

Richard C Friedberg, Rhona Souers, Elizabeth A Wagar, Ana K Stankovic, Paul N Valenstein, and College of American Pathologists. The origin of reference intervals: A college of american pathologists Q-probes study of “normal ranges” used in 163 clinical laboratories.Arch. Pathol. Lab. Med., 131(3):348–357, March 2007

work page 2007
[8]

Overuse of diagnostic testing in healthcare: a systematic review.BMJ Qual

Joris L J M Müskens, Rudolf Bertijn Kool, Simone A van Dulmen, and Gert P Westert. Overuse of diagnostic testing in healthcare: a systematic review.BMJ Qual. Saf., 31(1):54–63, January 2022

work page 2022
[9]

Low-density lipoproteins cause atherosclerotic cardiovascular disease

Brian A Ference, Henry N Ginsberg, Ian Graham, Kausik K Ray, Chris J Packard, Eric Bruckert, Robert A Hegele, Ronald M Krauss, Frederick J Raal, Heribert Schunkert, Gerald F Watts, Jan Borén, Sergio Fazio, Jay D Horton, Luis Masana, Stephen J Nicholls, Børge G Nordestgaard, Bart van de Sluis, Marja-Riitta Taskinen, Lale Tokgözoglu, Ulf Landmesser, Ulrich ...

work page 2017
[10]

Metabolomic profiles predict individual multidisease outcomes.Nat

Thore Buergel, Jakob Steinfeldt, Greg Ruyoga, Maik Pietzner, Daniele Bizzarri, Dina Vojinovic, Julius Upmeier Zu Belzen, Lukas Loock, Paul Kittner, Lara Christmann, Noah Hollmann, Henrik Strangalies, Jana M Braunger, Benjamin Wild, Scott T Chiesa, Joachim Spranger, Fabian Klostermann, Erik B van den Akker, Stella Trompet, Simon P Mooijaart, Naveed Sattar,...

work page 2022
[11]

Liver enzyme alteration: a guide for clinicians

Edoardo G Giannini, Roberto Testa, and Vincenzo Savarino. Liver enzyme alteration: a guide for clinicians. CMAJ, 172(3):367–379, February 2005

work page 2005
[12]

Interpretation of the complete blood count.Pediatr

M C Walters and H T Abelson. Interpretation of the complete blood count.Pediatr. Clin. North Am., 43(3): 599–622, June 1996

work page 1996
[13]

Defining laboratory reference values and decision limits: populations, intervals, and interpretations.Asian J

James C Boyd. Defining laboratory reference values and decision limits: populations, intervals, and interpretations.Asian J. Androl., 12(1):83–90, January 2010

work page 2010
[14]

In the era of precision medicine and big data, who is normal?JAMA, 319(19):1981–1982, May 2018

Arjun K Manrai, Chirag J Patel, and John P A Ioannidis. In the era of precision medicine and big data, who is normal?JAMA, 319(19):1981–1982, May 2018

work page 1981
[15]

Monthly intra-individual variation in lipids over a 12 1-year period in 22 normal subjects.Clin

D J Nazir, R S Roberts, S A Hill, and M J McQueen. Monthly intra-individual variation in lipids over a 12 1-year period in 22 normal subjects.Clin. Biochem., 32(5):381–389, July 1999

work page 1999
[16]

Haematological setpoints are a stable and patient-specific deep phenotype.Nature, 637(8045):430–438, January 2025

Brody H Foy, Rachel Petherbridge, Maxwell T Roth, Cindy Zhang, Daniel C De Souza, Christopher Mow, Hasmukh R Patel, Chhaya H Patel, Samantha N Ho, Evie Lam, Camille E Powe, Robert P Hasserjian, Konrad J Karczewski, Veronica Tozzo, and John M Higgins. Haematological setpoints are a stable and patient-specific deep phenotype.Nature, 637(8045):430–438, January 2025

work page 2025
[17]

Annual biological variation and personalized reference intervals of clinical chemistry and hematology analytes.Clin

Shuo Wang, Min Zhao, Zihan Su, and Runqing Mu. Annual biological variation and personalized reference intervals of clinical chemistry and hematology analytes.Clin. Chem. Lab. Med., 60(4):606–617, March 2022

work page 2022
[18]

Personalized reference intervals - statistical approaches and considerations.Clin

Abdurrahman Coskun, Sverre Sandberg, Ibrahim Unsal, Fulya G Y avuz, Coskun Cavusoglu, Mustafa Serteser, Meltem Kilercik, and Aasne K Aarsand. Personalized reference intervals - statistical approaches and considerations.Clin. Chem. Lab. Med., 60(4):629–635, March 2022

work page 2022
[19]

Personalized reference intervals in laboratory medicine: A new model based on within-subject biological variation.Clin

Abdurrahman Co¸ skun, Sverre Sandberg, Ibrahim Unsal, Coskun Cavusoglu, Mustafa Serteser, Meltem Kilercik, and Aasne K Aarsand. Personalized reference intervals in laboratory medicine: A new model based on within-subject biological variation.Clin. Chem., 67(2):374–384, January 2021

work page 2021
[20]

Data mining approaches to reference interval studies.Clinical Chemistry, 67(9):1175–1181, 2021

A E Obstfeld, K Patel, J C Boyd, J Drees, D T Holmes, J P Ioannidis, and A K Manrai. Data mining approaches to reference interval studies.Clinical Chemistry, 67(9):1175–1181, 2021

work page 2021
[21]

Association of sickle cell trait with hemoglobin A1c in african americans.JAMA, 317(5):507–515, February 2017

Mary E Lacy, Gregory A Wellenius, Anne E Sumner, Adolfo Correa, Mercedes R Carnethon, Robert I Liem, James G Wilson, David B Sacks, David R Jacobs, Jr, April P Carson, Xi Luo, Annie Gjelsvik, Alexander P Reiner, Rakhi P Naik, Simin Liu, Solomon K Musani, Charles B Eaton, and Wen-Chih Wu. Association of sickle cell trait with hemoglobin A1c in african amer...

work page 2017
[22]

Guidelines for the management of high blood cholesterol

Kenneth R Feingold. Guidelines for the management of high blood cholesterol. InEndotext [Internet]. MDText. com, Inc., 2025

work page 2025
[23]

Evaluation of hemoglobin cutoff levels to define anemia among healthy individuals.JAMA Netw

O Y aw Addo, Emma X Yu, Anne M Williams, Melissa Fox Y oung, Andrea J Sharma, Zuguo Mei, Nicholas J Kassebaum, Maria Elena D Jefferds, and Parminder S Suchdev. Evaluation of hemoglobin cutoff levels to define anemia among healthy individuals.JAMA Netw. Open, 4(8):e2119123, August 2021

work page 2021
[24]

Why should women have lower reference limits for haemoglobin and ferritin concentrations than men?BMJ, 322(7298):1355–1357, June 2001

D H Rushton, R Dover, A W Sainsbury, M J Norris, J J Gilkes, and I D Ramsay. Why should women have lower reference limits for haemoglobin and ferritin concentrations than men?BMJ, 322(7298):1355–1357, June 2001

work page 2001
[25]

Implications of race adjustment in lung-function equations.N

James A Diao, Yixuan He, Rohan Khazanchi, Max Jordan Nguemeni Tiako, Jonathan I Witonsky, Emma Pierson, Pranav Rajpurkar, Jennifer R Elhawary, Luke Melas-Kyriazi, Albert Y en, Alicia R Martin, Sean Levy, Chirag J Patel, Maha Farhat, Luisa N Borrell, Michael H Cho, Edwin K Silverman, Esteban G Burchard, and Arjun K Manrai. Implications of race adjustment i...

work page 2083
[26]

Hidden in plain sight—reconsidering the use of race correction in clinical algorithms.New England Journal of Medicine, 383(9):874–882, 2020

Darshali A Vyas, Leo G Eisenstein, and David S Jones. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms.New England Journal of Medicine, 383(9):874–882, 2020

work page 2020
[27]

Disentangling proxies of demographic adjustments in clinical equations.arXiv [q-bio.QM], November 2025

Aashna P Shah, James A Diao, Emma Pierson, Chirag J Patel, and Arjun K Manrai. Disentangling proxies of demographic adjustments in clinical equations.arXiv [q-bio.QM], November 2025

work page 2025
[28]

Personalized statistical learning algorithms to improve the early detection of cancer using longitudinal biomarkers.Cancer Biomark., 33(2):199–210, 2022

Nabihah Tayob and Ziding Feng. Personalized statistical learning algorithms to improve the early detection of cancer using longitudinal biomarkers.Cancer Biomark., 33(2):199–210, 2022

work page 2022
[29]

The incidentalome: a threat to genomic medicine

Isaac S Kohane, Daniel R Masys, and Russ B Altman. The incidentalome: a threat to genomic medicine. JAMA, 296(2):212–215, July 2006

work page 2006
[30]

The frequency of unnecessary testing in hospitalized patients.Am

Christina Koch, Katherine Roberts, Christopher Petruccelli, and Daniel J Morgan. The frequency of unnecessary testing in hospitalized patients.Am. J. Med., 131(5):500–503, May 2018. 13

work page 2018
[31]

Blood tests - too much of a good thing.Scand

Henrik L Jørgensen and Bent S Lind. Blood tests - too much of a good thing.Scand. J. Prim. Health Care, 40(2):165–166, June 2022

work page 2022
[32]

More than half of abnormal results from laboratory tests ordered by family physicians could be false-positive.Can

Christopher Naugler and Irene Ma. More than half of abnormal results from laboratory tests ordered by family physicians could be false-positive.Can. Fam. Physician, 64(3):202–203, March 2018

work page 2018
[33]

Laboratory reference intervals - history and modern approaches for improved utility.Scand

Tony Badrick, Joe M El-Khoury, and Elvar Theodorsson. Laboratory reference intervals - history and modern approaches for improved utility.Scand. J. Clin. Lab. Invest., 85(4):229–241, June 2025

work page 2025
[34]

A comparison of methods to generate adaptive reference ranges in longitudinal monitoring

Davood Roshan, John Ferguson, Charles R Pedlar, Andrew Simpkin, William Wyns, Frank Sullivan, and John Newell. A comparison of methods to generate adaptive reference ranges in longitudinal monitoring. PLoS One, 16(2):e0247338, February 2021

work page 2021
[35]

Scalable and accurate deep learning with electronic health records.NPJ Digit

Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M Dai, Nissan Hajaj, Michaela Hardt, Peter J Liu, Xiaobing Liu, Jake Marcus, Mimi Sun, Patrik Sundberg, Hector Y ee, Kun Zhang, Yi Zhang, Gerardo Flores, Gavin E Duggan, Jamie Irvine, Quoc Le, Kurt Litsch, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L Volchenboum, ...

work page 2018
[36]

Event stream GPT: A data pre- processing and modeling library for generative, pre-trained transformers over continuous-time sequences of complex events.arXiv [cs.LG], June 2023

Matthew B A McDermott, Bret Nestor, Peniel Argaw, and Isaac Kohane. Event stream GPT: A data pre- processing and modeling library for generative, pre-trained transformers over continuous-time sequences of complex events.arXiv [cs.LG], June 2023

work page 2023
[37]

TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records.Nat

Zhichao Y ang, Avijit Mitra, Weisong Liu, Dan Berlowitz, and Hong Yu. TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records.Nat. Commun., 14(1):7857, November 2023

work page 2023
[38]

Health system-scale language models are all-purpose prediction engines.Nature, 619(7969):357–362, July 2023

Lavender Y ao Jiang, Xujin Chris Liu, Nima Pour Nejatian, Mustafa Nasir-Moin, Duo Wang, Anas Abidin, Kevin Eaton, Howard Antony Riina, Ilya Laufer, Paawan Punjabi, Madeline Miceli, Nora C Kim, Cordelia Orillac, Zane Schnurman, Christopher Livia, Hannah Weiss, David Kurland, Sean Neifert, Y osef Dasta- girzada, Douglas Kondziolka, Alexander T M Cheung, Gra...

work page 2023
[39]

Learning the natural history of human disease with generative transformers.Nature, 647(8088):248–256, November 2025

Artem Shmatko, Alexander Wolfgang Jung, Kumar Gaurav, Søren Brunak, Laust Hvas Mortensen, Ewan Birney, Tom Fitzgerald, and Moritz Gerstung. Learning the natural history of human disease with generative transformers.Nature, 647(8088):248–256, November 2025

work page 2025
[40]

Generative medical event models improve with scale.arXiv [cs.LG], November 2025

Shane Waxler, Paul Blazek, Davis White, Daniel Sneider, Kevin Chung, Mani Nagarathnam, Patrick Williams, Hank Voeller, Karen Wong, Matthew Swanhorst, Sheng Zhang, Naoto Usuyama, Cliff Wong, Tristan Naumann, Hoifung Poon, Andrew Loza, Daniella Meeker, Seth Hain, and Rahul Shah. Generative medical event models improve with scale.arXiv [cs.LG], November 2025

work page 2025
[41]

Zero shot health trajectory prediction using transformer.NPJ Digit

Pawel Renc, Yugang Jia, Anthony E Samir, Jaroslaw Was, Quanzheng Li, David W Bates, and Arkadiusz Sitek. Zero shot health trajectory prediction using transformer.NPJ Digit. Med., 7(1):256, September 2024

work page 2024
[42]

A multimodal and temporal foundation model for virtual patient representations at healthcare system scale.arXiv [cs.LG], April 2026

Andrew Zhang, Tong Ding, Sophia J Wagner, Caiwei Tian, Ming Y Lu, Rowland Pettit, Joshua E Lewis, Alexandre Misrahi, Dandan Mo, Long Phi Le, and Faisal Mahmood. A multimodal and temporal foundation model for virtual patient representations at healthcare system scale.arXiv [cs.LG], April 2026

work page 2026
[43]

A foundation model for continuous glucose monitoring data.Nature, 650 14 (8103):978–986, February 2026

Guy Lutsker, Gal Sapir, Smadar Shilo, Jordi Merino, Anastasia Godneva, Jerry R Greenfield, Dorit Samocha-Bonet, Raja Dhir, Francisco Gude, Shie Mannor, Eli Meirom, Eric P Xing, Gal Chechik, Hagai Rossman, and Eran Segal. A foundation model for continuous glucose monitoring data.Nature, 650 14 (8103):978–986, February 2026

work page 2026
[44]

Insulin resistance prediction from wearables and routine blood biomarkers.Nature, March 2026

Ahmed A Metwally, A Ali Heydari, Daniel McDuff, Alexandru Solot, Zeinab Esmaeilpour, Anthony Z Faranesh, Menglian Zhou, Girish Narayanswamy, Maxwell A Xu, Xin Liu, Yuzhe Y ang, David B Savage, Mark Malhotra, Conor Heneghan, Shwetak Patel, Cathy Speed, and Javier L Prieto. Insulin resistance prediction from wearables and routine blood biomarkers.Nature, March 2026

[45] Valentyn Melnychuk, Dennis Frauen, and Stefan Feuerriegel. Causal transformer for estimating counterfactual outcomes. arXiv [cs.LG], April 2022

[46] Michelle M Li, Kevin Li, Yasha Ektefaie, Ying Jin, Yepeng Huang, Shvat Messica, Tianxi Cai, and Marinka Zitnik. Controllable sequence editing for biological and clinical trajectories. arXiv [cs.LG], February 2025

[47] Shakson Isaac, Yentl Collin, and Chirag Patel. SSM-CGM: Interpretable state-space forecasting model of continuous glucose monitoring for personalized diabetes management. arXiv [cs.LG], October 2025

[48] Martin W McIntosh, Nicole Urban, and Beth Karlan. Generating longitudinal screening algorithms using novel biomarkers for disease. Cancer Epidemiol. Biomarkers Prev., 11(2):159–166, February 2002

[49] Ruth Johnson, Uri Gottlieb, Galit Shaham, Lihi Eisen, Jacob Waxman, Stav Devons-Sberro, Curtis R Ginder, Peter Hong, Raheel Sayeed, Xiaorui Su, Ben Y Reis, Ran D Balicer, Noa Dagan, and Marinka Zitnik. ClinVec: Unified embeddings of clinical codes enable knowledge-grounded AI in medicine. medRxiv, May 2025

[50] Ran D Balicer, Efrat Shadmi, Nicky Lieberman, Sari Greenberg-Dotan, Margalit Goldfracht, Liora Jana, Arnon D Cohen, Sigal Regev-Rosenberg, and Orit Jacobson. Reducing health disparities: strategy planning and implementation in Israel’s largest health care organization. Health Serv. Res., 46(4):1281–1299, August 2011

[51] Tom J Pollard, Alistair E W Johnson, Jesse D Raffa, Leo A Celi, Roger G Mark, and Omar Badawi. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci. Data, 5(1):180178, September 2018

[52] Leerang Lim, Hyeonhoon Lee, Chul-Woo Jung, Dayeon Sim, Xavier Borrat, Tom J Pollard, Leo A Celi, Roger G Mark, Simon T Vistisen, and Hyung-Chul Lee. INSPIRE, a publicly available research dataset for perioperative medicine. Sci. Data, 11(1):655, June 2024

[53] Clement J McDonald, Stanley M Huff, Jeffrey G Suico, Gilbert Hill, Dennis Leavelle, Raymond Aller, Arden Forrey, Kathy Mercer, Georges DeMoor, John Hook, Warren Williams, James Case, and Pat Maloney. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clin. Chem., 49(4):624–633, April 2003

[54] American Board of Internal Medicine. ABIM laboratory test reference ranges. Technical report, January 2025

[55] Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Brian Gow, Benjamin Moody, Steven Horng, Leo Anthony Celi, and Roger Mark. MIMIC-IV, October 2024

[56] Michael Wornow, Rahul Thapa, Ethan Steinberg, Jason A Fries, and Nigam H Shah. EHRSHOT: An EHR benchmark for few-shot evaluation of foundation models. arXiv [cs.LG], July 2023


