pith. machine review for the scientific record.

arxiv: 2502.17403 · v5 · submitted 2025-02-24 · 💻 cs.LG · cs.AI · cs.CL

Recognition: unknown

Large Language Models are Powerful Electronic Health Record Encoders

Authors on Pith: no claims yet
classification 💻 cs.LG · cs.AI · cs.CL
keywords: data, models, embeddings, llm-based, tasks, access, clinical, electronic
abstract

Electronic Health Records (EHRs) offer considerable potential for clinical prediction, but their complexity and heterogeneity challenge traditional machine learning. Domain-specific EHR foundation models trained on unlabeled EHR data have shown improved predictive accuracy and generalization. However, their development is constrained by limited data access and site-specific vocabularies. We convert EHR data into plain text by replacing medical codes with natural-language descriptions, enabling general-purpose Large Language Models (LLMs) to produce high-dimensional embeddings for downstream prediction tasks without access to private medical training data. LLM-based embeddings perform on par with a specialized EHR foundation model, CLMBR-T-Base, across 15 clinical tasks from the EHRSHOT benchmark. In an external validation using the UK Biobank, an LLM-based model shows statistically significant improvements for some tasks, which we attribute to higher vocabulary coverage and slightly better generalization. Overall, we reveal a trade-off between the computational efficiency of specialized EHR models and the portability and data independence of LLM-based embeddings.
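The serialization step the abstract describes — replacing medical codes with natural-language descriptions before embedding — can be sketched as follows. The vocabulary mapping, event records, and field names here are hypothetical illustrations, not the paper's actual pipeline; the embedding call itself is left as a placeholder for any general-purpose LLM embedding API.

```python
# Sketch: convert coded EHR events into plain text suitable for LLM embedding.
# CODE_DESCRIPTIONS and the patient timeline below are illustrative only.

CODE_DESCRIPTIONS = {
    "ICD10/E11.9": "Type 2 diabetes mellitus without complications",
    "LOINC/4548-4": "Hemoglobin A1c measurement",
    "RxNorm/860975": "Metformin 500 mg oral tablet",
}

def serialize_patient(events):
    """Replace each medical code with its natural-language description
    and join the events into one chronological text document."""
    lines = []
    for timestamp, code in events:
        description = CODE_DESCRIPTIONS.get(code, code)  # fall back to the raw code
        lines.append(f"{timestamp}: {description}")
    return "\n".join(lines)

patient = [
    ("2021-03-01", "ICD10/E11.9"),
    ("2021-03-15", "LOINC/4548-4"),
    ("2021-04-02", "RxNorm/860975"),
]

text = serialize_patient(patient)
# `text` can now be passed to any general-purpose LLM embedding endpoint
# to obtain a fixed-length patient vector for downstream prediction tasks.
print(text)
```

Because the serialized record is plain text, no site-specific vocabulary or private training data is needed on the embedding side — which is the portability trade-off the abstract highlights.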

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning

    cs.LG 2026-04 unverdicted novelty 7.0

    Multimodal contrastive learning using multilinear products is fragile to single bad modalities, and a gated version improves top-1 retrieval accuracy on synthetic and real trimodal data.

  2. Representation Before Training: A Fixed-Budget Benchmark for Generative Medical Event Models

    cs.LG 2026-04 unverdicted novelty 5.0

    Fused code-value tokenization improves mortality AUROC from 0.891 to 0.915 and other clinical outcome predictions, while certain temporal encodings like event order match or exceed time tokens with shorter sequences.