One Loss to Rule Them All: Marked Time-to-Event for Structured EHR Foundation Models

Aparajita Kashyap; Chao Pang; Shalmali Joshi; Simon A. Lee; Vincent Jeanselme; Xinzhuo Jiang; Yanwei Li; Yuta Kobayashi; Zilin Jing

arxiv: 2602.00541 · v2 · pith:STFRLY5Qnew · submitted 2026-01-31 · 💻 cs.LG

Zilin Jing, Vincent Jeanselme, Yuta Kobayashi, Simon A. Lee, Chao Pang, Aparajita Kashyap, Yanwei Li, Xinzhuo Jiang, Shalmali Joshi

keywords: events, downstream, event, measurements, models, objective, prediction, pretraining

Clinical events captured in Electronic Health Records (EHR) are irregularly sampled and may consist of a mixture of discrete events and numerical measurements, such as laboratory values or treatment dosages. The sequential nature of EHR, analogous to natural language, has motivated the use of next-token prediction to train prior EHR Foundation Models (FMs) over events. However, this training fails to capture the full structure of EHR. Not only must the timing of each event be captured, but the event value (for example, an abnormal lab result) also modulates the likelihood of other clinical events. Most existing EHR FMs do not jointly model this likelihood and are unable to capture the full observation process, which impairs downstream capabilities. We propose ORA, a marked time-to-event pretraining objective that jointly models event timing and associated measurements. Across multiple datasets, downstream tasks, and model backbones, this objective consistently yields more generalizable representations than next-token prediction and pretraining losses that ignore continuous measurements. Importantly, the proposed objective yields improvements beyond traditional classification evaluation, including better regression and time-to-event prediction. Beyond introducing a new family of FMs, our ablations suggest a broader takeaway: pretraining objectives that account for EHR structure are critical for expanding downstream capabilities and generalizability.
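
To make the idea of a marked time-to-event objective concrete, here is a minimal sketch of a negative log-likelihood that scores the timing, the type, and the measured value of a single next event together. It is not the ORA loss: the abstract does not give its parametric form, so the exponential waiting time, the softmax over event types, and the Gaussian over values are all assumptions chosen for illustration, and the function name `marked_tte_nll` and its parameters are hypothetical.

```python
import math


def marked_tte_nll(rate, type_logits, value_mean, value_std,
                   dt, event_type, value=None):
    """Toy NLL of one observed (dt, event_type, value) triple.

    Assumed (illustrative) model, not the paper's:
      dt         ~ Exponential(rate)
      event_type ~ Categorical(softmax(type_logits))
      value      ~ Normal(value_mean, value_std), only if a value is recorded
    """
    # Timing term: -log(rate * exp(-rate * dt))
    nll_time = rate * dt - math.log(rate)

    # Event-type term: negative log-softmax, computed stably
    m = max(type_logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in type_logits))
    nll_type = log_z - type_logits[event_type]

    # Measurement term: Gaussian NLL, skipped for events without a value
    nll_value = 0.0
    if value is not None:
        z = (value - value_mean) / value_std
        nll_value = 0.5 * z * z + math.log(value_std) + 0.5 * math.log(2 * math.pi)

    # Joint objective: the three terms are summed, so timing, type, and
    # value are trained together rather than the type alone.
    return nll_time + nll_type + nll_value


# Hypothetical example: a lab event (type index 1) observed 2.5 time units
# after the previous event, with a standardized value of 1.8.
loss = marked_tte_nll(rate=0.4, type_logits=[0.2, 1.1, -0.3],
                      value_mean=0.0, value_std=1.0,
                      dt=2.5, event_type=1, value=1.8)
```

In this sketch, dropping the timing and measurement terms leaves only the event-type term, which corresponds to a plain next-token cross-entropy over event codes.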

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

GlucoFM: A Dual-Stream Foundation Model for Continuous Glucose Monitoring
cs.LG 2026-05 unverdicted novelty 7.0

GlucoFM decomposes CGM traces into dual state-event streams, pretrains on 109k hours of unlabeled data, and reports superior subject-disjoint performance on seven clinical tasks across four cohorts.

PORTER: Language-Grounded Event Representations for Portable Structured EHR Foundation Models
cs.CL 2026-06 unverdicted novelty 6.0

PORTER is a language-grounded EHR foundation model that uses text descriptions for events and a numeric pathway, matching fixed-vocabulary performance on 74 tasks while recovering 97.1% AUROC on unseen vocabularies an...

AURORA: Contextual Orthogonalization for Geometric Representation Learning in Healthcare Foundation Models
cs.LG 2026-05 unverdicted novelty 6.0

AURORA is a representation learning framework that uses contextual orthogonalization and relational alignment to create disentangled, geometrically interpretable latent spaces in healthcare foundation models.