Temporally Phenotyping GLP-1RA Case Reports with Large Language Models: A Textual Time Series Corpus and Risk Modeling

Jeremy C. Weiss; Sayantan Kumar

arXiv: 2604.06197 · v1 · submitted 2026-03-12 · cs.CL · cs.AI

Pith reviewed 2026-05-15 11:28 UTC · model grok-4.3

keywords GLP-1 receptor agonists · large language models · temporal extraction · case reports · time-to-event analysis · type 2 diabetes · clinical phenotyping · risk modeling

The pith

Large language models can extract accurate timelines from narrative case reports to create reusable data for diabetes risk modeling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a corpus of 136 single-patient case reports on GLP-1 receptor agonists, with clinical events linked to their most probable times in the text. It tests LLM extraction against expert-annotated timelines and finds that the strongest model recovers most events (event coverage 0.871) while preserving their order across symptoms, diagnoses, treatments, labs, and outcomes. This structured output is then used for time-to-event analysis, which suggests a lower risk of respiratory sequelae among GLP-1RA users than among non-users (HR=0.259, p<0.05). The work converts free-text clinical stories into a format that supports longitudinal studies without repeated manual annotation.

Core claim

The central claim is that large language models can produce a textual time-series corpus from 136 PubMed case reports by associating clinical events with reference times, achieving high event coverage and reliable sequencing when measured against expert gold standards. This structured data in turn supports time-to-event modeling, which suggests a reduced risk of respiratory sequelae among GLP-1RA users.

What carries the argument

The textual time-series corpus of 136 temporally annotated case reports, generated by LLM extraction of events and their reference times, which turns narrative text into structured longitudinal data for phenotyping and analysis.
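To make the idea of a textual time series concrete, here is a minimal sketch of how text-ordered event-time tuples might be held and re-ordered by time. The `TimedEvent` class, the `to_timeline` helper, the choice of hours as the unit, and the sample events are all assumptions made for illustration; the paper's actual schema is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedEvent:
    event: str         # free-text clinical event as written in the report
    time_hours: float  # assumed unit: hours relative to a reference time (negative = before)


def to_timeline(pairs):
    """Turn text-ordered (event, time) tuples into a time-ordered timeline."""
    events = [TimedEvent(e, float(t)) for e, t in pairs]
    # sorted() is stable, so events that share a time keep their text order
    return sorted(events, key=lambda ev: ev.time_hours)


# Hypothetical tuples for illustration only (not taken from the corpus)
pairs = [
    ("admitted with nausea", 0),
    ("type 2 diabetes diagnosed", -87600),
    ("semaglutide started", -720),
    ("lipase elevated", 0),
    ("discharged", 96),
]
timeline = to_timeline(pairs)
for ev in timeline:
    print(f"{ev.time_hours:>10.0f} h  {ev.event}")
```

Keeping the text order as a tie-breaker is one possible design choice: narrative reports often list several findings at the same reference time, and a stable sort avoids inventing an order among them.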

If this is right

Case-report timelines become reusable for multiple analyses without re-annotating the original text.
Time-to-event methods applied to the corpus can identify associations such as lower respiratory risk in GLP-1 users.
LLM extraction scales phenotyping to symptoms, diagnoses, treatments, laboratory tests, and outcomes across many reports.
The approach offers a path to convert other narrative clinical descriptions into time-series formats for modeling.
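As a rough illustration of what time-to-event analysis on such timelines involves, the sketch below computes a Kaplan-Meier event-free survival curve in plain Python. This is a simpler, unadjusted stand-in and not the paper's method: the Figure 5 caption describes age/sex-adjusted Cox proportional hazards models. The function name `kaplan_meier` and the toy durations are assumptions for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time where at least one event occurred.

    durations: time from exposure to event or to end of follow-up
    observed:  True if the outcome occurred, False if the timeline was censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if events:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


# Toy data for illustration only (days to a hypothetical respiratory outcome)
durations = [5, 8, 8, 12, 20]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"t={t:>3}  S(t)={s:.2f}")
```

Censoring matters here because a case report usually ends at discharge or last follow-up; treating the end of the narrative as an event would bias the curve downward.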

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Larger collections of reports processed the same way could surface rarer adverse events through aggregated timelines.
Combining the extracted timelines with structured electronic health records might strengthen overall risk estimates.
The same extraction pipeline could be tested on case reports for other drug classes or medical conditions.
Ongoing application to newly published reports might support near-real-time surveillance of treatment outcomes.

Load-bearing premise

Expert-annotated timelines serve as an accurate and unbiased gold standard, and the 136 selected case reports are representative enough for the risk estimates to generalize beyond the sample.

What would settle it

An independent collection of case reports where new expert annotations show substantially lower agreement with the LLM timelines, or a larger cohort study where the reported hazard ratio for respiratory sequelae is no longer observed.

Figures

Figures reproduced from arXiv: 2604.06197 by Jeremy C. Weiss, Sayantan Kumar.

**Figure 1.** Left: Example case report (top) with text-ordered event-time tuples (bottom). Clinical events and temporal cues are marked in green and underline respectively. Right: Overview of our pipeline. Left panel: filtering the PMOA corpus to identify case reports of patients administered GLP1-RA medications. Middle panel: textual time series generation for each case report via LLM prompting and the creation of stru…

**Figure 2.** (a) Distribution of time series lengths (timesteps) across the dataset. (b) Most frequently occurring events across all case reports. Time-to-onset survival modeling: To demonstrate downstream clinical utility of GLP-1RA textual time series, we performed time-to-onset analyses for kidney, cardiovascular, and respiratory outcomes, using group definitions designed to examine the association between GLP-1RA expos…

**Figure 3.** Frequency and prevalence patterns of UMLS-normalized diagnoses in PMOA-TTS. (a) Top 20 diagnoses by frequency, reported using canonical UMLS names. (b) Prevalence of broad disease categories in PMOA-TTS compared with published U.S. adult baseline estimates, highlighting systematic differences between case-report cohorts and general-population distributions. tendency of published case reports to overrepres…

**Figure 4.** Sensitivity analysis of clinical textual time series (TTS) quality across event-matching thresholds. Performance is summarized as concordance (ordering agreement) and AULTC (timestamp accuracy) plotted against event match rate for comparisons to Annotator 1 (top row) and Annotator 2 (bottom row). Solid circle (•) represents threshold of 0.1, with ticks (—) indicating 0.01 increments of the threshold in […

**Figure 5.** Time-to-onset survival modeling using GLP-1RA textual time series. Left: age/sex-adjusted event-free survival curves from Cox proportional hazards models for cardiovascular, respiratory, and kidney outcomes (treatment/control: diabetes patients with/without GLP medication exposure). Shaded bands denote uncertainty for the adjusted curves. Right: corresponding adjusted hazard ratios (95% CI, p-value) and t…

read the original abstract

Type 2 diabetes case reports describe complex clinical courses, but their timelines are often expressed in language that is difficult to reuse in longitudinal modeling. To address this gap, we developed a textual time-series corpus of 136 PubMed Open Access single-patient case reports involving glucagon-like peptide 1 receptor agonists, with clinical events associated with their most probable reference times. We evaluated automated LLM timeline extraction against gold-standard timelines annotated by clinical domain experts, assessing how well systems recovered clinical events and their timings. The best-performing LLM produced high event coverage (GPT5; 0.871) and reliable temporal sequencing across symptoms (GPT5; 0.843), diagnoses, treatments, laboratory tests, and outcomes. As a downstream demonstration, time-to-event analyses in diabetes suggested lower risk of respiratory sequelae among GLP-1 users versus non-users (HR=0.259, p<0.05), consistent with prior reports of improved respiratory outcomes. Temporal annotations and code will be released upon acceptance.
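The evaluation described in the abstract compares extracted timelines with expert timelines on two axes: whether events are recovered and whether their order is preserved. A minimal sketch of both ideas follows. It assumes exact string matching between events and a simple pairwise definition of ordering agreement; the paper's own matching is threshold-based (see the Figure 4 caption) and its exact metric definitions are not given on this page, so the function names and formulas here are illustrative only.

```python
from itertools import combinations


def match_rate(gold, pred):
    """Fraction of gold events that also appear in the predicted timeline."""
    return sum(1 for e in gold if e in pred) / len(gold)


def concordance(gold, pred):
    """Fraction of matched event pairs whose order agrees with the gold timeline."""
    shared = [e for e in gold if e in pred]
    agree = total = 0
    for a, b in combinations(shared, 2):
        if gold[a] == gold[b]:
            continue  # a tie in the gold timeline carries no ordering information
        total += 1
        gold_before = gold[a] < gold[b]
        if pred[a] != pred[b] and (pred[a] < pred[b]) == gold_before:
            agree += 1
    return agree / total if total else float("nan")


# Hypothetical timelines (event -> hours from reference), for illustration only
gold = {"fever": 0, "cough": 24, "intubation": 72, "discharge": 240}
pred = {"fever": 0, "cough": 48, "intubation": 36}

print(match_rate(gold, pred))   # 3 of 4 gold events are recovered
print(concordance(gold, pred))  # 2 of 3 comparable pairs keep their order
```

Concordance is computed only over matched events, so it can look strong even when coverage is weak; that is one reason to report the two numbers side by side.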

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper builds a small new corpus of 136 temporally annotated GLP-1RA case reports and shows LLMs can recover events and orderings at usable levels, but the risk-modeling step is under-supported.

The main takeaway is that this paper introduces a new corpus of 136 temporally annotated GLP-1RA case reports and demonstrates that LLMs can extract clinical events and their timings from them with reasonable success. The new part is the corpus itself, along with the expert annotations and the direct comparison to LLM outputs. Prior work has looked at case report mining, but this specific setup, with timestamps for GLP-1 receptor agonist reports and the reported performance numbers, appears fresh.

The extraction results are the strongest element. The top model achieves high event coverage at 0.871 and good temporal sequencing at 0.843 across multiple categories such as symptoms and lab tests. This shows practical utility for turning narrative reports into structured time-series data.

Where it is softer is the risk modeling demonstration. The abstract reports a hazard ratio of 0.259 for respiratory sequelae without supporting details on sample sizes, handling of incomplete timelines, or statistical justification. It is presented as consistent with prior work, but without more transparency it is difficult to assess its reliability. The 136 reports come from open-access PubMed sources, which raises questions about selection effects. No inter-annotator agreement is mentioned, and there is no check against broader patient populations.

Readers working on natural language processing for clinical data or on phenotyping methods in endocrinology would find this useful. The annotations and code, which the authors say will be released upon acceptance, would make it a concrete resource rather than just another method paper. It deserves a serious referee because the corpus fills a gap in reusable temporal data from case reports. The work is grounded enough to benefit from external review of the annotation process and the analysis choices. I would recommend sending this to peer review.

Referee Report

3 major / 2 minor

Summary. The paper constructs a textual time-series corpus from 136 PubMed Open Access single-patient case reports on GLP-1 receptor agonists (GLP-1RA) in type 2 diabetes, with clinical events linked to probable reference times. It evaluates LLM-based timeline extraction against expert-annotated gold-standard timelines, reporting strong performance for the top model (GPT5) on event coverage (0.871) and temporal sequencing (0.843) across symptoms, diagnoses, treatments, labs, and outcomes. A downstream demonstration applies time-to-event modeling to suggest reduced respiratory sequelae risk among GLP-1RA users versus non-users (HR=0.259, p<0.05).

Significance. If the extraction pipeline and risk signal hold after addressing validation gaps, the corpus and code, once released, could enable systematic reuse of narrative case reports for longitudinal phenotyping and modeling in diabetes, complementing registry data with fine-grained temporal structure.

major comments (3)

[Abstract/Methods] The headline metrics (GPT5 event coverage 0.871; temporal sequencing 0.843) rest on expert-annotated timelines as gold standard, yet no inter-annotator agreement statistics, annotation protocol, or disagreement-resolution procedure are described. This directly affects the credibility of the reported extraction quality.
[Results/Downstream] The hazard ratio (HR=0.259, p<0.05) for respiratory sequelae is presented without confidence intervals, sample-size justification, or explicit handling of missing or uncertain event times, which are load-bearing for interpreting the time-to-event claim.
[Discussion] The 136 PubMed OA reports are treated as a basis for risk generalization, but no comparison to broader GLP-1RA registries or assessment of publication bias appears; this limits the strength of the downstream demonstration.

minor comments (2)

[Abstract] Specify the exact GPT5 model identifier, temperature, and prompting strategy used for reproducibility.
[Abstract] Clarify whether the temporal sequencing score (0.843) is aggregate or broken down by event category (symptoms, diagnoses, etc.).

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback, which has helped us improve the clarity and rigor of the manuscript. We address each major comment below and have revised the paper accordingly to strengthen the description of our annotation process and statistical reporting while clarifying the scope of the downstream demonstration.

Referee: [Abstract/Methods] The headline metrics (GPT5 event coverage 0.871; temporal sequencing 0.843) rest on expert-annotated timelines as gold standard, yet no inter-annotator agreement statistics, annotation protocol, or disagreement-resolution procedure are described. This directly affects the credibility of the reported extraction quality.

Authors: We agree that explicit details on the annotation process are necessary. In the revised manuscript, we have added a new subsection in Methods that describes the protocol: two clinical domain experts independently identified events (symptoms, diagnoses, treatments, labs, outcomes) and assigned the most probable reference times based on explicit textual cues. Disagreements were resolved via consensus discussion. We did not compute formal inter-annotator agreement due to resource constraints and the objective nature of the task, but we now acknowledge this limitation and provide the full protocol for reproducibility.

revision: yes

Referee: [Results/Downstream] The hazard ratio (HR=0.259, p<0.05) for respiratory sequelae is presented without confidence intervals, sample-size justification, or explicit handling of missing or uncertain event times, which are load-bearing for interpreting the time-to-event claim.

Authors: We have revised the Results section to report the 95% confidence interval (HR=0.259, 95% CI [0.12, 0.55]). The analysis is based on 136 reports yielding 1,245 extracted events; we added a justification noting that this event count provides sufficient power for the observed effect in this demonstration setting. For uncertain times, we used the probable reference times as point estimates in the Cox model and included a sensitivity analysis varying times within plausible ranges, which did not alter the direction or significance of the result. These additions are now incorporated.

revision: yes

Referee: [Discussion] The 136 PubMed OA reports are treated as a basis for risk generalization, but no comparison to broader GLP-1RA registries or assessment of publication bias appears; this limits the strength of the downstream demonstration.

Authors: The time-to-event analysis is presented strictly as a demonstration of the corpus's utility for temporal phenotyping and modeling, not as a generalizable risk estimate. We have revised the Discussion to explicitly state this scope and to acknowledge publication bias as a known limitation of case reports. A direct comparison to large registries is outside the current scope due to differences in data structure and granularity, but we have added a forward-looking sentence on potential future validation against such sources.

revision: partial
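The exchange about confidence intervals can be checked with a little arithmetic. The sketch below takes the hazard ratio and the 95% interval quoted in the simulated rebuttal (figures that appear only in that simulated response and are not verified against the paper), backs out the standard error of log(HR), and derives a Wald z statistic and two-sided p-value. The function name `wald_summary` is hypothetical, and the normal approximation on the log scale is an assumption about how such an interval would typically be formed.

```python
import math


def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Back out SE of log(HR) from a 95% CI, then the Wald z and two-sided p."""
    # On the log scale a Wald interval is log(HR) +/- z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    # Interval implied by the point estimate and SE; should be close to the input
    implied = (
        math.exp(math.log(hr) - z_crit * se),
        math.exp(math.log(hr) + z_crit * se),
    )
    return se, z, p, implied


# Values as quoted in the simulated rebuttal (unverified against the paper)
se, z, p, implied = wald_summary(0.259, 0.12, 0.55)
print(f"SE(log HR)={se:.3f}  z={z:.2f}  p={p:.4f}")
print(f"implied 95% CI: [{implied[0]:.3f}, {implied[1]:.3f}]")
```

Under these assumptions the quoted interval is roughly symmetric around the point estimate on the log scale and is compatible with the stated p<0.05. This checks only that the numbers are internally consistent; it says nothing about whether the underlying model is adequate.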

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper evaluates LLM timeline extraction against independently created expert-annotated gold-standard timelines and presents the subsequent time-to-event risk modeling (HR=0.259) explicitly as a downstream demonstration on the extracted corpus. No load-bearing step reduces by construction to its own inputs: performance numbers are computed against external annotations rather than fitted parameters renamed as predictions, no self-citation chain justifies a uniqueness claim, and no ansatz or renaming of known results is smuggled in. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the assumption that expert timeline annotations are reliable and that the selected case reports contain extractable temporal information; no free parameters or new entities are described in the abstract.

axioms (2)

domain assumption: Clinical domain experts produce accurate gold-standard timelines from case-report text.
Used as the reference for measuring LLM event coverage and sequencing accuracy.
domain assumption: PubMed Open Access case reports contain sufficient temporal cues for automated extraction.
Required for the corpus construction and downstream time-to-event analysis.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages

[1] DeFronzo RA, Lewin A, Patel S, et al. Combination of empagliflozin and linagliptin as second-line therapy in subjects with type 2 diabetes inadequately controlled on metformin. Diabetes Care. 2015;38(3):384-93

[2] Marso SP, Bain SC, Consoli A, Eliaschewitz FG, Jódar E, Leiter LA, et al. Semaglutide and cardiovascular outcomes in patients with type 2 diabetes. New England Journal of Medicine. 2016;375(19):1834-44

[3] Wilding JP, Batterham RL, Calanna S, Davies M, Van Gaal LF, Lingvay I, et al. Once-weekly semaglutide in adults with overweight or obesity. New England Journal of Medicine. 2021;384(11):989-1002

[4] Johnson AE, Pollard TJ, Shen L, Lehman LwH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Scientific Data. 2016;3(1):1-9

[5] Johnson AE, Bulgarelli L, Shen L, Gayles A, Shammout A, Horng S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Scientific Data. 2023;10(1):1

[6] Sun W, Rumshisky A, Uzuner O. Evaluating temporal relations in clinical text: 2012 i2b2 challenge. Journal of the American Medical Informatics Association. 2013;20(5):806-13

[7] Gumiel YB, Silva e Oliveira LE, Claveau V, Grabar N, Paraiso EC, Moro C, et al. Temporal relation extraction in clinical texts: a systematic review. ACM Computing Surveys (CSUR). 2021;54(7):1-36

[8] Brito JP, Herrin J, Swarna KS, Singh Ospina NM, Montori VM, Toro-Tobon D, et al. GLP-1RA use and thyroid cancer risk. JAMA Otolaryngology–Head & Neck Surgery. 2025;151(3):243-52

[9] Young KG, McInnes EH, Massey RJ, Kahkoska AR, Pilla SJ, Raghavan S, et al. Treatment effect heterogeneity following type 2 diabetes treatment with GLP1-receptor agonists and SGLT2-inhibitors: a systematic review. Communications Medicine. 2023;3(1):131

[10] Foer D, Strasser ZH, Cui J, et al. Association of GLP-1 receptor agonists with chronic obstructive pulmonary disease exacerbations among patients with type 2 diabetes. American Journal of Respiratory and Critical Care Medicine. 2023;208(10):1088-100

[11] Uzuner Ö, South BR, Shen S, DuVall SL. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association. 2011;18(5):552-6

[12] Leeuwenberg A, Moens MF. Towards extracting absolute event timelines from english clinical reports. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2020;28:2710-9

[13] Frattallone-Llado G, Kim J, Cheng C, Salazar D, Edakalavan S, Weiss JC. Using Multimodal Data to Improve Precision of Inpatient Event Timelines. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer; 2024. p. 322-34

[14] Jeong DP, Garg S, Lipton ZC, Oberst M. Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress? In: Al-Onaizan Y, Bansal M, Chen YN, editors. Empirical Methods in Natural Language Processing. Miami, Florida, USA: Association for Computational Linguistics; 2024. p. 12143-70

[15] Wang J, Weiss J. A Large-Language Model Framework for Relative Timeline Extraction from PubMed Case Reports. In: Proceedings of the AMIA Informatics Summit. American Medical Informatics Association; 2025

[16] Noroozizadeh S, Kumar S, Weiss J. Forecasting Clinical Risk from Textual Time Series: Structuring Narratives for Temporal AI in Healthcare. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 40; 2026

[17] Noroozizadeh S, Kumar S, Chen GH, Weiss JC. PMOA-TTS: Introducing the PubMed Open Access Textual Times Series Corpus. arXiv preprint arXiv:2505.20323. 2025

[18] Noroozizadeh S, Weiss JC. Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis; 2025. Under review at the Conference on Health, Inference, and Learning

[19] Davidson-Pilon C. lifelines: survival analysis in Python. Journal of Open Source Software. 2019;4(40):1317

[20] Badve SV, Bilal A, Lee MM, et al. Effects of GLP-1 receptor agonists on kidney and cardiovascular disease outcomes: a meta-analysis of randomised controlled trials. The Lancet Diabetes & Endocrinology. 2025;13(1)
