Alistair Johnson, Tom Pollard, Steven Horng, Leo Anthony Celi, and Roger Mark

URL https://api · 2023 · DOI 10.13026/1n74-ne17

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MedicalBench: Evaluating Large Language Models Toward Improved Medical Concept Extraction

cs.CL · 2026-04-05 · unverdicted · novelty 7.0

MedicalBench is a benchmark for implicit medical concept extraction and sentence-level evidence retrieval built from MIMIC-IV discharge summaries with human verification to test LLM reasoning on unstated medical ideas.

MILM: Large Language Models for Multimodal Irregular Time Series with Informative Sampling

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

MILM fine-tunes LLMs on XML-encoded multimodal irregular time series via a two-stage process that exploits informative sampling patterns to achieve top performance on EHR classification datasets.

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

cs.AI · 2026-04-27 · unverdicted · novelty 6.0

An agentic LLM reasoning system reached 79.6% agreement with expert consensus on myeloma care questions from longitudinal records, outperforming iterative RAG and full-context baselines by 3.8-4.2 points with larger gains on complex cases.

RDMA: Cost Effective Agent-Driven Rare Disease Mining from Electronic Health Records

cs.LG · 2025-07-14 · unverdicted · novelty 6.0

RDMA equips small LLMs with abbreviation resolution, phenotype reasoning, and ontology tools to mine rare diseases from EHR notes, outperforming fine-tuned and RAG baselines at up to 10x lower inference cost.

MedMIX: Modality-Internal Expert Fusion for Multimodal Medical Diagnosis

cs.LG · 2026-05-15 · unverdicted · novelty 5.0

MedMIX combines intra-modality expert fusion, learned inter-modality fusion, and training-only large-small collaboration to deliver robust multimodal medical prediction under incomplete modalities across three benchmarks.

AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

Single-agent LLM frameworks outperform naive multi-agent systems in multimodal clinical risk prediction tasks and are better calibrated.

citing papers explorer

MedicalBench: Evaluating Large Language Models Toward Improved Medical Concept Extraction

cs.CL · 2026-04-05 · unverdicted · none · ref 7

MILM: Large Language Models for Multimodal Irregular Time Series with Informative Sampling

cs.LG · 2026-05-13 · unverdicted · none · ref 63

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

cs.AI · 2026-04-27 · unverdicted · none · ref 18

RDMA: Cost Effective Agent-Driven Rare Disease Mining from Electronic Health Records

cs.LG · 2025-07-14 · unverdicted · none · ref 17

MedMIX: Modality-Internal Expert Fusion for Multimodal Medical Diagnosis

cs.LG · 2026-05-15 · unverdicted · none · ref 16

AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks

cs.AI · 2026-05-11 · unverdicted · none · ref 35
