NEJM AI , volume =

· 2025 · DOI 10.1056/aidbp2500144

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

open at publisher browse 2 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

ABRA: Agent Benchmark for Radiology Applications

cs.CV · 2026-05-11 · unverdicted · novelty 8.0

ABRA shows radiology agents excel at tool execution (89%+) but struggle with outcomes (0-25%), with oracle perception raising outcomes to 69-100%, identifying perception as the primary bottleneck.

ClinicalBench: Stress-Testing Assertion-Aware Retrieval for Cross-Admission Clinical QA on MIMIC-IV

cs.CL · 2026-05-11 · conditional · novelty 6.0

Intent-aware retrieval over assertion-labeled knowledge graphs improves clinical QA accuracy by 22 percentage points on a new MIMIC-IV benchmark that stresses negation, temporality, and attribution.

citing papers explorer

Showing 1 of 1 citing paper after filters.

ClinicalBench: Stress-Testing Assertion-Aware Retrieval for Cross-Admission Clinical QA on MIMIC-IV cs.CL · 2026-05-11 · conditional · none · ref 23
Intent-aware retrieval over assertion-labeled knowledge graphs improves clinical QA accuracy by 22 percentage points on a new MIMIC-IV benchmark that stresses negation, temporality, and attribution.

NEJM AI , volume =

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer