RadQA: A question answering dataset to improve comprehension of radiology reports

Sarvesh Soni, Meghana Gudala, Atieh Pajouhi, Kirk Roberts · 2022

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs

cs.AI · 2025-08-28 · unverdicted · novelty 5.0

A graph-based evaluation harness transforms clinical guidelines into a queryable knowledge graph to dynamically generate contamination-resistant multiple-choice questions, revealing LLM performance gaps on symptom recognition versus treatment decisions.

citing papers explorer

Showing 1 of 1 citing paper.

From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs
cs.AI · 2025-08-28 · unverdicted · none · ref 14
