Introduces a competency-based GPBench benchmark and evaluates ten LLMs, concluding they require continuous human supervision for clinical general practice.
Recommend this patient undergo coronary angiography in a higher-level hospital; stenting if necessary
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Evaluating Clinical Competencies of Large Language Models with a General Practice Benchmark