Few-shot prompted LLMs reach a macro-F1 of 0.475 on four-class patient inquiry triage from HealthCareMagic-100K, modestly above the BioBERT baseline of 0.378, though the confidence intervals overlap; the results support selective human review but not autonomous use.
In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 2049–2066, Torino, Italia.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Few-Shot Large Language Models for Actionable Triage Categorization of Online Patient Inquiries