Large language models encode clinical knowledge
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- IyàwóBench: A Benchmark for Evaluating Large Language Model Clinical Triage Accuracy on Undifferentiated Febrile Illness in Nigerian Primary Health Settings
IyàwóBench is the first benchmark for LLM clinical triage accuracy on undifferentiated febrile illness, using 200 synthetic vignettes from Nigerian PHCs; results show 100% safety but accuracy ranging from 39% to 70.5%.
- Elder-Sim: A Psychometrically Validated Platform for Personality-Stable Elderly Digital Twins
ELDER-SIM builds personality-stable elderly digital twins via LLM orchestration with OCEAN traits, Beck CBT diagrams, long-term memory, and LoRA fine-tuning on CHARLS data, validated by Cronbach's alpha 0.70-0.94 and ICC 0.85-0.96.