Explicitly state what the Assistant OMITTED

Compare the Assistant’s response to your list

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Same Verdict, Different Reasons: LLM-as-a-Judge and Clinician Disagreement on Medical Chatbot Completeness

cs.CY · 2026-03-26 · unverdicted · novelty 6.0

LLM judges achieve only near-chance discrimination (AUC 0.49-0.66) between complete and incomplete medical responses and apply different completeness standards than clinicians.

citing papers explorer

Same Verdict, Different Reasons: LLM-as-a-Judge and Clinician Disagreement on Medical Chatbot Completeness

cs.CY · 2026-03-26 · unverdicted · none · ref 8

LLM judges achieve only near-chance discrimination (AUC 0.49-0.66) between complete and incomplete medical responses and apply different completeness standards than clinicians.
