A diagnostic meta-analysis of the Patient Health Questionnaire-9 (PHQ-9) algorithm scoring method as a screen for depression. General Hospital Psychiatry, 37(1):67–75

Laura Manea, Simon Gilbody, Dean McMillan · 2015

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

Expert Evaluation and the Limits of Human Feedback in Mental Health AI Safety Testing

cs.AI · 2026-01-26 · unverdicted · novelty 6.0

Psychiatrists show low inter-rater reliability when evaluating LLM mental health responses, with systematic disagreement reflecting distinct clinical frameworks rather than random error.

citing papers explorer

Showing 1 of 1 citing paper.

Expert Evaluation and the Limits of Human Feedback in Mental Health AI Safety Testing

cs.AI · 2026-01-26 · unverdicted · none · ref 38

Psychiatrists show low inter-rater reliability when evaluating LLM mental health responses, with systematic disagreement reflecting distinct clinical frameworks rather than random error.
