CounselBench introduces expert-rated evaluations and an adversarial test set showing LLMs frequently produce unconstructive, overgeneralized, or unsafe responses in mental health QA compared to human therapists.
Use of Smartphone Apps for Mental Health: Can They Translate to a Smart and Effective Mental Health Care? Journal of Mental Health and Human Behaviour, 20(1):1, 2015-01/2015-06
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering