Building trust in mental health chatbots: safety metrics and LLM-based evaluation tools. arXiv preprint arXiv:2408.04650, 2024

Jung In Park, Mahyar Abbasian, Iman Azimi, Dawn T Bounds, Angela Jun, Jaesu Han, Robert M McCarron, Jessica Borelli, Parmida Safavi, Sanaz Mirbaha, et al. · 2024 · arXiv 2408.04650

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis

cs.CL · 2026-03-20 · conditional · novelty 7.0

Seven clinician-informed safety criteria enable LLM-as-a-Judge to reach substantial agreement with human consensus (Cohen's κ up to 0.75) on evaluating LLM responses to users demonstrating psychosis.

Between Help and Harm: An Evaluation of Mental Health Crisis Handling by LLMs

cs.CL · 2025-09-29 · conditional · novelty 6.0

Creates a clinical crisis taxonomy and a 2,252-example dataset, then audits five LLMs, finding variable safety with notable failures on indirect signals and in self-harm categories.

Mental Health AI Safety Claims Must Preserve Temporal Evidence

cs.AI · 2026-05-09 · unverdicted · novelty 5.0

Mental health AI safety evaluations that discard temporal sequence and accumulation produce invalid conclusions; the paper formalizes this as Temporal Safety Non-Identifiability and proposes SCOPE-MH as a reporting standard that preserves evidence.

citing papers explorer

Showing 3 of 3 citing papers.

Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis

cs.CL · 2026-03-20 · conditional · none · ref 51

Between Help and Harm: An Evaluation of Mental Health Crisis Handling by LLMs

cs.CL · 2025-09-29 · conditional · none · ref 42

Mental Health AI Safety Claims Must Preserve Temporal Evidence

cs.AI · 2026-05-09 · unverdicted · none · ref 8
