Safety-aligned LLMs fail at core therapeutic tasks in simulated prolonged exposure and CBT sessions by grounding patients, offering false reassurance, and refusing to challenge harmful cognitions.
Scoping review of 132 studies following PRISMA-ScR guidelines
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
AI Safety Training Can be Clinically Harmful
Safety-aligned LLMs fail at core therapeutic tasks in simulated prolonged exposure and CBT sessions by grounding patients, offering false reassurance, and refusing to challenge harmful cognitions.