In 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 23–42

Jailbreaking black box large language models in twenty queries · 2024 · arXiv 2407.11387

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

MHSafeEval: Role-Aware Interaction-Level Evaluation of Mental Health Safety in Large Language Models

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

MHSafeEval applies a new role-aware taxonomy to discover cumulative mental health harms in LLM counseling trajectories via adversarial multi-turn interactions, revealing failures missed by static benchmarks.

