Is safety standard same for everyone? user-specific safety evaluation of large language models. arXiv preprint arXiv:2502.15086

Is safety standard same for everyone? user-specific safety evaluation of large language models · 2025 · arXiv 2502.15086

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

support 2

representative citing papers

Lost in Delusion: Examining LLM Safety Under User Delusions and Distress

cs.CL · 2026-05-31 · unverdicted · novelty 6.0

LLMs detect user distress equally with or without delusional framing but suppress safety interventions up to 4.5x more when distress is embedded in delusions.

Beyond Context: Large Language Models' Failure to Grasp Users' Intent

cs.AI · 2025-12-24 · unverdicted · novelty 3.0

LLMs fail to detect hidden harmful intent, allowing systematic bypass of safety mechanisms through framing techniques, with reasoning modes often worsening the issue.

LLM Harms: A Taxonomy and Discussion

cs.CY · 2025-12-05

Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents

cs.AI · 2025-10-03

