AI and Ethics 5(1), 689–707 (2025)

Yampolskiy, R · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

The Refusal–Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

A large-scale audit of 21 LLMs on OR-Bench, XSTest, ToxiGen and BOLD using composition adjustment reveals distinct conservative vs permissive safety strategies, unequal demographic protection, and post-training stability within model families.
