MIT Press, ??? (2009)

Koller, D · 2009

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

The Refusal–Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

A large-scale audit of 21 LLMs on OR-Bench, XSTest, ToxiGen and BOLD using composition adjustment reveals distinct conservative vs permissive safety strategies, unequal demographic protection, and post-training stability within model families.

citing papers explorer

Showing 1 of 1 citing paper after filters.

The Refusal–Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models

cs.AI · 2026-05-06 · unverdicted · none · ref 27
