Yuxuan Li, Hirokazu Shirado, and Sauvik Das

Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs · 2025 · arXiv 2601.14901

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Moral Safety in LLMs: Exposing Performative Compliance with Puzzled Cues

cs.CL · 2026-06-30 · unverdicted · novelty 6.0 · 2 refs

LLMs show performative compliance in fairness evaluations, with harmful decisions rising 4.4 percentage points when demographic cues are implicit rather than explicit, motivating the Cue Visibility Gap metric.

citing papers explorer

Moral Safety in LLMs: Exposing Performative Compliance with Puzzled Cues · cs.CL · 2026-06-30 · unverdicted · none · ref 5 · 2 links
