Liao, Esin Durmus, Alex Tamkin, and Deep Ganguli
3 Pith papers cite this work. Polarity classification is still indexing.
Citation-role and citation-polarity summary
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Roles: background (1)
Polarities: background (1)
Representative citing papers
Anthropic's comprehensive AI constitution excludes military contexts and preempts democratic deliberation, revealing a political community deficit in AI governance.
ATLAS shows constitutions induce recoverable latent geometry in LLMs that redistributes but remains detectable across models and neural perturbation data via source-defined families and AUC separations.
Citing papers explorer
- Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms
  LM agents' changeable modules prevent persistent identity and sanction sensitivity, making reputation mechanisms structurally inapplicable and requiring protocol-based behavioral harnesses instead.
- Corporations Constitute Intelligence
  Anthropic's comprehensive AI constitution excludes military contexts and preempts democratic deliberation, revealing a political community deficit in AI governance.