Centralized matching mechanisms outperform free negotiation in stability and efficiency with LLM agents, who also report preferences truthfully more often than humans, though not always in line with strategy-proofness predictions.
Language models as agent models
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Narrow constitutional finetuning on safety sub-tasks induces emergent alignment across broader safety domains and yields projectable ethical personas whose signatures can be measured with a multidimensional diagnostic.
citing papers explorer
-
Emergent alignment and the projectability of ethical personas
Narrow constitutional finetuning on safety sub-tasks induces emergent alignment across broader safety domains and yields projectable ethical personas whose signatures can be measured with a multidimensional diagnostic.