LLMs compute Nash actions internally but suppress them via prosocial overrides from training data, and this can be causally controlled through residual stream interventions.
1 Pith paper cites this work.
Fields: cs.GT (1). Years: 2026 (1). Verdicts: ACCEPT (1).
Representative citing paper:
What Suppresses Nash Equilibrium Play in Large Language Models? Mechanistic Evidence and Causal Control