arXiv preprint arXiv:2312.15198
5 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Explicit Trait Inference for Multi-Agent Coordination
  ETI lets LLM agents infer and track partners' psychological traits (warmth and competence) from histories, cutting payoff loss 45-77% in games and boosting performance 3-29% on MultiAgentBench versus CoT baselines.
- Understanding the Mechanism of Altruism in Large Language Models
  A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.
- Talking to a Know-It-All GPT or a Second-Guesser Claude? How Repair reveals unreliable Multi-Turn Behavior in LLMs
  Each tested LLM shows its own characteristic unreliability when engaging in repair during extended math-question dialogues.
- Extreme Self-Preference in Language Models
  Eight LLMs exhibited massive self-preference that followed assigned identities rather than true ones, appearing in both simple word tasks and consequential evaluations of job candidates and AI technologies.
- NetworkGames: Simulating Cooperation in Network Games with Personality-driven LLM Agents
  Simulations show that cooperative outcomes in network games with personality-driven LLM agents depend on both network connectivity and the placement of pro-social personalities, not just pairwise interaction preferences.