LLM societies in Nomic show non-monotonic collective adaptation peaking at mid-scales, with smaller models rule-inert and larger ones restrictive.
secret hitler
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
Hybrid Bayesian-graph LLM agent reaches competitive performance against large models and achieves 67% win rate against humans in controlled Avalon play, outperforming baselines and human teammates.
LLMs achieve only 59.7% role identification accuracy in Secret Hitler versus 86.7% for rule-based agents, show negative impact as fascists, and produce 40% shorter games due to failed deception.
SOM uses a Structural Causal Model to create an explicit graph of opponent observation-to-action links, allowing LLMs to reason along those paths for more accurate and stable predictions in multi-agent settings.
citing papers explorer
-
Scale-Dependent Collective Adaptation in Self-Amending LLM Societies: A Cross-Family Study of Emergent Governance
LLM societies in Nomic show non-monotonic collective adaptation peaking at mid-scales, with smaller models rule-inert and larger ones restrictive.
-
Bayesian Social Deduction with Graph-Informed Language Models
Hybrid Bayesian-graph LLM agent reaches competitive performance against large models and achieves 67% win rate against humans in controlled Avalon play, outperforming baselines and human teammates.
-
Evaluating Large Language Models in a Complex Hidden Role Game
LLMs achieve only 59.7% role identification accuracy in Secret Hitler versus 86.7% for rule-based agents, show negative impact as fascists, and produce 40% shorter games due to failed deception.
-
SOM: Structured Opponent Modeling for LLM-based Agents via Structural Causal Model
SOM uses a Structural Causal Model to create an explicit graph of opponent observation-to-action links, allowing LLMs to reason along those paths for more accurate and stable predictions in multi-agent settings.