Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research

Edward Hughes; Joel Z. Leibo; Marc Lanctot; Thore Graepel

arxiv: 1903.00742 · v2 · pith:XSFOZJS6new · submitted 2019-03-02 · 💻 cs.AI · cs.GT · cs.MA · cs.NE · q-bio.NC

keywords: social · challenges · innovation · innovations · multi-agent · accumulate · adaptive · agents

Evolution has produced a multi-scale mosaic of interacting adaptive units. Innovations arise when perturbations push parts of the system away from stable equilibria into new regimes where previously well-adapted solutions no longer work. Here we explore the hypothesis that multi-agent systems sometimes display intrinsic dynamics arising from competition and cooperation that provide a naturally emergent curriculum, which we term an autocurriculum. The solution of one social task often begets new social tasks, continually generating novel challenges, and thereby promoting innovation. Under certain conditions these challenges may become increasingly complex over time, demanding that agents accumulate ever more innovations.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Whose Good, Whose Place? The Moral Geography of Agentic AI for Social Good
cs.CY 2026-05 unverdicted novelty 7.0
Survey of 112 agentic AI for social good papers reveals moral-geographic asymmetry with 73% lacking geographic context (lowest for SDG 16) and only 25% reporting deployments.

MetaPS: Adaptive Programmatic Strategy Selection for Market Agents
cs.AI 2026-06 unverdicted novelty 6.0
MetaPS trains models via simulation rollouts to select from programmatic strategy libraries for market agents, yielding better performance than fixed or direct LLM baselines across model sizes.

Human-like autonomy emerges from self-play and a pinch of human data
cs.LG 2026-06 unverdicted novelty 5.0
Self-play RL regularized with 30 minutes of human data produces driving policies that coordinate with humans, training in 15 hours on one GPU with 2500x less data than imitation learning.

Solipsistic Superintelligence is Unlikely to be Cooperative
cs.AI 2026-06 unverdicted novelty 5.0
Solipsistic superintelligence developed via unilateral optimization is unlikely to cooperate due to endogenous non-stationarity creating an unclosable train-test-deploy gap.

From AGI to ASI
cs.AI 2026-06 unverdicted novelty 3.0
The paper characterizes ASI and examines scaling, paradigm shifts, recursive self-improvement, and multi-agent collectives as routes from AGI to ASI, together with frictions and open questions about acceleration.