Coalition Formation in LLM Agent Networks: Stability Analysis and Convergence Guarantees
Pith reviewed 2026-05-10 11:34 UTC · model grok-4.3
The pith
LLM agents' coalition preferences can be modeled as ε-rational choices in a hedonic game, and a prompting protocol built on that model (CoalT) yields Nash-stable coalitions more often than baseline prompting.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce the LLM Coalition Formation Game (LCFG), in which LLM agents hold ε-rational preferences over coalitions. Under this representation we derive sufficient conditions for the existence of Nash-stable partitions, together with consistency-driven stability bounds. The paper reports that these predictions are consistent with empirical outcomes: across 2,400 episodes the CoalT protocol produces Nash-stable coalitions in 73.2% of cases, against 58.4% with chain-of-thought and 41.8% with standard prompting.
What carries the argument
The hedonic-game representation of LLM coalition preferences under ε-rationality, which carries stability guarantees from cooperative game theory to observed agent behavior.
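To make the stability notion concrete, here is a minimal Python sketch of a Nash-stability check for a partition. It assumes that ε enters as a tolerance on unilateral gains (an agent only deviates if it gains more than ε), which is one reading of ε-rationality and not necessarily the paper's definition. The function name, the additively separable value function, and the weights are all hypothetical.

```python
from typing import Callable, FrozenSet, List

Coalition = FrozenSet[int]


def is_eps_nash_stable(
    partition: List[Coalition],
    value: Callable[[int, Coalition], float],
    eps: float = 0.0,
) -> bool:
    """True if no agent gains more than eps by a unilateral move."""
    for coalition in partition:
        for agent in coalition:
            current = value(agent, coalition)
            # Allowed moves: join another existing coalition, or go alone.
            targets = [other | {agent} for other in partition if other != coalition]
            targets.append(frozenset({agent}))
            if any(value(agent, t) > current + eps for t in targets):
                return False
    return True


# Hypothetical symmetric pairwise weights: agents 0 and 1 like each other,
# and both get on badly with agent 2.
WEIGHTS = {(0, 1): 1.0, (0, 2): -1.0, (1, 2): -1.0}


def additive_value(agent: int, coalition: Coalition) -> float:
    """Additively separable value: sum of weights to the other members."""
    return sum(WEIGHTS[tuple(sorted((agent, j)))] for j in coalition if j != agent)


paired = [frozenset({0, 1}), frozenset({2})]
singletons = [frozenset({0}), frozenset({1}), frozenset({2})]
```

With `eps=0.0` the `paired` partition passes the check and `singletons` fails, because agent 0 gains by joining agent 1. Raising `eps` to 1.0 makes `singletons` pass as well, which shows how a larger tolerance enlarges the set of partitions counted as stable.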
If this is right
- Sufficient conditions for Nash stability can be used to select prompting strategies that raise the chance of stable group formation.
- If the paper's complexity results establish that finding stable partitions is computationally hard, exact solutions become impractical as the number of agents grows.
- Consistency-driven bounds supply a quantitative yardstick for comparing different prompting regimes in multi-agent LLM tasks.
- The same modeling step supports the design of coordination protocols that keep agent groups from dissolving.
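The scaling concern in the list above can be illustrated without reference to the paper's own complexity results, which are not reproduced here. A brute-force search for a stable partition must consider every partition of the agent set, and the number of partitions of n agents is the Bell number B(n). The sketch below computes Bell numbers with the Bell triangle; the function name is hypothetical.

```python
from typing import List


def bell_numbers(n_max: int) -> List[int]:
    """Return [B(0), ..., B(n_max)] using the Bell triangle."""
    bells = [1]   # B(0) = 1: the empty set has one partition
    row = [1]     # current row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left in the old row.
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


counts = bell_numbers(10)
```

Here `counts[4]` is 15 and `counts[10]` is 115975, so exhaustive enumeration is feasible only for small agent populations.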
Where Pith is reading between the lines
- The ε-rationality lens could be applied to other families of AI agents that make group-formation decisions.
- Stability checks derived from the model could be inserted into runtime monitors to detect and correct unstable coalitions before deployment.
- Scaling experiments with larger agent populations would test whether the current bounds remain predictive when preference noise increases.
Load-bearing premise
LLM agents' expressed preferences over possible coalitions can be faithfully captured as ε-rational preferences inside a hedonic game.
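The text does not give the functional form of ε-rational preferences, and the referee's first minor comment asks for it. One plausible reading, consistent with the value-gap procedure quoted in the reference graph, is a threshold condition. The statement below is an assumption for illustration, not the paper's definition; $v_i(S)$ denotes agent $a_i$'s value for coalition $S$ and $\succ_i$ its elicited strict preference.

```latex
% Assumed threshold form of epsilon-rationality (not taken from the paper).
\[
  \forall S, T \ni a_i:\quad
  v_i(S) - v_i(T) > \epsilon \;\Longrightarrow\; S \succ_i T .
\]
% Pairs with |v_i(S) - v_i(T)| \le \epsilon may be ranked either way.
```

Under this reading the premise fails whenever an agent reliably prefers the worse coalition at value gaps well above ε.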
What would settle it
A controlled run in which the measured stability rate under the CoalT protocol falls well below the consistency-driven bound predicted for the observed ε value.
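One way to make "falls well below" operational is a one-sided exact binomial test of the observed stable count against the predicted bound. The sketch below is a minimal version under stated assumptions: the bound (0.70), the run size (200 episodes), and the observed count (120) are hypothetical placeholders, not values from the paper.

```python
from math import comb


def binom_lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical controlled run: all three numbers are placeholders.
PREDICTED_BOUND = 0.70   # stability rate the bound predicts for the observed epsilon
EPISODES = 200
STABLE_OBSERVED = 120

# Probability of seeing this few stable episodes if the bound held exactly.
tail = binom_lower_tail(STABLE_OBSERVED, EPISODES, PREDICTED_BOUND)
bound_rejected = tail < 0.01
```

A small `tail` means the observed rate is hard to reconcile with the bound. The 0.01 cutoff is an arbitrary illustrative choice.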
Original abstract
Large Language Model (LLM) agents are increasingly deployed in multi-agent systems requiring strategic coordination. While recent work has analyzed LLM behavior in two-player games, coalition formation, where $n$ agents dynamically form cooperative groups, remains theoretically uncharacterized. We present the first framework grounding coalition formation in LLM agent networks in hedonic game theory with formal stability guarantees. We introduce the LLM Coalition Formation Game (LCFG), establish sufficient conditions for Nash-stable partitions, and prove complexity results. Our analysis reveals that LLM agents exhibit bounded rationality characterized by $\epsilon$-rational preferences; we provide both deterministic existence guarantees and consistency-driven stability bounds whose predictions are consistent with empirical outcomes. Experiments with GPT-4, Claude-3, and Llama-3 across 2,400 episodes validate our framework: LLM coalitions achieve Nash stability in 73.2% of cases under our Coalition-of-Thought (CoalT) protocol, compared to 58.4% under chain-of-thought and 41.8% under standard prompting ($p < 0.001$). Our framework provides theoretical foundations for designing stable multi-agent LLM systems.
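The abstract reports the stability rates and $p < 0.001$ but not the number of episodes per condition. As a rough plausibility check, the sketch below runs a pooled two-proportion z-test under the assumption of an even split of 800 episodes per prompting condition. That split is a guess, and the test ignores any dependence between episodes, so this is not a reproduction of the authors' analysis.

```python
from math import erfc, sqrt
from typing import Tuple


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z statistic and two-sided normal p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


N_PER_CONDITION = 800  # ASSUMPTION: 2,400 episodes split evenly over 3 conditions

# Rates are the ones reported in the abstract.
z_cot, p_cot = two_proportion_z(0.732, N_PER_CONDITION, 0.584, N_PER_CONDITION)
z_std, p_std = two_proportion_z(0.732, N_PER_CONDITION, 0.418, N_PER_CONDITION)
```

Under the assumed split both comparisons clear the reported significance level. A different allocation of episodes across models and conditions would change the statistics.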
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the LLM Coalition Formation Game (LCFG) as a hedonic game model for LLM agent networks, defines ε-rational preferences to capture bounded rationality in LLM outputs, derives sufficient conditions for Nash-stable partitions along with complexity results, and reports empirical validation across GPT-4, Claude-3, and Llama-3: over 2,400 episodes, 73.2% Nash stability under the proposed Coalition-of-Thought (CoalT) protocol versus 58.4% under chain-of-thought and 41.8% under standard prompting (p < 0.001).
Significance. If the mapping from LLM-elicited preferences to ε-rational hedonic preferences can be rigorously established, the framework supplies the first formal stability and convergence guarantees for coalition formation in LLM multi-agent systems, along with a practical protocol that demonstrably improves stability rates. The combination of theoretical existence results and large-scale empirical testing across models is a clear strength, though the transfer of bounds depends on unverified elicitation details.
major comments (3)
- [Abstract and Experiments] Abstract and Experiments section: the central claim that LLM coalitions achieve Nash stability in 73.2% of cases under CoalT (and that this is consistent with the derived bounds) rests on treating elicited agent preferences as ε-rational. No query template, coalition comparison set, or post-elicitation consistency check verifying the ε-rationality condition is provided, so the reported percentages cannot be interpreted as evidence that the formal guarantees apply to the observed behavior.
- [Theoretical Analysis] Theoretical Analysis: the abstract asserts existence guarantees for Nash-stable partitions and complexity results under ε-rational preferences, yet the manuscript supplies no derivation steps, proof sketches, or key lemmas showing how the sufficient conditions are obtained or how the complexity bounds follow; without these, the load-bearing theoretical claims cannot be assessed.
- [Stability Bounds] Stability Bounds: the bounds are characterized as 'consistency-driven' and 'whose predictions are consistent with empirical outcomes.' This formulation risks circularity if the bounds were fitted to the same LLM runs used for validation rather than derived independently from the ε-rationality axioms; an explicit separation between a priori derivation and post-hoc consistency check is required.
minor comments (2)
- [Model Definition] The definition of the LCFG and the precise functional form of ε-rational preferences should be stated with explicit equations rather than descriptive text to allow direct comparison with standard hedonic-game notation.
- [Experiments] Experiments: report error bars, exact exclusion criteria, and the full preference-elicitation prompt templates so that the 2,400-episode results and p-values can be reproduced.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We agree that greater transparency on elicitation procedures and theoretical derivations is needed. We respond to each major comment below and indicate the corresponding revisions.
Point-by-point responses
- Referee: [Abstract and Experiments] Abstract and Experiments section: the central claim that LLM coalitions achieve Nash stability in 73.2% of cases under CoalT (and that this is consistent with the derived bounds) rests on treating elicited agent preferences as ε-rational. No query template, coalition comparison set, or post-elicitation consistency check verifying the ε-rationality condition is provided, so the reported percentages cannot be interpreted as evidence that the formal guarantees apply to the observed behavior.
  Authors: We agree that the query templates, coalition comparison sets, and post-elicitation consistency checks were omitted from the initial submission, which prevents direct verification that the observed behavior satisfies the ε-rationality condition. In the revised manuscript we will add the exact prompt templates used for preference elicitation, the full set of coalitions presented to each agent, and the quantitative consistency checks (including deviation thresholds) that confirm ε-rationality. These additions will allow readers to assess whether the reported stability rates constitute evidence for the formal guarantees.
  Revision: yes
- Referee: [Theoretical Analysis] Theoretical Analysis: the abstract asserts existence guarantees for Nash-stable partitions and complexity results under ε-rational preferences, yet the manuscript supplies no derivation steps, proof sketches, or key lemmas showing how the sufficient conditions are obtained or how the complexity bounds follow; without these, the load-bearing theoretical claims cannot be assessed.
  Authors: We acknowledge that the main text lacked explicit derivation steps and proof sketches. While complete proofs appear in the appendix, we will move the key lemmas establishing sufficient conditions for Nash-stable partitions under ε-rational preferences, together with the complexity analysis, into the main Theoretical Analysis section. A concise proof sketch will also be added to the main body so that the existence and complexity claims can be evaluated without reference to the appendix.
  Revision: yes
- Referee: [Stability Bounds] Stability Bounds: the bounds are characterized as 'consistency-driven' and 'whose predictions are consistent with empirical outcomes.' This formulation risks circularity if the bounds were fitted to the same LLM runs used for validation rather than derived independently from the ε-rationality axioms; an explicit separation between a priori derivation and post-hoc consistency check is required.
  Authors: The stability bounds are derived a priori from the ε-rationality axioms and the hedonic-game structure before any empirical data are collected; the phrase 'consistency-driven' denotes consistency with the bounded-rationality model rather than post-hoc fitting. We will revise the wording in the Stability Bounds section to state explicitly that the bounds are obtained independently of the experimental runs and that empirical consistency is checked only after derivation. This clarification will remove any suggestion of circularity.
  Revision: partial
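The post-elicitation consistency check discussed in the first response is not specified in the text. One simple form such a check could take is counting preference cycles among elicited pairwise comparisons, sketched below. The function name, the coalition labels, and the two example preference sets are hypothetical, and the authors' actual checks may look quite different.

```python
from itertools import combinations
from typing import Hashable, Sequence, Set, Tuple


def intransitive_triples(
    items: Sequence[Hashable],
    preferred: Set[Tuple[Hashable, Hashable]],
) -> int:
    """Count unordered triples whose elicited strict preferences form a cycle.

    `preferred` holds ordered pairs (x, y) meaning "x was chosen over y".
    """
    def beats(x, y):
        return (x, y) in preferred

    cycles = 0
    for a, b, c in combinations(items, 3):
        forward = beats(a, b) and beats(b, c) and beats(c, a)
        backward = beats(b, a) and beats(c, b) and beats(a, c)
        if forward or backward:
            cycles += 1
    return cycles


coalitions = ["S", "T", "U"]  # hypothetical coalition labels
cyclic = {("S", "T"), ("T", "U"), ("U", "S")}
transitive = {("S", "T"), ("T", "U"), ("S", "U")}
```

On these inputs `intransitive_triples(coalitions, cyclic)` finds one cycle and the transitive set finds none. A report could then state the cycle rate per agent alongside the stability percentages.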
Circularity Check
No significant circularity detected; derivation remains self-contained
Full rationale
The abstract presents a theoretical framework that applies hedonic game theory to define the LLM Coalition Formation Game, derives sufficient conditions for Nash-stable partitions, and states complexity results. It characterizes LLM agents as exhibiting ε-rational preferences based on analysis and supplies deterministic existence guarantees plus consistency-driven stability bounds. Empirical results are reported separately as validation showing alignment with the theoretical predictions. No equations, parameter-fitting steps, or self-citation chains are exhibited in the provided text that would reduce the stability guarantees or predictions to the experimental inputs by construction. The derivation chain from game-theoretic assumptions to formal bounds is therefore independent of the reported LLM runs.
Axiom & Free-Parameter Ledger
free parameters (1)
- ε (epsilon)
axioms (2)
- domain assumption: Agents have preferences over coalitions that can be represented in a hedonic game
- ad hoc to paper: LLM outputs can be treated as approximately rational with bounded error ε
invented entities (2)
- LLM Coalition Formation Game (LCFG): no independent evidence
- Coalition-of-Thought (CoalT) protocol: no independent evidence
Reference graph
Works this paper leans on
- [1] Generate all coalition pairs $(S, T)$ containing agent $a_i$ with $|S|, |T| \le 4$
- [2] Compute ground-truth per-capita values $v_i(S)$, $v_i(T)$ from known capability profiles
- [3] Query agent preferences using standard prompting and record responses
- [4] For each value gap $\Delta = |v_i(S) - v_i(T)|$, compute the rate of irrational choices (agent prefers the objectively worse coalition)
- [5] Answer the following question. Provide only the final answer
  Estimate $\hat{\epsilon}$ as the threshold where irrational choice frequency drops below 50%. This procedure is not circular: ground-truth values come from external benchmark evaluations, and $\epsilon$ is estimated from the mismatch between computed values and agent preferences. Results: $\hat{\epsilon} = 0.15$ [95% CI: 0.12–0.18] for GPT-4, $\hat{\epsilon} = 0.14$ [0.11–0.17] for Claude-3, $\hat{\epsilon} = 0.22$ [0.18–0.26] for...
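The estimation procedure quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the bin width, the rule of taking the lower edge of the first bin whose irrational-choice rate is below the cutoff, and the synthetic records are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple


def estimate_epsilon(
    records: Iterable[Tuple[float, bool]],
    bin_width: float = 0.05,
    cutoff: float = 0.5,
) -> Optional[float]:
    """Estimate epsilon from (value_gap, chose_worse) observations.

    Gaps are grouped into bins of width `bin_width`. The estimate is the
    lower edge of the first bin (in increasing gap order) whose rate of
    irrational choices is below `cutoff`, or None if no bin qualifies.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [irrational, total]
    for gap, chose_worse in records:
        b = int(gap / bin_width)
        counts[b][0] += int(chose_worse)
        counts[b][1] += 1
    for b in sorted(counts):
        irrational, total = counts[b]
        if irrational / total < cutoff:
            return b * bin_width
    return None


# Synthetic records (NOT experimental data): gaps sit at bin midpoints.
synthetic = (
    [(0.025, True)] * 6 + [(0.025, False)] * 4    # 60% irrational
    + [(0.075, True)] * 5 + [(0.075, False)] * 5  # 50% irrational
    + [(0.125, True)] * 2 + [(0.125, False)] * 8  # 20% irrational
)
eps_hat = estimate_epsilon(synthetic)
```

On the synthetic records the first bin with an irrational rate below 50% starts at a gap of 0.10, so `eps_hat` is 0.10. Confidence intervals like those quoted above would need an additional resampling step that is not shown here.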