Coalition Formation in LLM Agent Networks: Stability Analysis and Convergence Guarantees

Dongxin Guo; Jikun Wu; Siu-Ming Yiu

arxiv: 2604.14386 · v1 · submitted 2026-04-15 · 💻 cs.GT · cs.AI

Pith reviewed 2026-05-10 11:34 UTC · model grok-4.3

keywords: coalition formation · hedonic games · LLM agents · Nash stability · multi-agent systems · bounded rationality

The pith

Modeling LLM agents' coalition preferences as ε-rational choices in a hedonic game yields formal stability guarantees, and a Coalition-of-Thought prompting protocol raises the observed rate of Nash-stable coalitions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper builds a formal model that treats coalition formation among LLM agents as a hedonic game. Agents choose groups according to preferences that satisfy a bounded-rationality condition called ε-rationality, which lets the authors transfer classic stability results to LLM outputs. They introduce the LLM Coalition Formation Game, prove conditions under which Nash-stable partitions exist, and establish related complexity bounds. Experiments across multiple models show that a prompting method called Coalition-of-Thought raises the observed rate of stable coalitions above what standard or chain-of-thought prompting produces. The framework therefore supplies both a predictive tool and a practical way to improve coordination in networks of language-model agents.

Core claim

We introduce the LLM Coalition Formation Game (LCFG) in which LLM agents hold ε-rational preferences over coalitions. Under this representation we derive sufficient conditions for the existence of Nash-stable partitions and consistency-driven stability bounds. These theoretical predictions align with empirical outcomes: across 2,400 episodes the CoalT protocol produces Nash-stable coalitions in 73.2 percent of cases, compared with 58.4 percent under chain-of-thought and 41.8 percent under standard prompting.

What carries the argument

The hedonic-game representation of LLM coalition preferences under ε-rationality, which carries stability guarantees from cooperative game theory to observed agent behavior.
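To make the stability notion concrete, here is a minimal sketch of a Nash-stability check with an ε slack. The names `is_nash_stable`, `additive_value`, and `WEIGHTS` are illustrative assumptions, not the paper's implementation, and treating ε as a tolerance on unilateral gains is only one possible reading of ε-rationality.

```python
from typing import Callable, FrozenSet, List

Coalition = FrozenSet[int]


def is_nash_stable(
    partition: List[Coalition],
    value: Callable[[int, Coalition], float],
    eps: float = 0.0,
) -> bool:
    """Return True if no agent gains more than eps by a unilateral move."""
    for home in partition:
        for agent in home:
            current = value(agent, home)
            # Candidate moves: join another existing coalition, or go alone.
            targets = [c | {agent} for c in partition if c != home]
            targets.append(frozenset({agent}))
            for target in targets:
                if value(agent, target) > current + eps:
                    return False
    return True


# Hypothetical toy game: symmetric pairwise weights, additive coalition value.
WEIGHTS = {(0, 1): 1.0, (0, 2): -1.0, (1, 2): -1.0}


def additive_value(agent: int, coalition: Coalition) -> float:
    """Sum of the agent's pairwise weights with the other coalition members."""
    return sum(
        WEIGHTS[(min(agent, j), max(agent, j))] for j in coalition if j != agent
    )


paired = [frozenset({0, 1}), frozenset({2})]
singletons = [frozenset({0}), frozenset({1}), frozenset({2})]
print(is_nash_stable(paired, additive_value))               # True
print(is_nash_stable(singletons, additive_value))           # False
print(is_nash_stable(singletons, additive_value, eps=1.0))  # True
```

In the toy game, agents 0 and 1 benefit from pairing up, so the all-singletons partition is unstable at ε = 0 but counts as stable once a gain of 1.0 is treated as negligible.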

If this is right

Sufficient conditions for Nash stability can be used to select prompting strategies that raise the chance of stable group formation.
The complexity results imply that exact search for stable partitions becomes impractical as the number of agents grows.
Consistency-driven bounds supply a quantitative yardstick for comparing different prompting regimes in multi-agent LLM tasks.
The same modeling step supports the design of coordination protocols that keep agent groups from dissolving.
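The scaling concern in the second point can be illustrated without relying on the paper's specific complexity result: the number of ways to partition n agents is the Bell number B(n), so exhaustive search over partitions grows very quickly. The sketch below computes these counts with the Bell triangle; `bell_numbers` is an illustrative helper name.

```python
from typing import List


def bell_numbers(n_max: int) -> List[int]:
    """Return the Bell numbers B(0)..B(n_max) using the Bell triangle."""
    row = [1]
    bells = [1]  # B(0) = 1
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


for n, count in enumerate(bell_numbers(10)):
    print(f"{n} agents: {count} partitions")
```

Already at 10 agents there are 115,975 candidate partitions, which is why a brute-force stability search is only workable for small groups.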

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The ε-rationality lens could be applied to other families of AI agents that make group-formation decisions.
Stability checks derived from the model could be inserted into runtime monitors to detect and correct unstable coalitions before deployment.
Scaling experiments with larger agent populations would test whether the current bounds remain predictive when preference noise increases.

Load-bearing premise

LLM agents' expressed preferences over possible coalitions can be faithfully captured as ε-rational preferences inside a hedonic game.
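For readers who want the premise in symbols, the following is a sketch in standard hedonic-game notation. The game and Nash-stability definitions are the textbook ones. The ε-rationality condition is an assumed form, inferred from the description of ε as a bound on preference error relative to coalition values $v_i$; the paper's own definition may differ.

```latex
% Hedonic game: agents N, each with a preference over coalitions containing it.
G = \bigl(N, (\succsim_i)_{i \in N}\bigr), \qquad
\succsim_i \text{ defined on } \mathcal{N}_i = \{ S \subseteq N : i \in S \}.

% Nash stability of a partition \pi (standard definition):
\pi(i) \succsim_i S \cup \{i\}
\quad \text{for all } i \in N \text{ and all } S \in \pi \cup \{\emptyset\}.

% Assumed form of \epsilon-rationality (illustrative, not taken from the paper):
v_i(S) - v_i(T) > \epsilon \;\Longrightarrow\; S \succ_i T
\quad \text{for all } S, T \in \mathcal{N}_i.
```

Under this reading, an agent may rank two coalitions "wrongly" only when their values differ by at most ε, which is the property the premise needs the elicited preferences to satisfy.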

What would settle it

A controlled run in which the measured stability rate under the CoalT protocol falls well below the consistency-driven bound predicted for the observed ε value.

Figures

Figures reproduced from arXiv: 2604.14386 by Dongxin Guo, Jikun Wu, Siu-Ming Yiu.

**Figure 1.** LCFG Framework Overview. Left: Input context with heterogeneous LLM agents showing capability profiles across Math, Facts, and Logic dimensions. Center: Coalition-of-Thought (CoalT) reasoning module, a 5-step pipeline that guides agents through structured coalition evaluation. The latent space visualization shows agents transitioning from scattered initial positions to clustered Nash-stable coalitions thr…

**Figure 2.** Nash stability rate vs. preference consistency across ex…

Original abstract

Large Language Model (LLM) agents are increasingly deployed in multi-agent systems requiring strategic coordination. While recent work has analyzed LLM behavior in two-player games, coalition formation, where $n$ agents dynamically form cooperative groups, remains theoretically uncharacterized. We present the first framework grounding coalition formation in LLM agent networks in hedonic game theory with formal stability guarantees. We introduce the LLM Coalition Formation Game (LCFG), establish sufficient conditions for Nash-stable partitions, and prove complexity results. Our analysis reveals that LLM agents exhibit bounded rationality characterized by $\epsilon$-rational preferences; we provide both deterministic existence guarantees and consistency-driven stability bounds whose predictions are consistent with empirical outcomes. Experiments with GPT-4, Claude-3, and Llama-3 across 2,400 episodes validate our framework: LLM coalitions achieve Nash stability in 73.2% of cases under our Coalition-of-Thought (CoalT) protocol, compared to 58.4% under chain-of-thought and 41.8% under standard prompting ($p < 0.001$). Our framework provides theoretical foundations for designing stable multi-agent LLM systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper brings hedonic games to LLM coalitions with a new prompting protocol and some stability numbers, but the theory does not clearly map onto the empirical measurements.

The paper applies hedonic game theory to coalition formation among LLM agents and introduces a prompting protocol called CoalT that reportedly increases the rate of Nash-stable coalitions. The empirical results are the most concrete part, but the connection between the formal analysis and the observed behavior needs more support.

The authors define the LLM Coalition Formation Game and provide conditions for stable partitions along with complexity results. The CoalT method is positioned as an improvement over chain-of-thought and standard prompting: experiments across GPT-4, Claude-3, and Llama-3 in 2,400 episodes show 73.2% stability versus 58.4% and 41.8%. The difference is reported with p < 0.001, which suggests the protocol has some effect in their setup.

The work does a reasonable job of bringing game-theoretic tools to this domain and running a multi-model comparison. The existence guarantees and the ε-rationality characterization are attempts to make the analysis tractable for boundedly rational agents like LLMs.

The main weakness is that the paper does not detail how agent preferences over coalitions were elicited or checked to ensure they fit the ε-rational model used in the proofs. This concern holds up under scrutiny: without that mapping, the stability percentages cannot be read as confirmation that the theoretical guarantees apply to the actual LLM outputs. The description of the bounds as 'consistent with empirical outcomes' also raises the possibility that they were adjusted to match the runs rather than predicted ahead of time.

The paper is aimed at researchers working on multi-agent LLM systems who are looking for formal ways to think about coordination. Someone in that area would find the framework and the prompting results useful as a starting point, even with the gaps. It deserves peer review: the combination is new enough and the experiments substantial enough that referees could help tighten the methods and clarify the theory-experiment link.

Referee Report

3 major / 2 minor

Summary. The paper introduces the LLM Coalition Formation Game (LCFG) as a hedonic game model for LLM agent networks, defines ε-rational preferences to capture bounded rationality in LLM outputs, derives sufficient conditions for Nash-stable partitions along with complexity results, and reports empirical validation across GPT-4, Claude-3, and Llama-3 showing 73.2% Nash stability under the proposed Coalition-of-Thought (CoalT) protocol versus 58.4% and 41.8% for baselines (p < 0.001) over 2,400 episodes.

Significance. If the mapping from LLM-elicited preferences to ε-rational hedonic preferences can be rigorously established, the framework supplies the first formal stability and convergence guarantees for coalition formation in LLM multi-agent systems, along with a practical protocol that demonstrably improves stability rates. The combination of theoretical existence results and large-scale empirical testing across models is a clear strength, though the transfer of bounds depends on unverified elicitation details.

major comments (3)

[Abstract and Experiments] The central claim that LLM coalitions achieve Nash stability in 73.2% of cases under CoalT (and that this is consistent with the derived bounds) rests on treating elicited agent preferences as ε-rational. No query template, coalition comparison set, or post-elicitation consistency check verifying the ε-rationality condition is provided, so the reported percentages cannot be interpreted as evidence that the formal guarantees apply to the observed behavior.
[Theoretical Analysis] The abstract asserts existence guarantees for Nash-stable partitions and complexity results under ε-rational preferences, yet the manuscript supplies no derivation steps, proof sketches, or key lemmas showing how the sufficient conditions are obtained or how the complexity bounds follow; without these, the load-bearing theoretical claims cannot be assessed.
[Stability Bounds] The bounds are characterized as 'consistency-driven' and 'whose predictions are consistent with empirical outcomes.' This formulation risks circularity if the bounds were fitted to the same LLM runs used for validation rather than derived independently from the ε-rationality axioms; an explicit separation between a priori derivation and post-hoc consistency check is required.

minor comments (2)

[Model Definition] The definition of the LCFG and the precise functional form of ε-rational preferences should be stated with explicit equations rather than descriptive text to allow direct comparison with standard hedonic-game notation.
[Experiments] Report error bars, exact exclusion criteria, and the full preference-elicitation prompt templates so that the 2,400-episode results and p-values can be reproduced.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We agree that greater transparency on elicitation procedures and theoretical derivations is needed. We respond to each major comment below and indicate the corresponding revisions.

Referee: [Abstract and Experiments] The central claim that LLM coalitions achieve Nash stability in 73.2% of cases under CoalT (and that this is consistent with the derived bounds) rests on treating elicited agent preferences as ε-rational. No query template, coalition comparison set, or post-elicitation consistency check verifying the ε-rationality condition is provided, so the reported percentages cannot be interpreted as evidence that the formal guarantees apply to the observed behavior.

Authors: We agree that the query templates, coalition comparison sets, and post-elicitation consistency checks were omitted from the initial submission, which prevents direct verification that the observed behavior satisfies the ε-rationality condition. In the revised manuscript we will add the exact prompt templates used for preference elicitation, the full set of coalitions presented to each agent, and the quantitative consistency checks (including deviation thresholds) that confirm ε-rationality. These additions will allow readers to assess whether the reported stability rates constitute evidence for the formal guarantees.

revision: yes

Referee: [Theoretical Analysis] The abstract asserts existence guarantees for Nash-stable partitions and complexity results under ε-rational preferences, yet the manuscript supplies no derivation steps, proof sketches, or key lemmas showing how the sufficient conditions are obtained or how the complexity bounds follow; without these, the load-bearing theoretical claims cannot be assessed.

Authors: We acknowledge that the main text lacked explicit derivation steps and proof sketches. While complete proofs appear in the appendix, we will move the key lemmas establishing sufficient conditions for Nash-stable partitions under ε-rational preferences, together with the complexity analysis, into the main Theoretical Analysis section. A concise proof sketch will also be added to the main body so that the existence and complexity claims can be evaluated without reference to the appendix.

revision: yes

Referee: [Stability Bounds] The bounds are characterized as 'consistency-driven' and 'whose predictions are consistent with empirical outcomes.' This formulation risks circularity if the bounds were fitted to the same LLM runs used for validation rather than derived independently from the ε-rationality axioms; an explicit separation between a priori derivation and post-hoc consistency check is required.

Authors: The stability bounds are derived a priori from the ε-rationality axioms and the hedonic-game structure before any empirical data are collected; the phrase 'consistency-driven' denotes consistency with the bounded-rationality model rather than post-hoc fitting. We will revise the wording in the Stability Bounds section to state explicitly that the bounds are obtained independently of the experimental runs and that empirical consistency is checked only after derivation. This clarification will remove any suggestion of circularity.

revision: partial

Circularity Check

0 steps flagged

No significant circularity detected; derivation remains self-contained

The abstract presents a theoretical framework that applies hedonic game theory to define the LLM Coalition Formation Game, derives sufficient conditions for Nash-stable partitions, and states complexity results. It characterizes LLM agents as exhibiting ε-rational preferences based on analysis and supplies deterministic existence guarantees plus consistency-driven stability bounds. Empirical results are reported separately as validation showing alignment with the theoretical predictions. No equations, parameter-fitting steps, or self-citation chains are exhibited in the provided text that would reduce the stability guarantees or predictions to the experimental inputs by construction. The derivation chain from game-theoretic assumptions to formal bounds is therefore independent of the reported LLM runs.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

The central claims rest on standard hedonic-game assumptions plus two paper-specific constructs: ε-rational preferences and the CoalT protocol. No independent evidence is given for either.

free parameters (1)

ε (epsilon): Bound on rationality deviation for LLM preferences; introduced to characterize observed behavior and enable stability proofs.

axioms (2)

[domain assumption] Agents have preferences over coalitions that can be represented in a hedonic game. Invoked to apply Nash-stability results to LLM agents.
[ad hoc to paper] LLM outputs can be treated as approximately rational with bounded error ε. Used to bridge theoretical guarantees to empirical LLM behavior.

invented entities (2)

[no independent evidence] LLM Coalition Formation Game (LCFG). Purpose: formal model for dynamic coalition formation among LLM agents. New game definition that grounds the stability analysis.
[no independent evidence] Coalition-of-Thought (CoalT) protocol. Purpose: prompting method claimed to increase Nash stability. New technique whose performance is measured in experiments.

pith-pipeline@v0.9.0 · 5500 in / 1637 out tokens · 79274 ms · 2026-05-10T11:34:23.968685+00:00 · methodology

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1] Generate all coalition pairs $(S, T)$ containing agent $a_i$ with $|S|, |T| \le 4$

[2] Compute ground-truth per-capita values $v_i(S)$, $v_i(T)$ from known capability profiles

[3] Query agent preferences using standard prompting and record responses

[4] For each value gap $\Delta = |v_i(S) - v_i(T)|$, compute the rate of irrational choices (agent prefers the objectively worse coalition)

[5] Answer the following question. Provide only the final answer

Estimate $\hat{\epsilon}$ as the threshold where irrational choice frequency drops below 50%. This procedure is not circular: ground-truth values come from external benchmark evaluations, and $\epsilon$ is estimated from the mismatch between computed values and agent preferences. Results: $\hat{\epsilon} = 0.15$ [95% CI: 0.12–0.18] for GPT-4, $\hat{\epsilon} = 0.14$ [0.11–0.17] for Claude-3, $\hat{\epsilon} = 0.22$ [0.18–0.26] for...
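The extracted steps above (record each choice together with its value gap, compute the irrational-choice rate per gap, and take the threshold where that rate drops below 50%) can be sketched as follows. This is a minimal illustration under stated assumptions: the fixed-width binning, the function name `estimate_epsilon`, and the sample observations are hypothetical and are not taken from the paper, whose exact estimator and confidence-interval method are not shown in this digest.

```python
from typing import Dict, List, Optional, Tuple


def estimate_epsilon(
    observations: List[Tuple[float, bool]],
    bin_width: float = 0.05,
    threshold: float = 0.5,
) -> Optional[float]:
    """Lower edge of the first gap bin whose irrational-choice rate is below threshold.

    Each observation is (value_gap, chose_worse_coalition).
    Returns None if no bin falls below the threshold.
    """
    counts: Dict[int, Tuple[int, int]] = {}
    for gap, irrational in observations:
        b = int(gap // bin_width)  # assumed fixed-width binning of value gaps
        total, bad = counts.get(b, (0, 0))
        counts[b] = (total + 1, bad + int(irrational))
    for b in sorted(counts):
        total, bad = counts[b]
        if bad / total < threshold:
            return b * bin_width
    return None


# Hypothetical observations: small gaps are mostly misjudged, larger gaps mostly not.
observations = (
    [(0.05, True)] * 3 + [(0.05, False)]
    + [(0.15, True)] + [(0.15, False)] * 3
)
print(estimate_epsilon(observations, bin_width=0.1))  # 0.1
```

The returned value is only as fine-grained as the bin width, so a real analysis would need to justify the binning and add an uncertainty estimate such as the confidence intervals quoted above.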
