Contagion Networks: Evaluator Preference Propagation in Multi-Agent LLM Systems
2 Pith papers cite this work. Polarity classification is still indexing.
abstract
When large language models serve as evaluators in multi-agent systems, their strategy preferences -- whether induced by explicit prompts or by shared architectural priors -- propagate through the agent network. We introduce Contagion Networks, a formal framework for measuring how evaluator preferences spread across interacting LLM agents. In a controlled 3-agent experiment using DeepSeek-chat with three distinct evaluator preference profiles (structured, balanced, evidence-based), we measure the Cross-Agent Contagion Matrix Gamma_3 and find that preferences consistently propagate between agents (gamma in [0.157, 0.352]). A neutral-prompt control experiment reveals a counter-intuitive result: shared architectural priors dominate explicit preference prompts as the driver of contagion (rho_neutral = 1.498 vs. rho_mixed = 1.299; prompt contribution: -63.5%). We identify three propagation regimes governed by the spectral radius rho(Gamma_N) and demonstrate that the same agents suppress preference contagion in chain topology (beta_3 = 0.0126 +/- 0.0038, 95% CI [0.0089, 0.0163], n=4 seeds) but cascade in fully-connected topology (Delta H_avg = -0.020) -- a topology-dependent regime transition validated for both homogeneous and cross-model agent pools (rho^cross = 1.296 +/- 0.016, n=4). We show that increasing evaluator committee size from k=1 to k=3 reduces effective contagion by 68.9% +/- 14.1% (n=4 seeds), providing an actionable mitigation strategy. We release the open-source Contagion Network experimental framework.
years
2026: 2
verdicts
UNVERDICTED: 2
representative citing papers
Probability calibration applied to LLM evaluator judgments reduces preference coupling gamma by 20-49% and Jensen-Shannon divergence by 45-67% in a within-subjects experiment with N=5.
citing papers explorer
- Delayed Verification Destabilizes Multi-Agent LLM Belief: Instability Thresholds and Optimal Corrector Placement
Models delayed verification in multi-agent LLMs as graph consensus, derives stability thresholds (inverse golden ratio for delay two) via grounded Laplacian, and gives a supermodular greedy rule for corrector placement; experiments on five models confirm dose-delay oscillations.
- Calibrating the Evaluator: Does Probability Calibration Mitigate Preference Coupling in LLM Agent Feedback Loops?
Probability calibration applied to LLM evaluator judgments reduces preference coupling gamma by 20-49% and Jensen-Shannon divergence by 45-67% in a within-subjects experiment with N=5.