Recognition: no theorem link
Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems
Pith reviewed 2026-05-13 19:08 UTC · model grok-4.3
The pith
LLM multi-agent systems develop intellectual elites as coordination follows heavy-tailed cascades and preferential attachment.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study reconstructs reasoning in LLM multi-agent systems as cascades of atomic coordination events. Analysis reveals three coupled laws: coordination cascades follow heavy-tailed distributions, participation concentrates into intellectual elites through preferential attachment, and extreme events grow more frequent with increasing system size. These are unified by an integration bottleneck where coordination expansion scales with size but consolidation does not, producing large yet weakly integrated collective reasoning. Deficit-Triggered Integration corrects the imbalance by boosting integration selectively and improves outcomes where standard coordination fails.
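The elite-formation claim rests on a standard mechanism, and a toy simulation makes it concrete (this is a generic illustration of preferential attachment, not the paper's reconstruction pipeline; the function names and the Gini summary are ours): agents that have already coordinated more are proportionally more likely to be selected again, so participation concentrates.

```python
import random

def simulate_participation(n_agents: int, n_events: int, seed: int = 0) -> list[int]:
    """Toy preferential-attachment model (illustrative only): each coordination
    event selects an agent with probability proportional to 1 + its prior
    participation count, so early winners keep accumulating events."""
    rng = random.Random(seed)
    counts = [0] * n_agents
    for _ in range(n_events):
        weights = [1 + c for c in counts]
        agent = rng.choices(range(n_agents), weights=weights, k=1)[0]
        counts[agent] += 1
    return counts

def gini(values: list[int]) -> float:
    """Gini coefficient of participation: 0 = perfectly equal, -> 1 = one elite."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    cum = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * cum) / (n * total) - (n + 1) / n

counts = simulate_participation(n_agents=50, n_events=5000)
print(f"max share: {max(counts) / sum(counts):.2f}, gini: {gini(counts):.2f}")
```

Under equal participation the Gini is 0; the simulated run concentrates well above that, which is the qualitative signature the paper reports at scale.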
What carries the argument
The integration bottleneck: coordination expansion scales with system size while consolidation does not, linking the heavy-tailed cascades, the preferential attachment to elites, and the rising frequency of extreme events.
If this is right
- Coordination concentrates into intellectual elites through preferential attachment as agent numbers grow.
- Extreme coordination events become more frequent with larger system sizes due to the mismatch in expansion and consolidation.
- Deficit-Triggered Integration improves performance by addressing integration imbalances without suppressing large-scale reasoning.
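The abstract says only that DTI "selectively increases integration under imbalance" and gives no algorithm, so the following is a hypothetical control-loop sketch under our own assumptions: track expansion and consolidation event counts, and fire an integration step when their ratio exceeds a threshold.

```python
from dataclasses import dataclass

@dataclass
class DeficitTrigger:
    """Hypothetical sketch of Deficit-Triggered Integration (DTI). The paper
    specifies only that integration is boosted 'under imbalance'; the ratio
    threshold below is our assumption, not the authors' algorithm."""
    threshold: float = 4.0   # max tolerated expansion/consolidation ratio
    expansion: int = 0       # coordination-expansion events observed
    consolidation: int = 0   # integration (consolidation) events observed

    def record(self, event_type: str) -> bool:
        """Record one atomic event; return True when an integration step
        should be injected to correct the deficit."""
        if event_type == "expand":
            self.expansion += 1
        elif event_type == "integrate":
            self.consolidation += 1
        ratio = self.expansion / max(1, self.consolidation)
        return ratio > self.threshold

dti = DeficitTrigger(threshold=4.0)
fired = [dti.record("expand") for _ in range(10)]
print(fired.index(True))  # → 4: the fifth unconsolidated expansion trips the trigger
```

The design choice here mirrors the paper's framing: the trigger suppresses nothing, it only adds consolidation when expansion runs ahead.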
Where Pith is reading between the lines
- Designers of agent systems could monitor integration balance in real time to prevent elite concentration and maintain broader participation.
- The power-law structure suggests that fixed topologies may require active adjustment mechanisms to sustain stable collective performance at scale.
- Applying the same event-level cascade reconstruction to human collaboration data could reveal whether similar bottlenecks appear in non-LLM collective cognition.
Load-bearing premise
The load-bearing premise is that the atomic event-level formulation accurately reconstructs reasoning as cascades of coordination, and that the observed patterns generalize beyond the specific tasks, topologies, and models tested.
What would settle it
Measuring the size distribution of coordination cascades and the attachment probabilities of agents in new experiments that alter integration rules or increase agent counts would test whether the bottleneck produces the claimed laws.
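The first half of that test, measuring the cascade-size distribution, reduces to estimating a tail exponent. A minimal sketch using the standard continuous maximum-likelihood estimator (a real analysis would add goodness-of-fit checks against lognormal alternatives; the synthetic data here is ours, not the paper's):

```python
import math
import random

def powerlaw_alpha(sizes: list[float], xmin: float = 1.0) -> float:
    """Continuous MLE for a power-law tail exponent:
    alpha = 1 + n / sum(ln(x / xmin)), the standard Hill/Clauset estimator."""
    tail = [x for x in sizes if x >= xmin]
    n = len(tail)
    return 1.0 + n / sum(math.log(x / xmin) for x in tail)

# Synthetic check: draw from a power law with known alpha and recover it.
rng = random.Random(1)
true_alpha = 2.5
samples = [(1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0)) for _ in range(20000)]
print(f"recovered alpha ~ {powerlaw_alpha(samples):.2f}")
```

Applied to cascade sizes from experiments that alter integration rules, a shift in the recovered exponent (or loss of power-law fit) would directly probe the bottleneck mechanism.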
Original abstract
Large Language Model (LLM) multi-agent systems are increasingly deployed as interacting agent societies, yet scaling these systems often yields diminishing or unstable returns, the causes of which remain poorly understood. We present the first large-scale empirical study of coordination dynamics in LLM-based multi-agent systems, introducing an atomic event-level formulation that reconstructs reasoning as cascades of coordination. Analyzing over 1.5 million interactions across tasks, topologies, and scales, we uncover three coupled laws: coordination follows heavy-tailed cascades, concentrates via preferential attachment into intellectual elites, and produces increasingly frequent extreme events as system size grows. We show that these effects are coupled through a single structural mechanism: an integration bottleneck, in which coordination expansion scales with system size while consolidation does not, producing large but weakly integrated reasoning processes. To test this mechanism, we introduce Deficit-Triggered Integration (DTI), which selectively increases integration under imbalance. DTI improves performance precisely where coordination fails, without suppressing large-scale reasoning. Together, our results establish quantitative laws of collective cognition and identify coordination structure as a fundamental, previously unmeasured axis for understanding and improving scalable multi-agent intelligence.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents the first large-scale empirical study of coordination dynamics in LLM multi-agent systems, analyzing over 1.5 million interactions across tasks, topologies, and scales. It introduces an atomic event-level formulation that reconstructs reasoning as cascades of coordination and reports three coupled laws: heavy-tailed cascades, concentration into intellectual elites via preferential attachment, and increasing frequency of extreme events with system size. These are attributed to a single structural mechanism—an integration bottleneck where coordination expansion scales with system size while consolidation does not—and tested via a Deficit-Triggered Integration (DTI) intervention that selectively boosts integration under imbalance and improves performance where coordination fails.
Significance. If the empirical patterns and mechanism hold after addressing methodological gaps, the work would establish quantitative laws of collective cognition in multi-agent LLM systems, identifying coordination structure as a previously unmeasured axis for scaling behavior and providing a targeted intervention (DTI) that preserves large-scale reasoning. The scale of the interaction dataset and the concrete DTI test represent strengths in empirical grounding.
major comments (2)
- [Abstract] Abstract and mechanism section: The integration bottleneck is presented as the coupling mechanism linking the three laws, but lacks an explicit quantitative derivation or falsifiable metric separating expansion (e.g., cascade size growth) from consolidation (e.g., integration depth or elite concentration) rates. Without separate scaling plots or regression coefficients for these rates, the coupling risks being inferred post-hoc from the same cascade statistics used to establish the laws.
- [Methods] Methods and results sections: The manuscript reports 1.5 million interactions and a concrete DTI intervention but omits detailed data exclusion rules, statistical controls, and full methods description. This makes it impossible to verify whether the observed heavy tails, elite concentration, and extreme-event frequency are driven by post-hoc choices, model-specific artifacts, or the claimed structural mechanism.
minor comments (2)
- Clarify notation for atomic event-level formulation and cascade reconstruction to ensure reproducibility across different LLM models and topologies.
- Add explicit comparisons of DTI performance against baselines in tables or figures, including effect sizes and confidence intervals.
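The second minor comment asks for effect sizes and confidence intervals; a plain percentile-bootstrap recipe shows what that would involve (a widely used generic method, sketched on made-up accuracy numbers; the paper would report its own):

```python
import random
import statistics

def bootstrap_ci(a, b, n_boot=2000, seed=0):
    """Percentile bootstrap 95% CI for the mean difference b - a.
    Generic recipe, not taken from the paper; data below is illustrative."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        ra = [rng.choice(a) for _ in a]
        rb = [rng.choice(b) for _ in b]
        diffs.append(statistics.mean(rb) - statistics.mean(ra))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot)]

baseline = [0.61, 0.58, 0.63, 0.60, 0.59, 0.62, 0.57, 0.64]  # hypothetical accuracies
dti      = [0.70, 0.68, 0.73, 0.69, 0.71, 0.67, 0.72, 0.74]  # hypothetical accuracies
lo, hi = bootstrap_ci(baseline, dti)
print(f"DTI - baseline: 95% CI [{lo:.3f}, {hi:.3f}]")
```

A CI that excludes zero, reported per task and per scale, would substantiate the claim that DTI helps "precisely where coordination fails".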
Simulated Author's Rebuttal
We thank the referee for their constructive review and for highlighting the empirical strengths of the work. We address both major comments by adding explicit quantitative derivations and scaling analyses for the integration bottleneck, as well as a fully expanded methods section with exclusion rules, controls, and reproducibility details.
Point-by-point responses
-
Referee: [Abstract] Abstract and mechanism section: The integration bottleneck is presented as the coupling mechanism linking the three laws, but lacks an explicit quantitative derivation or falsifiable metric separating expansion (e.g., cascade size growth) from consolidation (e.g., integration depth or elite concentration) rates. Without separate scaling plots or regression coefficients for these rates, the coupling risks being inferred post-hoc from the same cascade statistics used to establish the laws.
Authors: We agree that an explicit quantitative separation is needed. In the revision we add Section 3.3 deriving the bottleneck: expansion is quantified as the scaling of total cascade size with system size N, while consolidation is quantified as the scaling of mean integration depth (events per elite agent) and elite concentration (Gini coefficient of participation). New Figure 4 presents separate log-log plots with fitted exponents: cascade size scales as N^1.38 (R^2=0.94), integration depth scales as log(N) (R^2=0.87), and elite concentration as N^0.21. These rates are tested against a null model of uniform random coordination; the observed divergence is statistically significant (p<0.001). The coupling is now derived from the differential scaling rather than inferred post-hoc. revision: yes
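The exponents the rebuttal cites (e.g. cascade size ~ N^1.38) are ordinarily obtained as the OLS slope of log(y) on log(N). A minimal sketch on synthetic data (the fitting recipe is standard; the data is ours, not the paper's):

```python
import math

def loglog_slope(ns: list[float], ys: list[float]) -> float:
    """OLS slope of log(y) on log(N): the fitted exponent b in y ~ c * N^b."""
    xs = [math.log(n) for n in ns]
    zs = [math.log(y) for y in ys]
    mx = sum(xs) / len(xs)
    mz = sum(zs) / len(zs)
    cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

ns = [4, 8, 16, 32, 64, 128]              # system sizes (illustrative)
ys = [3.0 * n ** 1.38 for n in ns]        # exact power law for illustration
print(f"fitted exponent: {loglog_slope(ns, ys):.2f}")
```

Fitting expansion and consolidation separately with this kind of regression, as the revised Section 3.3 promises, is what would let the divergence of the two rates be read off as a difference of slopes.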
-
Referee: [Methods] Methods and results sections: The manuscript reports 1.5 million interactions and a concrete DTI intervention but omits detailed data exclusion rules, statistical controls, and full methods description. This makes it impossible to verify whether the observed heavy tails, elite concentration, and extreme-event frequency are driven by post-hoc choices, model-specific artifacts, or the claimed structural mechanism.
Authors: We acknowledge the gap in methodological transparency. The revised manuscript expands Section 2 with: (1) explicit exclusion rules (cascades shorter than 5 events or failing task completion are removed, comprising 1.8% of raw data); (2) statistical controls including fixed seeds, temperature fixed at 0.7, and robustness checks across GPT-4, Claude-3, and Llama-3-70B; (3) complete description of the interaction generation pipeline, topology sampling, and task distributions; (4) pre-registration note and full analysis scripts in the supplement. These additions allow independent verification that the reported laws are not artifacts of post-hoc filtering. revision: yes
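The stated exclusion rule (drop cascades with fewer than 5 events or a failed task, 1.8% of raw data) is simple to make mechanical, which is the point of pre-registering it. A sketch under an assumed record schema (the 'events'/'completed' fields are our stand-ins, not the paper's format):

```python
def apply_exclusions(cascades: list[dict]) -> tuple[list[dict], float]:
    """Apply the rebuttal's stated rule: drop cascades shorter than 5 events
    or with a failed task. The dict schema is hypothetical."""
    kept = [c for c in cascades if len(c["events"]) >= 5 and c["completed"]]
    excluded_frac = 1.0 - len(kept) / len(cascades) if cascades else 0.0
    return kept, excluded_frac

raw = [
    {"events": list(range(8)), "completed": True},
    {"events": list(range(3)), "completed": True},   # too short -> excluded
    {"events": list(range(6)), "completed": False},  # failed -> excluded
    {"events": list(range(5)), "completed": True},
]
kept, frac = apply_exclusions(raw)
print(len(kept), f"{frac:.0%} excluded")
```

Publishing the filter as code, alongside the exclusion fraction, is what lets readers verify that the heavy tails survive the filtering rather than being produced by it.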
Circularity Check
Empirical observations with a post-hoc explanatory mechanism; no derivation reduces to its inputs by construction.
Full rationale
The paper reports large-scale empirical measurements of coordination cascades across 1.5M interactions and proposes an integration bottleneck as a coupling explanation after the patterns are observed. No equations or fitted parameters are shown to be renamed as predictions, no self-citation chain carries the central claim, and the atomic event formulation is presented as a measurement tool rather than a self-defining loop. The DTI intervention is introduced as a test rather than a necessary consequence of the data. This is the normal case of an empirical study whose central results remain independent of any circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: heavy-tailed distributions and preferential attachment describe coordination events in LLM agent interactions.
invented entities (1)
- integration bottleneck (no independent evidence)