
arxiv: 2606.05985 · v1 · pith:HX4E6G5Rnew · submitted 2026-06-04 · 💻 cs.CL · cs.CY

Beyond Alignment: Value Diversity as a Collective Property in Multicultural Agent Systems

Pith reviewed 2026-06-28 01:54 UTC · model grok-4.3

classification 💻 cs.CL cs.CY
keywords value diversity · multicultural agents · World Values Survey · multi-agent systems · LLM alignment · cultural plurality · homogenization · collective decision-making

The pith

Multicultural LLM agent systems exhibit far lower value diversity than human societies, and this diversity is largely independent of per-agent alignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing evaluations of multicultural multi-agent systems focus on alignment, or how closely each agent matches a target culture. The paper instead measures value diversity at the system level as the dissimilarity among agents' answers to the same World Values Survey questions when each is conditioned on a different culture. This diversity shows little correlation with alignment scores, revealing that the two are separate properties. Across 19 cultures and many model configurations, current systems fall well below the diversity observed in human populations, with mixed model backbones closing part but not all of the gap. Allowing agents to interact drives them toward consensus and reduces diversity further, which in turn narrows the range of outcomes in collective tasks such as participatory budgeting.

Core claim

Value diversity, quantified as the dissimilarity between culturally conditioned agents' responses on the World Values Survey, is largely uncorrelated with alignment and substantially lower in current multicultural agent systems than in human societies. Mixed-backbone systems narrow but do not close this gap, which persists across culture compositions and agent scales. Social interaction among agents erodes diversity by driving consensus, and a participatory budgeting case study shows that the resulting homogenization narrows the breadth of collective decisions.

What carries the argument

Value diversity defined as dissimilarity between agents' responses on the World Values Survey.
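
A minimal sketch of how such a system-level score could be computed, assuming diversity is the average pairwise dissimilarity between agents' answer vectors. The normalized absolute difference used here is an illustrative assumption, since this page does not specify the paper's dissimilarity function; the function names, culture labels, and answers are hypothetical.

```python
from itertools import combinations


def pairwise_dissimilarity(a, b, scale_max):
    """Mean absolute answer difference, with each question normalized to [0, 1].

    Assumes every question is answered on an integer scale 1..m with m >= 2.
    """
    if not (len(a) == len(b) == len(scale_max)):
        raise ValueError("answer vectors must cover the same questions")
    return sum(abs(x - y) / (m - 1) for x, y, m in zip(a, b, scale_max)) / len(a)


def value_diversity(responses, scale_max):
    """Average dissimilarity over all unordered pairs of agents."""
    pairs = list(combinations(responses.values(), 2))
    if not pairs:
        raise ValueError("need at least two agents")
    return sum(pairwise_dissimilarity(a, b, scale_max) for a, b in pairs) / len(pairs)


# Hypothetical data: three culturally conditioned agents, four survey items.
scale_max = [4, 4, 10, 10]  # assumed top of each answer scale
responses = {
    "culture_A": [1, 2, 3, 8],
    "culture_B": [1, 2, 3, 8],  # identical to culture_A: a homogenized pair
    "culture_C": [4, 2, 9, 8],
}
diversity = value_diversity(responses, scale_max)
print(f"system-level value diversity: {diversity:.4f}")
```

Two agents that answer identically contribute zero to the average, so a system can score low here even when each agent's answers are individually plausible for its target culture.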

If this is right

  • Alignment and value diversity capture complementary properties of multicultural systems.
  • Current systems fall substantially below human levels of value diversity.
  • Mixed-backbone configurations reduce the diversity gap but do not eliminate it.
  • Social interaction among agents drives consensus and lowers diversity.
  • Lower diversity narrows the range of collective decisions in applications such as budgeting.
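
The first point above rests on a correlation between per-system diversity and alignment scores (the Figure 1 caption reports a Pearson correlation). A minimal sketch of that check follows, with invented scores chosen only to show what an uncorrelated pattern looks like; none of these numbers come from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-system scores (one entry per multicultural agent system).
system_diversity = [0.10, 0.20, 0.30, 0.40]
system_alignment = [0.70, 0.60, 0.60, 0.70]

r = pearson(system_diversity, system_alignment)
print(f"Pearson r between diversity and alignment: {r:.3f}")
```

A value of r near zero means that ranking systems by alignment says little about how they rank on diversity, which is the sense in which the two metrics are described as complementary.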

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Explicit mechanisms to maintain response differences during interaction may be needed to preserve diversity in agent societies.
  • The survey-based measure could be tested on other value instruments or on downstream tasks that require cultural variation.
  • The persistent gap raises questions about whether scaling agent numbers or interaction rounds will widen or shrink cultural representation.
  • Homogenization effects might be mitigated by periodic re-conditioning of agents to distinct cultural prompts.

Load-bearing premise

That differences in how agents answer World Values Survey questions accurately reflect whether the system preserves distinct cultural perspectives rather than model artifacts or surface response patterns.

What would settle it

Re-running the evaluation on a different value survey or on observed behavior in a real collective decision task would show whether the reported diversity gap and its independence from alignment still appear.

Figures

Figures reproduced from arXiv: 2606.05985 by Jingshen Zhang, Jinyuan Li, Long P. Hoang, Shaoyang Xu, Wenxuan Zhang.

Figure 1
Figure 1: Landscape of system-level value diversity and value alignment. Left: 18 single-backbone systems on the (Diversity, Alignment) plane, colored by model family. Dashed lines indicate across-system means. Pearson correlation between the two metrics is reported. Right: Per-question (D_q, A_q) distributions for the two circled systems; bubble size encodes item density, and stars mark across-question means.
Figure 2
Figure 2: Diversity–alignment landscape of all 18^5 ≈ 1.89M backbone configurations (N = 5 cultures). Blue hexbin shows configuration density (darker = more). Three notable mixed-backbone configurations (green stars) are annotated with ΔD (Diversity) and ΔA (Alignment) relative to the nearest single-backbone reference (circled). The mixed-backbone Pareto frontier (solid green) strictly dominates the single-backbone one (…
Figure 4
Figure 4: Effect of one-round social exposure on system-level diversity and alignment. Each value is the change relative to the static system of Section 5. All systems lose diversity (ΔD < 0), while alignment generally rises but by a much smaller margin.
Adjacent paper text (Section 6, Towards Dynamic Interaction): Our experiments so far focus on static systems, where each agent answers the WVS independently. Real-world agent-driven platforms, how…
Figure 6
Figure 6: Collective decision-making outcomes in the Participatory Budgeting task. Both systems use claude-opus-4.7 as the backbone model. The results show that the high-diversity system distributes support across substantially broader societal dimensions.
Adjacent paper text (Section 7.1, Experimental Setup, Participatory Budgeting): To study this question, we consider a democratic decision-making sce…
Figure 7
Figure 7: Relationship between Structural Diversity and Alignment. Each point represents one single-backbone multicultural agent system. Dashed lines indicate across-system means.
[Adjacent plot: x-axis "Interaction round" (0 to 5), y-axis "System alignment" (67 to 71); legend: gpt-5.4, claude-opus-4.7, gemini-3.1-flash-lite-preview, grok-4.3, Qwen3.5-27B, llama-4-scout.]
Figure 8
Figure 8: System alignment over five rounds of interac…
Figure 9
Figure 9: Collective decision-making outcomes in the Participatory Budgeting task. Both systems use…
Figure 10
Figure 10: Collective decision-making outcomes in the Participatory Budgeting task. Both systems use…
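
The configuration count of about 1.89M in the Figure 2 caption is consistent with assigning one of 18 backbone models independently to each of 5 culture slots. That reading is an assumption based on the "18 backbone models" and "N = 5 cultures" figures on this page; the short sketch below only checks the arithmetic under it.

```python
from itertools import product


def count_configurations(num_backbones, num_cultures):
    """Number of ways to give each culture slot its own backbone choice."""
    return num_backbones ** num_cultures


NUM_BACKBONES = 18  # backbone models, per the abstract
NUM_CULTURES = 5    # culture slots per system, per the Figure 2 caption

total = count_configurations(NUM_BACKBONES, NUM_CULTURES)
single_backbone = NUM_BACKBONES          # every slot uses the same model
mixed_backbone = total - single_backbone  # at least two different models

# Sanity check of the formula by brute-force enumeration on a tiny case.
tiny = len(list(product(range(3), repeat=2)))

print(total, single_backbone, mixed_backbone, tiny)
```

Under this assumption almost every configuration is mixed-backbone, so the single-backbone systems in Figure 1 are a very small corner of the space plotted in Figure 2.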
Original abstract

Multicultural multi-agent systems are increasingly deployed in globally diverse settings, where different agents are grounded in different cultural backgrounds. Existing cultural evaluation focuses on value alignment: how closely a single agent matches a target culture. Yet alignment is a per-agent property and cannot reveal whether a system, taken as a whole, preserves the cultural plurality it is meant to represent. We propose value diversity as a system-level evaluation axis for multicultural agent systems, defined through the dissimilarity between culturally conditioned agents' responses on a shared value survey. Using the World Values Survey, we evaluate 19 cultures and 18 backbone models across a wide range of system configurations. We find that diversity is largely uncorrelated with alignment, indicating that the two capture complementary system properties, and that current multicultural agent systems fall substantially below human societies in value diversity. Mixed-backbone systems narrow this gap but do not close it, and the gap persists across culture compositions and agent scales. Social interaction further erodes diversity by driving agents toward consensus, and a participatory budgeting case study shows that this homogenization narrows the breadth of collective decision-making. Together, our results establish value diversity as a distinct evaluation axis for multicultural multi-agent systems and reveal a persistent homogenization tendency in current LLM-based societies. Our code and data are publicly available at https://github.com/iNLP-Lab/MultiAgent-Diversity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes 'value diversity' as a system-level property for multicultural multi-agent systems, defined via average pairwise dissimilarity of agents' responses to World Values Survey items when agents are culturally conditioned. Experiments across 19 cultures and 18 backbone models show this metric is largely uncorrelated with per-agent alignment, that LLM systems exhibit substantially lower diversity than human societies (with mixed-backbone setups narrowing but not closing the gap), that social interaction erodes diversity via consensus, and that this affects collective decisions in a participatory budgeting case study. Code and data are released publicly.

Significance. If the core metric is shown to be robust, the work establishes a distinct evaluation axis complementary to alignment, documenting a homogenization tendency in current LLM-based multicultural systems and motivating new design approaches. The public code and data release is a clear strength, supporting reproducibility and follow-on work.

major comments (2)
  1. [Abstract] Abstract: The central claim that value diversity (defined as WVS response dissimilarity) is 'largely uncorrelated with alignment' and that systems 'fall substantially below human societies' treats survey-answer vectors as a faithful proxy for preserved cultural plurality. This assumption is load-bearing for all quantitative gaps, mixed-backbone results, and interaction effects, yet the abstract provides no validation against human response distributions or controls for LLM artifacts such as training-data overlap or output regularities (consistent with the alternative explanation that mixed backbones increase measured diversity via stylistic variation rather than internalized values).
  2. [Abstract] Abstract (social interaction results): The finding that 'social interaction further erodes diversity by driving agents toward consensus' requires explicit controls to distinguish genuine value homogenization from prompt-induced convergence or shared context effects; without these, the erosion claim cannot be isolated from the experimental setup and remains load-bearing for the participatory budgeting case study implications.
minor comments (1)
  1. The abstract states 'our code and data are publicly available' but does not specify the exact repository contents (e.g., whether raw WVS responses, dissimilarity computation scripts, and human baseline data are included), which would aid immediate verification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We respond point by point to the major comments below, indicating where revisions will be made to address concerns about validation, controls, and potential confounds.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that value diversity (defined as WVS response dissimilarity) is 'largely uncorrelated with alignment' and that systems 'fall substantially below human societies' treats survey-answer vectors as a faithful proxy for preserved cultural plurality. This assumption is load-bearing for all quantitative gaps, mixed-backbone results, and interaction effects, yet the abstract provides no validation against human response distributions or controls for LLM artifacts such as training-data overlap or output regularities (consistent with the alternative explanation that mixed backbones increase measured diversity via stylistic variation rather than internalized values).

    Authors: The World Values Survey is a validated instrument widely used in cross-cultural research to measure value distributions. We computed the identical diversity metric directly on the human WVS response data for the same 19 cultures and items, establishing the human baseline against which LLM systems are compared. Experiments across 18 backbone models show the low diversity and lack of correlation with alignment are consistent, reducing the likelihood that results stem from model-specific artifacts. We will add explicit discussion of potential stylistic confounds in mixed-backbone conditions and further controls comparing answer distributions in the revision.
    Revision: partial

  2. Referee: [Abstract] Abstract (social interaction results): The finding that 'social interaction further erodes diversity by driving agents toward consensus' requires explicit controls to distinguish genuine value homogenization from prompt-induced convergence or shared context effects; without these, the erosion claim cannot be isolated from the experimental setup and remains load-bearing for the participatory budgeting case study implications.

    Authors: Our interaction experiments compare diversity before and after multi-round exchanges using fixed neutral prompts without explicit consensus instructions, with a no-interaction control condition. To further isolate effects from shared context or prompt artifacts, we will include additional ablations (e.g., private vs. broadcast messaging and varied prompt phrasings) in the revised manuscript. These will strengthen the homogenization claim and its link to the participatory budgeting results.
    Revision: yes
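
The before/after comparison with a no-interaction control can be illustrated with a toy simulation. This is not the paper's interaction protocol: it replaces LLM exchanges with a simple linear averaging rule, purely to show how ΔD would be measured and why pulling agents toward a shared mean lowers pairwise diversity. All names, the `pull` parameter, and the answer values are hypothetical.

```python
from itertools import combinations


def diversity(answers):
    """Mean absolute answer difference, averaged over questions and agent pairs."""
    pairs = list(combinations(answers, 2))
    return sum(
        sum(abs(x - y) for x, y in zip(a, b)) / len(a) for a, b in pairs
    ) / len(pairs)


def interaction_round(answers, pull):
    """Toy exposure step: each agent moves a fraction `pull` toward the group mean."""
    n = len(answers)
    means = [sum(column) / n for column in zip(*answers)]
    return [[x + pull * (m - x) for x, m in zip(agent, means)] for agent in answers]


# Hypothetical static answers: three agents, three survey items.
static = [[1.0, 4.0, 2.0], [3.0, 1.0, 2.0], [4.0, 4.0, 8.0]]

d_static = diversity(static)
d_after = diversity(interaction_round(static, pull=0.3))
d_control = diversity(interaction_round(static, pull=0.0))  # no-interaction control

delta_d = d_after - d_static
print(f"static D = {d_static:.3f}, after one round D = {d_after:.3f}, delta = {delta_d:.3f}")
```

In this toy rule every pairwise difference shrinks by the factor (1 - pull), so ΔD is negative for any pull between 0 and 1, while the control leaves diversity unchanged. Whether real agents converge for the same reason is exactly what the requested ablations are meant to test.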

Circularity Check

0 steps flagged

No circularity: value diversity defined from external WVS dissimilarity with independent empirical measurements

full rationale

The paper defines value diversity directly as dissimilarity between agents' responses on the World Values Survey (an external benchmark) and reports empirical comparisons to alignment, human societies, mixed-backbone systems, and interaction effects. No equations, predictions, or derivations reduce these findings to fitted parameters, self-citations, or self-referential quantities. The central claims rest on measurements across 19 cultures and 18 models rather than any construction that equates outputs to inputs by definition. This is a standard non-circular empirical evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on treating survey-response dissimilarity as a faithful proxy for cultural plurality preservation; no free parameters are named in the abstract, but the choice of survey and dissimilarity function are unvalidated modeling decisions.

axioms (1)
  • domain assumption: Responses to the World Values Survey by LLM agents reflect their cultural grounding in a manner comparable to human respondents.
    Invoked when using the survey to measure both alignment and diversity across 19 cultures.
invented entities (1)
  • value diversity (no independent evidence)
    Purpose: System-level evaluation axis for multicultural agent systems.
    Newly defined collective property measured via response dissimilarity; no independent evidence of validity provided in abstract.

pith-pipeline@v0.9.1-grok · 5782 in / 1379 out tokens · 22375 ms · 2026-06-28T01:54:28.622703+00:00 · methodology

