LLM Agents Make Collective Belief Dynamics Programmable: Challenges and Research Directions
Pith reviewed 2026-05-20 04:33 UTC · model grok-4.3
The pith
Coordinated LLM agents can deliberately steer population-level beliefs through programmable collective dynamics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Classical models of opinion dynamics assume human participants with bounded rationality and limited coordination. The rise of LLM-based agents introduces a qualitative shift: agents can now participate in online discussions at scale, maintain consistent persuasion strategies, and coordinate systematically. This paper argues that LLM agents make collective belief dynamics programmable, enabling deliberate steering of population-level beliefs. We term this emerging problem programmable collective belief control. Through controlled multi-agent simulations, we provide proof-of-concept evidence that coordinated AI agents can induce measurable belief shifts that stabilize within a few interaction rounds.
What carries the argument
Programmable collective belief control: the capacity for coordinated LLM agents to induce and stabilize targeted shifts in group beliefs via systematic interaction.
If this is right
- Coordinated agents can produce measurable belief shifts that stabilize after only a few interaction rounds.
- Four properties—indistinguishability, persistence, contextuality, and configurability—make reliable detection and defense difficult.
- New theoretical foundations are needed to model adversarial belief dynamics driven by programmable agents.
- Operational methods must be developed for system-level detection and intervention at scale.
- Dedicated simulation infrastructure is required to test interventions before real-world deployment.
Where Pith is reading between the lines
- If the programmability claim holds, platforms may need coordination-detection layers that treat message patterns rather than individual content as the primary signal.
- The same mechanisms could be repurposed for positive interventions such as countering misinformation clusters once detection methods mature.
- Scaling the observed stabilization effect to real populations would require testing whether human users and platform moderation alter the few-round convergence seen in simulations.
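The first point above, treating message patterns rather than individual content as the signal, can be sketched as a co-activity check. This is a minimal illustration and not a method from the paper: the event format `(account, thread_id, timestamp)`, the five-minute buckets, and the thresholds are all assumptions chosen for the example.

```python
from itertools import combinations

def activity_signature(events, bucket_seconds=300):
    """Map each account to the set of (thread_id, time_bucket) slots it posted in.

    `events` is a hypothetical list of (account, thread_id, timestamp) tuples.
    """
    signature = {}
    for account, thread_id, timestamp in events:
        slot = (thread_id, int(timestamp // bucket_seconds))
        signature.setdefault(account, set()).add(slot)
    return signature

def suspicious_pairs(events, threshold=0.6, min_events=3, bucket_seconds=300):
    """Flag account pairs whose posting slots overlap heavily (Jaccard similarity).

    Message content is never inspected; only who posted where and when.
    """
    signature = activity_signature(events, bucket_seconds)
    flagged = []
    for a, b in combinations(sorted(signature), 2):
        slots_a, slots_b = signature[a], signature[b]
        if min(len(slots_a), len(slots_b)) < min_events:
            continue  # too little activity to compare meaningfully
        jaccard = len(slots_a & slots_b) / len(slots_a | slots_b)
        if jaccard >= threshold:
            flagged.append((a, b, jaccard))
    return flagged
```

A real deployment would need far richer signals and a way to handle benign synchrony such as breaking-news spikes; the sketch only shows why pattern-level features do not depend on the text looking machine-written.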
Load-bearing premise
Behaviors seen in controlled simulations of LLM agents will generalize to real online discussions and the four structural properties will render detection and defense fundamentally difficult rather than merely technically challenging.
What would settle it
Running the same coordinated-agent protocol on an actual public discussion platform and measuring whether belief distributions shift and stabilize in the same pattern and timescale observed in the simulations.
Original abstract
Classical models of opinion dynamics assume human participants with bounded rationality and limited coordination. The rise of LLM-based agents introduces a qualitative shift: agents can now participate in online discussions at scale, maintain consistent persuasion strategies, and coordinate systematically. This paper argues that LLM agents make collective belief dynamics programmable, enabling deliberate steering of population-level beliefs. We term this emerging problem programmable collective belief control. Through controlled multi-agent simulations, we provide proof-of-concept evidence that coordinated AI agents can induce measurable belief shifts that stabilize within a few interaction rounds. We identify four structural properties (indistinguishability, persistence, contextuality, and configurability) that make detection and defense fundamentally difficult. Based on these findings, we outline a research agenda spanning theoretical foundations for adversarial belief dynamics, operational methods for system-level detection and intervention, and simulation infrastructure for scalable experimentation. Our goal is not to present a complete solution, but to articulate why this problem demands urgent attention and to provide a conceptual foundation for future work.
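To make the claimed effect concrete, here is a toy numeric stand-in for the kind of simulation the abstract describes. It is not the paper's setup: LLM agents are replaced by scalar opinions under DeGroot-style averaging, and the population sizes, the `openness` weight, the fully mixed interaction, and all names are assumptions made for illustration. Coordinated agents never update and all hold the same target opinion.

```python
import random

def simulate(n_regular=20, n_coordinated=4, target=1.0,
             openness=0.5, rounds=30, seed=0):
    """Return the mean opinion of regular agents after each round.

    Assumptions (illustrative only): opinions are scalars, every agent
    hears the whole population equally, and coordinated agents are
    stubborn at `target`.
    """
    rng = random.Random(seed)
    regular = [rng.uniform(0.0, 0.5) for _ in range(n_regular)]
    n_total = n_regular + n_coordinated
    history = [sum(regular) / n_regular]
    for _ in range(rounds):
        # Population-wide average, including the stubborn coordinated agents.
        population_mean = (sum(regular) + n_coordinated * target) / n_total
        # Each regular agent moves part of the way toward that average.
        regular = [(1 - openness) * x + openness * population_mean
                   for x in regular]
        history.append(sum(regular) / n_regular)
    return history

if __name__ == "__main__":
    steered = simulate()
    baseline = simulate(n_coordinated=0)
    print(f"steered:  {steered[0]:.3f} -> {steered[-1]:.3f}")
    print(f"baseline: {baseline[0]:.3f} -> {baseline[-1]:.3f}")
```

In this toy model the regular agents' mean drifts toward the target whenever coordinated agents are present and stays put without them. Whether the same holds for LLM agents or human populations is exactly what the paper's simulations and the referee's objections are about.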
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that LLM-based agents introduce a qualitative shift in collective belief dynamics by enabling systematic coordination and consistent persuasion strategies, making population-level beliefs programmable. It introduces the term 'programmable collective belief control' and presents proof-of-concept evidence from controlled multi-agent simulations showing that coordinated agents can induce measurable belief shifts that stabilize within a few interaction rounds. The work identifies four structural properties (indistinguishability, persistence, contextuality, and configurability) that purportedly make detection and defense fundamentally difficult, and outlines a research agenda covering theoretical foundations, operational detection methods, and scalable simulation infrastructure.
Significance. If the simulation results and generalization arguments hold, the paper identifies a timely emerging risk in AI-mediated social systems and provides a useful conceptual framing for future work on adversarial belief dynamics. The explicit positioning as a foundation for a research agenda rather than a complete solution is appropriate, and the focus on multi-agent coordination in belief formation aligns with the journal's scope in multi-agent systems.
major comments (2)
- [Simulations and Proof-of-Concept] The proof-of-concept simulations are described only at a high level without details on agent coordination prompts, interaction protocols, belief measurement methods, number of agents, or statistical validation of stabilization; this weakens the support for the central claim that coordinated LLM agents can deliberately steer and stabilize beliefs in a programmable manner (see the abstract and the section presenting the simulations).
- [Structural Properties] The claim that the four structural properties make detection and defense 'fundamentally difficult' rests on idealized isolated simulations that omit stochastic human responses, variable participation, external signals, and platform moderation; without additional experiments or analysis bridging to real-world conditions, this does not yet substantiate the difficulty assessment (see the section on structural properties).
minor comments (2)
- [Abstract] The abstract and introduction could more explicitly separate the conceptual argument from the simulation-based illustration to clarify the strength of evidence for each part of the claim.
- [Introduction] Additional references to classical opinion dynamics models (e.g., DeGroot, bounded confidence) would help situate the qualitative shift argument.
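For readers unfamiliar with the two model families named in the last comment, their standard update rules are given below. The notation ($x_i$ for agent $i$'s opinion, $w_{ij}$ for influence weights, $\varepsilon$ for the confidence bound, $\mu$ for the convergence rate) is the conventional one and is introduced here, not taken from the paper.

```latex
% DeGroot: each agent adopts a weighted average of all opinions
x_i(t+1) = \sum_{j=1}^{n} w_{ij}\, x_j(t),
\qquad w_{ij} \ge 0,\quad \sum_{j=1}^{n} w_{ij} = 1

% Bounded confidence (Deffuant et al.): a randomly chosen pair (i, j)
% interacts only if their opinions are closer than \varepsilon
\text{if } \lvert x_i(t) - x_j(t) \rvert < \varepsilon:\quad
\begin{aligned}
x_i(t+1) &= x_i(t) + \mu\,\bigl(x_j(t) - x_i(t)\bigr),\\
x_j(t+1) &= x_j(t) + \mu\,\bigl(x_i(t) - x_j(t)\bigr),
\end{aligned}
\qquad 0 < \mu \le \tfrac{1}{2}
```

Both rules assume agents that update passively from what they hear. The paper's "qualitative shift" argument concerns participants that instead hold a fixed strategy and coordinate with one another.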
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the paper's alignment with the journal's scope and its value as a foundation for a research agenda. We address the two major comments point by point below, indicating planned revisions to improve clarity and support for the claims while preserving the manuscript's focus on identifying challenges rather than providing exhaustive solutions.
Point-by-point responses
-
Referee: [Simulations and Proof-of-Concept] The proof-of-concept simulations are described only at a high level without details on agent coordination prompts, interaction protocols, belief measurement methods, number of agents, or statistical validation of stabilization; this weakens the support for the central claim that coordinated LLM agents can deliberately steer and stabilize beliefs in a programmable manner (see the abstract and the section presenting the simulations).
Authors: We agree that the simulation description is high-level and that additional specifics would improve reproducibility and strengthen support for the proof-of-concept. The simulations were designed as controlled illustrations of programmable belief shifts rather than comprehensive empirical validation, consistent with the paper's positioning as a research agenda. In the revised manuscript, we will expand the relevant section to detail the agent coordination prompts, interaction protocols (including round structure and message passing), belief measurement methods (e.g., quantitative tracking via semantic embeddings or scaled responses), number of agents per run, and statistical validation of stabilization (such as convergence metrics and variability across trials). These additions will be incorporated into the main text or an appendix.
Revision: yes
-
Referee: [Structural Properties] The claim that the four structural properties make detection and defense 'fundamentally difficult' rests on idealized isolated simulations that omit stochastic human responses, variable participation, external signals, and platform moderation; without additional experiments or analysis bridging to real-world conditions, this does not yet substantiate the difficulty assessment (see the section on structural properties).
Authors: We acknowledge that the simulations are idealized and do not directly incorporate stochastic human responses, variable participation, external signals, or platform moderation, which limits direct empirical substantiation of real-world difficulty. The four properties (indistinguishability, persistence, contextuality, and configurability) are presented as structural features of LLM agents that complicate detection and defense in principle. To address the gap, we will add a subsection analyzing how each property would likely interact with and persist amid human variability, fluctuating participation, external influences, and moderation efforts, drawing on related work in misinformation dynamics and multi-agent systems. This provides a conceptual bridge without new experiments, reinforcing why the properties warrant attention in the proposed research agenda.
Revision: partial
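The "convergence metrics" promised in the first response are not specified in the text. One simple form such a metric could take is sketched below, assuming belief is summarized as one mean value per round; the tolerance, the window length, and the function name are hypothetical choices, not the authors' definition.

```python
def stabilization_round(mean_beliefs, tol=0.01, window=3):
    """Return the first round t after which `window` consecutive
    round-to-round changes in mean belief are all below `tol`.

    Returns None if the trajectory never settles within the data.
    """
    for t in range(len(mean_beliefs) - window):
        segment = mean_beliefs[t:t + window + 1]
        changes = [abs(later - earlier)
                   for earlier, later in zip(segment, segment[1:])]
        if all(change < tol for change in changes):
            return t
    return None

# Usage: a trajectory that rises quickly and then flattens.
trajectory = [0.2, 0.5, 0.7, 0.78, 0.8, 0.801, 0.802, 0.802]
print(stabilization_round(trajectory))  # 4
```

Reporting this round index across repeated trials, together with its spread, would address the referee's request for statistical validation more directly than a single illustrative run.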
Circularity Check
No circularity: conceptual argument supported by illustrative simulations without reduction to inputs.
Full rationale
The paper advances a conceptual framing that LLM agents render collective belief dynamics programmable, supported by controlled multi-agent simulations as proof-of-concept evidence for belief shifts and four structural properties. No equations, fitted parameters, or self-citation chains appear in the provided text that would reduce the central claim to a self-defined quantity or force the conclusion by construction. The simulations are presented as exploratory illustrations rather than as a closed derivation, and the properties are derived from observed simulation outcomes rather than presupposed. The argument remains self-contained as a research agenda proposal drawing on established multi-agent systems concepts without circular loops.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM-based agents can maintain consistent persuasion strategies and coordinate systematically at scale in online discussions.
- domain assumption: Controlled multi-agent simulations provide valid proof-of-concept evidence for real-world belief dynamics.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
We identify four structural properties (indistinguishability, persistence, contextuality, and configurability) that make detection and defense fundamentally difficult.
-
IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
coordinated AI agents can induce measurable belief shifts that stabilize within a few interaction rounds
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.