arxiv: 2606.30571 · v1 · pith:BDCL2YVUnew · submitted 2026-06-29 · 💻 cs.LG · cs.CL

Attractor States Emerge in Multi-Turn LLM Conversations

Pith reviewed 2026-06-30 06:55 UTC · model grok-4.3

classification 💻 cs.LG cs.CL
keywords LLM interactions · attractor states · multi-agent systems · self-play · mixed-play debates · conversation dynamics · representation space · discourse traits

The pith

Self-play LLM conversations form model-specific attractor states that pull other models toward their traits in mixed debates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether open-ended discussions between LLMs settle into stable patterns of behavior that do not depend on the topic. It runs self-play debates, where each model talks only to itself, and mixed-play debates between different models across 20 controversial topics. By tracking positions in representation space along with discourse features and stances, it shows that each model's self-play path acts as an attractor. These attractors draw other models asymmetrically, changing their stylistic choices and positions in predictable ways. If the pattern holds, long-running multi-agent LLM systems would exhibit partially deterministic dynamics rooted in the individual models rather than emerging solely from the interaction rules.

Core claim

Self-play trajectories constitute model-specific attractors in representation space, discourse traits, and stances that draw conversation partners asymmetrically during mixed-play debates, thereby influencing the other models' stylistic choices and behavior.

What carries the argument

Model-specific attractor states formed by self-play trajectories, measured through convergence in latent representations, discourse traits, and stances across topics.

If this is right

  • Self-play trajectories act as stable reference points that other models approach in mixed interactions.
  • Influence between models is asymmetric, with certain models exerting stronger pull on stylistic and stance features.
  • Open-ended LLM interactions become partially predictable from the participating models' individual attractor properties.
  • Structured partner effects shape final behavior beyond simple averaging or random drift.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Systems that repeatedly pair a strong attractor model with malleable ones may converge to the attractor's style across many tasks.
  • Tracking which models function as attractors could guide selection of agent teams to achieve desired stability or diversity.
  • The asymmetry suggests that adding or removing one model can shift the entire conversation basin in ways that cannot be predicted from that model's self-play behavior alone.

Load-bearing premise

The convergences seen in representation space, discourse traits, and stances across the tested topics reflect genuine topic-independent attractor states rather than topic-specific effects or measurement artifacts.

What would settle it

Repeating the experiments on a fresh set of topics outside the original 20 and finding that the same models no longer converge to the same relative positions in representation space or adopt the same discourse traits would falsify the attractor claim.

Figures

Figures reproduced from arXiv: 2606.30571 by Jonas Geiping, Ting-Wen Ko.

Figure 1. Left. We study 20-turn debates between two LLM agents using two setups: 1) mixed-play, where agents are instantiated from different models, and 2) self-play, where both agents are instantiated from the same model; self-play also serves as the control group that we later identify as the "attractor" (Sec. 4.2). In both settings we assign one agent to support and the other to oppose a controversial topic. Right. The 20-turn mixed-play…
Figure 2. Self-play mean trajectories and endpoints. (a) Self-play mean trajectories separate over turns. (b) Self-play endpoint basins occupy broad, model-specific regions across topics in the latent space, shown here by the principal components (PCs) of the topic-centered embeddings of all turns.
… free ablation, both agents receive the neutral DISCUSSANT instruction. This information-symmetric design makes results easier to attribute to model ide…
Figure 3. Mixed-play endpoint metrics. Left: $m_{A|B}$, model A's endpoint when paired with model B, is decomposed relative to the self-play axis running from self-play endpoint $s_A$ to self-play endpoint $s_B$. Partnerward pull $\alpha_{A|B}$ measures interpolation along this axis, while off-axis drift $\delta^{\perp}_{A|B}$ measures displacement not explained by one-dimensional consensus. Right: Pair contraction $C_{AB}$ measures how much closer the two mixed-play endpoints beco…
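The decomposition described in the Figure 3 caption can be illustrated with a small sketch. The formulas below are an assumption, since the caption is truncated and the paper's exact definitions are not reproduced on this page: the pull is taken as the scalar projection of the endpoint onto the self-play axis, the drift as the residual norm divided by the axis length, and the contraction as one minus the ratio of mixed-play to self-play endpoint distance. The function names `decompose_endpoint` and `pair_contraction` are hypothetical.

```python
import math


def _sub(u, v):
    return [a - b for a, b in zip(u, v)]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _norm(u):
    return math.sqrt(_dot(u, u))


def decompose_endpoint(m_ab, s_a, s_b):
    """Split model A's mixed-play endpoint into (alpha, delta_perp).

    alpha      : position along the axis s_a -> s_b (0 = own basin, 1 = partner's).
    delta_perp : distance from that axis, normalized by the axis length.
    Both definitions are assumptions made for illustration.
    """
    axis = _sub(s_b, s_a)
    axis_len_sq = _dot(axis, axis)
    if axis_len_sq == 0.0:
        raise ValueError("self-play endpoints coincide; the axis is undefined")
    offset = _sub(m_ab, s_a)
    alpha = _dot(offset, axis) / axis_len_sq
    residual = [o - alpha * a for o, a in zip(offset, axis)]
    return alpha, _norm(residual) / math.sqrt(axis_len_sq)


def pair_contraction(m_ab, m_ba, s_a, s_b):
    """Assumed form: 1 - |m_ab - m_ba| / |s_a - s_b| (positive = endpoints moved closer)."""
    base = _norm(_sub(s_a, s_b))
    if base == 0.0:
        raise ValueError("self-play endpoints coincide; contraction is undefined")
    return 1.0 - _norm(_sub(m_ab, m_ba)) / base


# Toy 2-D endpoints, invented for the example.
s_a, s_b = [0.0, 0.0], [4.0, 0.0]
alpha, delta_perp = decompose_endpoint([1.0, 2.0], s_a, s_b)
contraction = pair_contraction([1.0, 2.0], [3.0, 1.0], s_a, s_b)
```

In this toy case the endpoint sits a quarter of the way toward the partner's basin while also lying off the axis, which is the "interpolation plus nonzero off-axis displacement" pattern the Figure 6 caption describes.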
Figure 4. Self-play basins are stable. Left: neutral DISCUSSANT/DISCUSSANT self-play trajectories projected onto the self-play PCA space. Even without pro/con roles, trajectories separate into model-specific regions. Right: selected Supporter/Opposer self-play settings repeated across three seeds; each line shows one repeated run. Repeated runs return to comparable endpoint regions.
Figure 5. Models attract each other in mixed play. (a) We overlay mixed-play trajectories on the self-play trajectories in Fig. 2a. Solid lines and small markers are mixed-play trajectories and endpoints; dashed lines and large points are self-play ones. (b) Focusing only on endpoints, we highlight the mixed-play centroids pulled away from their self-play counterparts.
… the line from $s_{A,k}$ to $s_{B,k}$ represents the direc…
Figure 6. Mixed-play endpoint decomposition. Purple and orange bars denote the partnerward pull $\alpha$ and the normalized off-axis drift $\delta^{\perp}$, respectively, along the same-topic self-play axis. Most endpoints lie in the interpolation regime $0 < \alpha < 1$, but they also retain nonzero off-axis displacement $\delta^{\perp}$.
… over partners, as stronger resistance to cross-model displacement.
Figure 7. Model-specific self-play discourse signatures. Claude Haiku stands out on meta-commentary, while other models differ in flattery, rationality, agreement, rebuttal, negativity, and intensity. Complete trait tables are provided in App. D.
Figure 8. Trait-level partner influence in mixed-play. Claude Haiku most strongly pulls its partners toward meta-commentary, while Gemini Flash Lite, GPT-4o mini, and Qwen 3.5 most strongly pull their partners toward flattery.
… compute the model-level self-play mean $\bar{f}^{s}_{A} = \frac{1}{T} \sum_{t=1}^{T} f^{s,(t)}_{A}$ (11) over $T$ turns
Figure 9. Stance changes during debate. Left: In the Discussant/Discussant setting, where no explicit Supporter/Opposer roles are assigned, models stabilize at different intrinsic stance levels. The stance scale ranges from 1 to 5. Right: In mixed-play, stance trajectories do not follow a single convergence pattern across model pairs. Gemini, GPT, and Qwen tend to move toward weaker, more neutral stances in both Sup…
Figure 10. Comparison of endpoint convex hulls in the 2-D self-play PCA space. Adding mixed-play runs increases overlap between some model-specific regions, but the overall structure still shows partial separation rather than convergence to one shared basin.

Label source      Silhouette   Permutation p
Model identity     0.0659      < 0.001
Topic             -0.0324      < 0.001
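The silhouette and permutation values attached to Figure 10 can be made concrete with a short sketch. This page does not say how the paper computed them, so the code assumes the standard mean silhouette coefficient under Euclidean distance and a one-sided label-shuffling test; `silhouette` and `permutation_p` are hypothetical helper names, and the points are toy data.

```python
import math
import random


def silhouette(points, labels):
    """Mean silhouette coefficient (standard definition, Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two distinct labels")
    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)


def permutation_p(points, labels, n_perm=999, seed=0):
    """One-sided p-value: how often shuffled labels score at least as high."""
    rng = random.Random(seed)
    observed = silhouette(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if silhouette(points, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy endpoints: two well-separated groups labeled by (hypothetical) model identity.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
by_model = ["A", "A", "A", "B", "B", "B"]
score, p_value = permutation_p(points, by_model, n_perm=99)
```

Running the same function once with model labels and once with topic labels gives the kind of side-by-side comparison shown in the table. The direction of the test is an assumption here and may differ from the one used in the paper.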
Figure 11. Centroid-level visualization of pairwise mixed-play displacement. Each panel compares a model pair's mixed-play endpoint centroids with the two corresponding self-play centroid regions.
Figure 12. Supplementary topic-matched mixed-play endpoints with null-corrected excess off-axis drift. The x-axis is partnerward pull $\alpha$; the y-axis is $\delta^{\perp}$.
… define the interaction-specific excess as $\delta^{\perp,\mathrm{excess}}_{A|B,k} = \delta^{\perp}_{A|B,k} - \delta^{\perp,\mathrm{null}}_{A|B,k}$ (14). Positive excess indicates that mixed-play moves model A farther away from the self-play consensus axis than would be expected from topic-level variation alone.
Figure 13. Discourse shifts from debate toward affiliation over the course of conversation. Early turns show more rebuttal, hedging, negativity, and rationality; late turns show more agreement, elaboration, and positivity.
Figure 14. Influence-on-partners heatmaps across all turns. Rows index influencing models and columns index affected partner models. Each off-diagonal cell shows the affected partner's mixed-play value minus its self-play value for the pairing of affected partner S with influencing model M; row averages therefore recover the mean-transfer and temporal-transfer summaries. Companion self-play calibration values belong to the affected partner colum…
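The cell definition in the Figure 14 caption (affected partner's mixed-play value minus its self-play value, averaged along rows) can be sketched as follows. The data layout, the function names `influence_cells` and `row_average`, and the numbers are all assumptions made for illustration; the self-play baseline is taken as a plain mean over turns.

```python
from statistics import mean


def influence_cells(mixed, self_play):
    """Build heatmap cells keyed by (influencer, affected).

    mixed[(affected, influencer)] : per-turn trait values of `affected`
                                    while talking to `influencer`.
    self_play[model]              : per-turn trait values of `model` in self-play.
    Each cell is the mixed-play mean minus the affected model's self-play mean.
    """
    baseline = {model: mean(values) for model, values in self_play.items()}
    return {
        (influencer, affected): mean(values) - baseline[affected]
        for (affected, influencer), values in mixed.items()
    }


def row_average(cells, influencer):
    """Average shift that `influencer` induces across all of its partners."""
    return mean(v for (inf, _), v in cells.items() if inf == influencer)


# Toy trait scores (for example a meta-commentary rate); all values are invented.
self_play = {"A": [0.6, 0.6], "B": [0.1, 0.1], "C": [0.2, 0.2]}
mixed = {("B", "A"): [0.2, 0.4], ("C", "A"): [0.3, 0.5]}
cells = influence_cells(mixed, self_play)
pull_of_a = row_average(cells, "A")
```

A positive row average means partners exhibit more of the trait when paired with that model than they do in their own self-play.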
Figure 15. Examples of feature-level influence in different model pairs. Solid lines show models' mixed-play behavior, and dashed lines show self-play behavior. (a)–(c) Claude Haiku shows a strong pull on explicit AI-role expression, boldface formatting, and conversation-termination language. Affected partner models shift toward Claude-associated behavior in these dimensions during interaction. (d) Appreciativeness …
Figure 16. (a)–(c) Lexicon entropy decreases and ROUGE-L similarity increases over turns, indicating lexical compression, while semantic similarity remains flat or decreases, indicating continued semantic diversity. (d) Conversations continue to drift away from the initial topic over time.
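The two lexical measures named in the Figure 16 caption can be sketched in a few lines. The paper's tokenizer and comparison targets are not given on this page, so the code assumes whitespace tokens, Shannon entropy in bits over the token distribution, and ROUGE-L as the F1 score of the longest common subsequence between two texts; the function names are hypothetical.

```python
import math
from collections import Counter


def lexicon_entropy(text):
    """Shannon entropy (bits) of the whitespace-token distribution of `text`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in Counter(tokens).values())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the harmonic mean of LCS precision and LCS recall."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented turns: later turns reuse wording, so overlap with the previous turn rises.
turns = [
    "taxes fund schools and roads",
    "i agree that taxes fund schools",
    "i agree that taxes fund schools indeed",
]
entropies = [lexicon_entropy(t) for t in turns]
overlaps = [rouge_l_f1(turns[i], turns[i - 1]) for i in range(1, len(turns))]
```

Falling entropy together with rising turn-to-turn ROUGE-L is the pattern the caption calls lexical compression; semantic similarity would need an embedding model and is left out of this sketch.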
Figure 17. Pairwise stance trajectory.

Original abstract

Large language models (LLMs) are increasingly used in open-ended multi-agent settings, but the long-run dynamics of model-model interaction remain poorly understood. We study whether open-ended LLM discussions exhibit attractor-like behavior, i.e., topic-independent stable sets of behaviors which conversations settle into. Across 7 LLMs and 20 controversial topics, we compare self-play and mixed-play dyadic debates, tracking trajectories in representation space, discourse traits, and stances. We find self-play trajectories to be model-specific attractors that draw their conversation partners asymmetrically in mixed-play debates, influencing the other models' stylistic choices and behavior. For example, Claude Haiku is a strong attractor of other models in latent space, corresponding to other models taking on its traits like metacommentary, and models like GPT-4.1 nano are especially malleable. Our results suggest that open-ended LLM interactions are partially predictable from model-specific attractors, but shaped by structured and asymmetric partner influence. Overall, our analysis sheds some light on the complex behavior of open-ended multi-agent interaction, which we hope is helpful in designing, predicting, and monitoring autonomous agentic systems in the real world.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper examines long-run dynamics in open-ended multi-turn LLM conversations, testing whether they exhibit attractor-like behavior (topic-independent stable sets of behaviors). Using 7 LLMs and 20 controversial topics, it compares self-play versus mixed-play dyadic debates and tracks trajectories in representation space, discourse traits, and stances. The central finding is that self-play trajectories constitute model-specific attractors that asymmetrically draw conversation partners in mixed-play settings, with examples such as Claude Haiku strongly influencing other models' traits (e.g., metacommentary) while models like GPT-4.1 nano are more malleable. The authors conclude that such interactions are partially predictable from these attractors but shaped by asymmetric partner influence.

Significance. If the central claim holds after verification of topic controls and statistical rigor, the work would offer a useful empirical lens on multi-agent LLM dynamics, with potential value for designing, predicting, and monitoring autonomous agent systems. The asymmetric influence findings and model-specific patterns could inform practical deployment considerations. However, the current presentation supplies no details on representation-space metrics, statistical tests, error bars, or data exclusion rules, limiting immediate impact.

major comments (3)
  1. [Abstract / Experimental Setup] Abstract and experimental description: the claim that self-play trajectories form topic-independent model-specific attractors requires an explicit cross-topic distance metric (or equivalent control) between same-model self-play trajectories to distinguish model basins from topic-driven clustering or shared lexical/stance priors across the 20 controversial topics. No such metric or control is described, leaving the topic-independence assumption unverified and load-bearing for the central claim.
  2. [Results] Results section: the identification of attractors in representation space, discourse traits, and stances lacks reported statistical tests, error bars, or controls for topic dependence; without these, it is unclear whether observed convergence reflects genuine attractor states or measurement artifacts or topic-specific effects.
  3. [Methods] Methods: no details are supplied on the precise representation-space metrics, discourse trait definitions, stance extraction procedures, or data exclusion rules, which are necessary to assess whether the reported asymmetric influence (e.g., Claude Haiku as strong attractor) is robust.
minor comments (1)
  1. [Abstract] The abstract could more clearly distinguish the self-play versus mixed-play comparison from any fitted parameters or self-referential definitions to strengthen the circularity assessment.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback, which will help strengthen the empirical rigor of our study on attractor states in multi-turn LLM conversations. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract / Experimental Setup] Abstract and experimental description: the claim that self-play trajectories form topic-independent model-specific attractors requires an explicit cross-topic distance metric (or equivalent control) between same-model self-play trajectories to distinguish model basins from topic-driven clustering or shared lexical/stance priors across the 20 controversial topics. No such metric or control is described, leaving the topic-independence assumption unverified and load-bearing for the central claim.

    Authors: We agree that an explicit cross-topic distance metric would better substantiate the topic-independence of the attractors. While our experiments span 20 diverse topics and show consistent model-specific patterns in self-play that differ from mixed-play influences, we did not compute a formal metric. In the revision, we will add a cross-topic analysis that computes the average distance between self-play trajectories of the same model on different topics and compares it to distances between different models, to demonstrate that model basins are tighter than topic effects.
    revision: yes

  2. Referee: [Results] Results section: the identification of attractors in representation space, discourse traits, and stances lacks reported statistical tests, error bars, or controls for topic dependence; without these, it is unclear whether observed convergence reflects genuine attractor states or measurement artifacts or topic-specific effects.

    Authors: The referee is correct that the current results presentation would benefit from statistical tests and error bars. We will revise the Results section to include appropriate statistical analyses, such as tests for significant differences in convergence rates, error bars on trajectory plots derived from multiple runs or bootstrapping, and controls that report results aggregated across topics alongside per-topic breakdowns to rule out topic-specific effects.
    revision: yes

  3. Referee: [Methods] Methods: no details are supplied on the precise representation-space metrics, discourse trait definitions, stance extraction procedures, or data exclusion rules, which are necessary to assess whether the reported asymmetric influence (e.g., Claude Haiku as strong attractor) is robust.

    Authors: We will update the Methods section to include all requested details. Specifically, we will specify the representation space metric (e.g., the embedding model and similarity measure), provide definitions and examples for each discourse trait, detail the stance extraction method (including any classifiers or prompts used), and list data exclusion rules such as filters for conversation validity or length.
    revision: yes
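The cross-topic control and the bootstrapped error bars promised in responses 1 and 2 could look roughly like the sketch below. It is not the authors' analysis: the choice to compare same-model/different-topic distances against different-model/same-topic distances, the percentile bootstrap, and the names `within_between` and `bootstrap_ci` are all assumptions, and the endpoints are toy data.

```python
import math
import random
from itertools import combinations
from statistics import mean


def within_between(endpoints):
    """Compare model effects with topic effects on self-play endpoints.

    endpoints[(model, topic)] : embedding of that self-play endpoint.
    Returns (mean same-model/different-topic distance,
             mean different-model/same-topic distance).
    """
    within, between = [], []
    for ((m1, t1), v1), ((m2, t2), v2) in combinations(endpoints.items(), 2):
        dist = math.dist(v1, v2)
        if m1 == m2:
            within.append(dist)   # same model, necessarily a different topic
        elif t1 == t2:
            between.append(dist)  # different models on the same topic
    return mean(within), mean(between)


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    means = sorted(mean(rng.choices(values, k=len(values))) for _ in range(n_boot))
    lower = means[int(alpha / 2 * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy endpoints for two hypothetical models on two hypothetical topics.
endpoints = {
    ("A", "t1"): (0.0, 0.0), ("A", "t2"): (0.0, 1.0),
    ("B", "t1"): (5.0, 0.0), ("B", "t2"): (5.0, 1.0),
}
within, between = within_between(endpoints)
ci_low, ci_high = bootstrap_ci([1.0, 2.0, 3.0, 4.0, 5.0])
```

If model basins dominate topic effects, the first distance should be clearly smaller than the second, and the bootstrap interval gives one way to attach error bars to any per-topic summary such as a mean partnerward pull.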

Circularity Check

0 steps flagged

No circularity: purely empirical trajectory comparison

The paper reports observational results from running self-play and mixed-play conversations across models and topics, then measuring convergence in representation space, discourse traits, and stances. No equations, fitted parameters, or first-principles derivations are presented whose outputs reduce by construction to the inputs. The attractor claim is an empirical description of observed behavior, not a self-referential definition or renamed fit. Self-citations, if any, are not load-bearing for the central finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, mathematical axioms, or invented entities are stated. The term 'attractor' is used descriptively without formal dynamical-systems derivation.

pith-pipeline@v0.9.1-grok · 5734 in / 1174 out tokens · 36680 ms · 2026-06-30T06:55:49.647761+00:00 · methodology
