
arxiv: 2605.17353 · v1 · pith:URR2PHUTnew · submitted 2026-05-17 · 💻 cs.CY

You Can't Fool Us: Understanding the Resilience of LLM-driven Agent Communities to Misinformation

Pith reviewed 2026-05-19 23:14 UTC · model grok-4.3

classification 💻 cs.CY
keywords: misinformation resilience · LLM agent communities · actively open-minded thinking · political ideology · belief correction · agent-based simulation · intervention design

The pith

Higher open-minded thinking in simulated communities reduces misinformation uptake and speeds recovery, while polarization leaves more lingering support.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs synthetic communities of LLM agents that differ along two dimensions: Actively Open-minded Thinking, which tracks willingness to seek evidence and revise beliefs, and Political Ideology, which tracks identity-driven interpretation of claims. It exposes these communities to credible misinformation shocks and tracks how trust rises and then falls through agent interactions. Results show that higher AOT produces both lower initial acceptance and stronger recovery after the trust peak, while moderate ideology supports more complete correction than polarized stances. Intervention tests indicate that persuasion and fact-checking help agents move from questioning to outright denial and support withdrawal more effectively than accuracy prompts or source warnings.

Core claim

Across systematically varied AOT-PI communities, higher AOT improves both resistance to misinformation uptake and recovery after trust peaks. PI shapes the recovery pathway: ideologically moderate communities recover more reliably, while polarized communities retain more residual support. Stance-level analysis shows that resilience depends on whether agents move from questioning a claim to denying or correcting it and withdrawing prior support. Intervention experiments further show that persuasion and fact checking better support post-peak correction, whereas accuracy prompts mainly induce early caution and source warnings have weaker effects.

What carries the argument

LLM-based agent simulation that assigns Actively Open-minded Thinking (AOT) and Political Ideology (PI) traits to model how communities process and recover from misinformation shocks through interaction and correction.
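
As a rough illustration of the community-construction step (trait scores sampled from a community distribution, then mapped to persona prompts), the sketch below may help. Everything in it is an assumption made for illustration: the `Agent` class, the 0..1 scale for AOT, the -1..1 scale for PI, the Gaussian sampling, the thresholds, and the prompt wording are hypothetical and are not taken from the paper or from CoSim.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    agent_id: int
    aot: float  # assumed 0..1 scale; higher = more actively open-minded
    pi: float   # assumed -1..1 scale; 0 = moderate, extremes = polarized


def clip(value, low, high):
    """Keep a sampled score inside its assumed scale."""
    return max(low, min(high, value))


def sample_community(size, aot_mean, pi_spread, seed=0):
    """Sample trait scores for one synthetic community (hypothetical distributions)."""
    rng = random.Random(seed)
    agents = []
    for agent_id in range(size):
        aot = clip(rng.gauss(aot_mean, 0.1), 0.0, 1.0)
        pi = clip(rng.gauss(0.0, pi_spread), -1.0, 1.0)
        agents.append(Agent(agent_id, aot, pi))
    return agents


def persona_prompt(agent):
    """Map trait scores to a persona description (illustrative wording only)."""
    if agent.aot >= 0.5:
        aot_text = "You actively seek out evidence and revise your beliefs when it warrants."
    else:
        aot_text = "You rarely look for new evidence and seldom revise your beliefs."
    if abs(agent.pi) < 0.33:
        pi_text = "Your political views are moderate."
    elif agent.pi < 0:
        pi_text = "You identify strongly with the political left."
    else:
        pi_text = "You identify strongly with the political right."
    return f"{aot_text} {pi_text}"
```

Fixing the seed keeps a community reproducible across runs, which matters when the only thing meant to vary between conditions is the trait distribution.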

If this is right

  • Higher AOT reduces initial trust in false claims and accelerates return to lower support levels after a peak.
  • Moderate political ideology enables more reliable movement from questioning to active correction and support withdrawal.
  • Polarized ideology leaves higher residual support even after recovery begins.
  • Persuasion and fact-checking interventions improve post-peak correction more than accuracy prompts or source warnings.
  • Resilience emerges from specific stance transitions rather than from uniform skepticism.
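
The stance transitions in the last bullet can be made concrete with a small sketch. The three quantities are named after the Figure 4 caption (query generation, deny gain, support release), but the formulas below are assumed stand-ins based on simple share differences, not the paper's definitions. The stance labels and the `stance_metrics` helper are likewise hypothetical.

```python
from collections import Counter

STANCES = ("support", "query", "deny", "neutral")


def stance_shares(stances):
    """Fraction of agents holding each stance at one time step."""
    counts = Counter(stances)
    total = len(stances)
    return {s: counts.get(s, 0) / total for s in STANCES}


def stance_metrics(timeline):
    """Summarize pre- and post-peak stance shifts for one community.

    `timeline` is a list of time steps; each step lists every agent's stance.
    The peak is the first step with the highest support share.
    """
    shares = [stance_shares(step) for step in timeline]
    peak = max(range(len(shares)), key=lambda t: shares[t]["support"])
    return {
        "peak_step": peak,
        # Assumed: growth in questioning between the start and the peak.
        "query_generation": shares[peak]["query"] - shares[0]["query"],
        # Assumed: growth in explicit rejection between the peak and the end.
        "deny_gain": shares[-1]["deny"] - shares[peak]["deny"],
        # Assumed: drop in support after the peak; negative means support grew.
        "support_release": shares[peak]["support"] - shares[-1]["support"],
    }
```

Under these assumptions, a community that questions a claim but never converts that scrutiny into denial would show positive query generation with near-zero deny gain and support release.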

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same simulation method could be used to pre-test combinations of traits or interventions before field studies with people.
  • Designing public programs that increase open-minded thinking might improve community-level correction of false claims.
  • Validation against human data would be required before treating the simulation outputs as direct predictions for policy.

Load-bearing premise

LLM agents assigned AOT and PI traits can faithfully reproduce the psychological processes and social interactions that shape real human community responses to misinformation.

What would settle it

Run parallel experiments with real human groups whose AOT and PI levels have been measured in advance, then compare their uptake curves, recovery trajectories, stance shifts, and responses to the same interventions against the patterns produced by the agent simulations.
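
One simple way to score such a comparison is a pointwise error between a simulated trust curve and a human one measured at the same time steps. The Figure 7 caption mentions RMSE and MAE for agent alignment; applying them to trajectories, as below, is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def _check(simulated, observed):
    """Both curves must be non-empty and sampled at the same time steps."""
    if not simulated or len(simulated) != len(observed):
        raise ValueError("curves must be non-empty and of equal length")


def mae(simulated, observed):
    """Mean absolute error between two trust trajectories."""
    _check(simulated, observed)
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(simulated)


def rmse(simulated, observed):
    """Root mean squared error; penalizes large gaps more heavily than MAE."""
    _check(simulated, observed)
    squared = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return math.sqrt(squared / len(simulated))
```

A low error on the whole curve would not by itself show that peak timing or residual support match, so those features would probably need separate checks.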

Figures

Figures reproduced from arXiv: 2605.17353 by Chichen Lin, Han Xiao, Kangbo Hu, Weijian Fan, Yijie Jin, Yongbin Wang, Zhanzhan Zhao, Zhihui Ying.

Figure 1: The same misinformation can drive different com…
Figure 2: Overview of CoSim. The framework consists of four stages: (A) community construction, where psychological trait scores are sampled from predefined community distributions and mapped into calibrated persona prompts; (B) misinformation challenge, where verified misinformation cases are collected and injected into the community by a spreader agent; (C) social interaction simulation, where agents interact, upd…
Figure 3: Community design and resilience outcomes. (A) Community design crosses four AOT distributions with four PI distributions; blue solid and orange dashed curves show the corresponding AOT and PI densities. (B) Resilience map plots robustness and recovery scores, where higher values indicate stronger resilience. Dashed lines mark median splits, stars mark the highest and lowest composite scores, and the inset …
Figure 4: Cell-level behavioral mechanisms across AOT-by-PI profiles. (A) Query generation measures the pre-peak tendency to question misinformation-related information. (B) Deny gain measures the post-peak conversion from scrutiny to explicit rejection. (C) Support release measures the post-peak reduction in misinformation support, with positive values indicating support withdrawal and negative values indicating …
Figure 5: Intervention effects in the robustness–recovery space. We evaluate four representative communities selected from RQ1: G10, G13, G11, and G01, which cover the four combinations of high/low robustness and high/low recovery. Each panel compares the control condition with four interventions: accuracy prompt, persuasion, fact checking, and source warning. The x-axis reports robustness, where higher values indic…
Figure 6: Intervention-induced stance shifts before and after the trust peak. Values denote percentage-point changes relative to the control condition, averaged across the representative communities in …
Figure 7: Agent alignment evaluation across Actively Open-minded Thinking and Political Ideology levels. RMSE and MAE …
Figure 8: Distribution of the retained misinformation pool.
Original abstract

Misinformation resilience is a dynamic community process: communities differ not only in whether they initially trust false claims, but also in how they recover through interaction, questioning, correction, and support withdrawal. We study this process with an LLM-based agent simulation that constructs synthetic communities along two theoretically motivated dimensions: Actively Open-minded Thinking (AOT), which captures evidence-seeking and willingness to revise beliefs, and Political Ideology (PI), which captures identity-based interpretation of contested claims. These two traits allow us to examine how evidence-oriented reasoning and ideological alignment jointly shape community responses to credible misinformation shocks. Across systematically varied AOT-PI communities, we find that higher AOT improves both resistance to misinformation uptake and recovery after trust peaks. PI shapes the recovery pathway: ideologically moderate communities recover more reliably, while polarized communities retain more residual support. Stance-level analysis shows that resilience depends on whether agents move from questioning a claim to denying or correcting it and withdrawing prior support. Intervention experiments further show that persuasion and fact checking better support post-peak correction, whereas accuracy prompts mainly induce early caution and source warnings have weaker effects. Together, this work provides a mechanism-level account of community misinformation resilience, showing how psychological composition and intervention design shape whether communities move from misinformation exposure toward correction or persistent support.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper uses LLM-based agent simulations to construct synthetic communities varying along Actively Open-minded Thinking (AOT) and Political Ideology (PI) dimensions. It examines dynamic responses to credible misinformation shocks, reporting that higher AOT improves initial resistance to uptake and post-peak recovery, while moderate PI communities recover more reliably than polarized ones. Stance transitions (from questioning to denial/correction and support withdrawal) and intervention experiments (persuasion, fact-checking, accuracy prompts, source warnings) are used to derive a mechanism-level account of community resilience.

Significance. If the simulation results hold under validation, the work offers a mechanism-level view of how cognitive traits and intervention types jointly influence misinformation recovery pathways, extending beyond static uptake measures. Systematic variation of AOT and PI and the focus on recovery dynamics are positive features. Significance is limited by the absence of external anchoring to human data or controls for simulation artifacts.

major comments (3)
  1. [§3 (Agent Construction and Trait Assignment)] The central claims attribute resistance, recovery, and residual support patterns specifically to AOT and PI. However, the manuscript provides no ablation experiments that isolate these trait prompts from base LLM priors, default response tendencies, or prompt phrasing effects. This is load-bearing, as the mechanism-level account requires demonstrating that the observed stance transitions are driven by the assigned traits rather than simulation artifacts.
  2. [§4 (Results on AOT/PI Effects)] No direct comparison is reported to existing human-subject experiments on AOT and misinformation belief updating or recovery. Without such grounding, it remains unclear whether the simulated patterns reproduce or extend known human dynamics, weakening the generalizability of the claim that higher AOT improves both resistance and recovery.
  3. [§5 (Intervention Experiments)] The finding that persuasion and fact-checking outperform accuracy prompts and source warnings for post-peak correction rests on single-model runs without reported robustness checks across different LLMs or prompt rephrasings. This is load-bearing for the intervention-design implications.
minor comments (2)
  1. [Abstract and §2] The abstract and §2 could more explicitly state the exact discrete levels or ranges used for AOT and PI to allow replication.
  2. [Results figures] Recovery trajectory figures would benefit from explicit reporting of run-to-run variance or statistical tests on the AOT and PI main effects.
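
The second minor comment could be addressed with something as simple as the sketch below, which summarizes repeated runs of one community and computes Welch's t statistic for two conditions (for example, high-AOT versus low-AOT recovery scores). The function names are hypothetical, and the paper does not say which test, if any, the authors would use.

```python
import math
import statistics


def summarize_runs(scores):
    """Mean, sample standard deviation, and standard error across repeated runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    stdev = statistics.stdev(scores)
    return {
        "mean": statistics.mean(scores),
        "stdev": stdev,
        "stderr": stdev / math.sqrt(len(scores)),
    }


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled = var_a / len(group_a) + var_b / len(group_b)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled)
```

Reporting the standard error alongside each trajectory point would let readers judge whether the gaps between communities exceed run-to-run noise.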

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas to bolster the validity of our simulation-based findings on community resilience to misinformation. We address each point below and commit to revisions that enhance the isolation of trait effects, grounding in human literature, and robustness of intervention results.

point-by-point responses
  1. Referee: §3 (Agent Construction and Trait Assignment): The central claims attribute resistance, recovery, and residual support patterns specifically to AOT and PI. However, the manuscript provides no ablation experiments that isolate these trait prompts from base LLM priors, default response tendencies, or prompt phrasing effects. This is load-bearing, as the mechanism-level account requires demonstrating that the observed stance transitions are driven by the assigned traits rather than simulation artifacts.

    Authors: We agree that demonstrating the specific contribution of the AOT and PI trait assignments is crucial for the mechanism-level claims. Our design uses the same base LLM and varies only the trait-describing prompts across conditions, which provides initial control for model priors. To address this directly, we will add ablation experiments in the revised manuscript, including conditions with scrambled or neutral trait prompts and alternative phrasings of the AOT/PI descriptions. These will be reported in an expanded §3 to confirm that stance transition patterns are driven by the intended traits. revision: yes

  2. Referee: §4 (Results on AOT/PI Effects): No direct comparison is reported to existing human-subject experiments on AOT and misinformation belief updating or recovery. Without such grounding, it remains unclear whether the simulated patterns reproduce or extend known human dynamics, weakening the generalizability of the claim that higher AOT improves both resistance and recovery.

    Authors: We acknowledge the value of linking simulation results to human empirical findings. Although the paper emphasizes novel dynamic aspects of community recovery that are challenging to observe in human studies, we will revise the discussion in §4 to include explicit comparisons with prior human-subject research on AOT and misinformation. This will reference studies demonstrating that higher AOT reduces susceptibility to false claims and discuss how our findings on post-peak recovery extend these insights to community-level processes. revision: yes

  3. Referee: §5 (Intervention Experiments): The finding that persuasion and fact-checking outperform accuracy prompts and source warnings for post-peak correction rests on single-model runs without reported robustness checks across different LLMs or prompt rephrasings. This is load-bearing for the intervention-design implications.

    Authors: The intervention results were obtained using a consistent high-capacity LLM to ensure comparability across the extensive simulation runs. We agree that robustness to model choice and prompt variations strengthens the practical implications. In the revised version, we will include additional runs using a second LLM and varied prompt formulations for the key interventions, reporting consistency in the relative effectiveness of persuasion and fact-checking. These checks will be added to §5. revision: yes

Circularity Check

0 steps flagged

No significant circularity: simulation outcomes emerge from upstream trait definitions and rules

full rationale

The paper constructs synthetic communities by assigning AOT and PI traits to LLM agents via prompts, then runs interaction simulations under misinformation shocks. Reported patterns (higher AOT improving resistance and recovery; PI modulating residual support) are direct outputs of these defined rules and stance-transition observations, not quantities fitted to the same results or reduced by construction to inputs. No equations, parameter fitting to target outcomes, self-citations as load-bearing premises, or ansatz smuggling appear in the abstract or setup. The work is a standard simulation study: its central claims are read off the simulation outputs rather than built into the inputs.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim depends on the domain assumption that LLM agents can represent human belief updating and social correction processes when parameterized by AOT and PI. Free parameters include the discrete levels chosen for AOT and PI to create community variations and the specific intervention prompt designs.

free parameters (2)
  • AOT levels
    Discrete values assigned to agents to vary evidence-seeking and belief revision across communities.
  • PI levels
    Ideological positions assigned to agents to vary identity-based interpretation of claims.
axioms (1)
  • domain assumption: LLM agents can model human-like belief updating, questioning, correction, and support withdrawal when given psychological trait prompts.
    Invoked when constructing synthetic communities and interpreting stance changes and recovery.



