Recognition: 1 theorem link (Lean)
Integration of Deep Reinforcement Learning and Agent-based Simulation to Explore Strategies Counteracting Information Disorder
Pith reviewed 2026-05-15 11:30 UTC · model grok-4.3
The pith
Combining agent-based simulation with deep reinforcement learning identifies conditions under which policies can limit misinformation spread.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When coupled with deep reinforcement learning, an agent-based model that captures complex fake news propagation and user behavior on social platforms can learn strategies that mitigate the spread of misinformation; early experiments reveal the conditions under which specific policies succeed.
What carries the argument
An agent-based model simulating fake news dynamics and containment effects, trained via deep reinforcement learning to optimize policies that reduce misinformation propagation.
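The coupling described here can be sketched in miniature. Below, a tabular Q-learning loop (standing in for the deep RL component, which this summary does not specify) optimises a containment-intensity action against a toy, well-mixed spread environment. Every class, parameter, and reward term is a hypothetical stand-in, not the paper's model:

```python
import random

# Toy environment (all names hypothetical): state = fraction of misinformed
# agents, discretised into 10 bins; action = containment intensity 0..2.
class ToySpreadEnv:
    def __init__(self, n_agents=100, beta=0.3, seed=0):
        self.n = n_agents
        self.beta = beta              # baseline spread probability
        self.rng = random.Random(seed)

    def reset(self):
        self.infected = 5
        self.t = 0
        return self._state()

    def _state(self):
        return min(9, self.infected * 10 // self.n)

    def step(self, action):
        # Stronger containment scales down the effective spread rate.
        eff_beta = self.beta / (1 + action)
        new = sum(1 for _ in range(self.n - self.infected)
                  if self.rng.random() < eff_beta * self.infected / self.n)
        self.infected = min(self.n, self.infected + new)
        self.t += 1
        # Reward: keep misinformation low, small penalty for heavy intervention.
        reward = -(self.infected / self.n) - 0.05 * action
        return self._state(), reward, self.t >= 20

def q_learn(env, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    q = {(s, a): 0.0 for s in range(10) for a in range(3)}
    rng = random.Random(1)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy action selection.
            a = rng.randrange(3) if rng.random() < eps else \
                max(range(3), key=lambda x: q[(s, x)])
            s2, r, done = env.step(a)
            best = max(q[(s2, x)] for x in range(3))
            q[(s, a)] += alpha * (r + gamma * best - q[(s, a)])
            s = s2
    return q

q = q_learn(ToySpreadEnv())
policy = {s: max(range(3), key=lambda a: q[(s, a)]) for s in range(10)}
print(policy)
```

The loop is the same shape as the paper's integration: the simulator generates trajectories, the learner updates a policy from them, and nothing feeds back into the simulator's parameters.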
If this is right
- Policies discovered in the simulation can be ranked by their effectiveness under varying network sizes, user behaviors, and news characteristics.
- The same integrated setup can be reused to compare multiple containment tactics such as fact-checking prompts or content throttling.
- The approach supplies quantitative evidence on the timing and targeting of interventions rather than relying on intuition alone.
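How such a ranking might be produced can be illustrated crudely: run each tactic through the same seeded simulation and sort by the final misinformed share. The contagion model, tactic names, and parameter values below are illustrative stand-ins, not the paper's:

```python
import random

# Hypothetical well-mixed contagion: each step, infected agents generate
# exposures; each exposure converts a susceptible agent with some probability.
def final_share(n, contact_rate, accept_prob, steps=30, seed=0):
    rng = random.Random(seed)
    infected = max(1, n // 50)
    for _ in range(steps):
        exposures = int(contact_rate * infected)
        new = sum(1 for _ in range(exposures)
                  if rng.random() < accept_prob * (1 - infected / n))
        infected = min(n, infected + new)
    return infected / n

tactics = {
    "baseline":      dict(contact_rate=3.0, accept_prob=0.20),
    "fact_checking": dict(contact_rate=3.0, accept_prob=0.10),  # halves acceptance
    "throttling":    dict(contact_rate=1.5, accept_prob=0.20),  # halves exposure
}

for n in (500, 5000):
    ranked = sorted(tactics, key=lambda t: final_share(n, **tactics[t]))
    print(n, ranked)
```

Varying `n` (and, in a real study, user-behavior and news parameters) over a grid is what turns intuition about interventions into a ranked comparison.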
Where Pith is reading between the lines
- Feeding real platform traces into the model could test whether the learned policies transfer beyond the simulated environment.
- The method could be adapted to study other information disorders such as coordinated inauthentic behavior or conspiracy amplification.
- Extending the reinforcement-learning reward function to include secondary effects like user trust erosion might yield more robust policies.
Load-bearing premise
The agent-based model must faithfully represent the real-world spread of fake news and the way users respond to it on social media.
What would settle it
If independent runs or real social-media traces show that the learned policies produce no measurable reduction in misinformation spread compared with baseline conditions, the central claim would be refuted.
Original abstract
In recent years, the spread of fake news has triggered a growing interest in Information Disorders (ID) on social media, a phenomenon that has become a focal point of research across fields ranging from complexity theory and computer science to cognitive sciences. Overall, such a body of research can be traced back to two main approaches. On the one hand, there are works focused on exploiting data mining to analyze the content of news and related metadata (data-driven approach); on the other hand, works are aiming at making sense of the phenomenon at hand and their evolution using explicit simulation models (model-driven approach). In this paper, we integrate these approaches to explore strategies for counteracting IDs. Heading in this direction, we put together: i. an Agent-Based model to simulate in a scientifically sound way both complex fake news dynamics and the effects produced by containment strategies therein; ii. Deep Reinforcement Learning to learn the strategies that can better mitigate the spread of misinformation. The outcomes of our work unfold on different levels. From a substantive point of view, the results of preliminary experiments started providing interesting cues about the conditions under which given policies can mitigate the spread of misinformation. From a technical and methodological point of view, we scratched the surface of promising and worthy research topics like the integration of social simulation and artificial intelligence and the enhancement of social science simulation environments.
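As one concrete instance of the model-driven approach the abstract describes, fake news spread is often modeled as a complex contagion in which a user adopts a claim only once enough of their neighbours have. The sketch below runs a threshold model on an Erdős–Rényi graph; it is not the paper's actual ABM, and the graph size, edge probability, and threshold are arbitrary choices for illustration:

```python
import random

# Build an undirected Erdos-Renyi graph: each pair is linked with probability p.
def erdos_renyi(n, p, rng):
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

# Threshold ("complex contagion") spread: a node adopts once the fraction of
# its neighbours that have adopted reaches the threshold.
def spread(adj, seeds, threshold=0.25, rounds=20):
    adopted = set(seeds)
    for _ in range(rounds):
        new = {v for v, nbrs in adj.items()
               if v not in adopted and nbrs
               and len(nbrs & adopted) / len(nbrs) >= threshold}
        if not new:
            break
        adopted |= new
    return adopted

rng = random.Random(42)
g = erdos_renyi(200, 0.05, rng)
cascade = spread(g, seeds=range(10))
print(f"{len(cascade)} of 200 nodes adopted")
```

A containment policy in this framing is anything that raises the adoption threshold or rewires exposure, which is the kind of lever the DRL component would then tune.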
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes integrating an Agent-Based Model (ABM) to simulate complex fake news dynamics and containment strategies on social media with Deep Reinforcement Learning (DRL) to learn optimal mitigation policies. Preliminary experiments are described as yielding interesting cues about conditions under which given policies can reduce misinformation spread, while also highlighting the methodological value of combining social simulation with AI.
Significance. If the ABM were shown to be calibrated and validated against empirical data, the integration could provide a useful framework for exploring policy effectiveness in information disorder scenarios, advancing the combination of simulation and reinforcement learning in social science. The preliminary nature of the results and absence of quantitative validation currently limit the strength of any substantive claims about real-world applicability.
major comments (1)
- [Abstract and §1 (model description)] The claim that the ABM simulates 'in a scientifically sound way' both fake news dynamics and containment effects is unsupported. No calibration procedure, comparison of simulated cascade sizes or engagement rates to empirical distributions (e.g., Twitter/Facebook datasets), performance metrics, baselines, or sensitivity analysis are reported, leaving the reliability of the DRL-derived policy cues unknown.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the potential of integrating ABM with DRL to explore mitigation strategies for information disorder. We agree that the preliminary character of the experiments and the lack of explicit empirical calibration limit the strength of claims about real-world applicability. We will revise the manuscript to address the specific concerns raised.
Point-by-point responses
- Referee: [Abstract and §1 (model description)] The claim that the ABM simulates 'in a scientifically sound way' both fake news dynamics and containment effects is unsupported. No calibration procedure, comparison of simulated cascade sizes or engagement rates to empirical distributions (e.g., Twitter/Facebook datasets), performance metrics, baselines, or sensitivity analysis are reported, leaving the reliability of the DRL-derived policy cues unknown.
  Authors: We accept this criticism. The phrase 'in a scientifically sound way' in the abstract and introduction is not supported by the reported experiments, which remain exploratory. The ABM draws on standard mechanisms from the literature on information diffusion (e.g., threshold models and network topologies), but no calibration against empirical cascade-size distributions or engagement rates was performed. We will revise the abstract and Section 1 to remove or qualify this phrasing, explicitly state the preliminary and illustrative nature of the results, add a dedicated limitations subsection discussing the absence of empirical validation, and outline concrete steps for future calibration and sensitivity analysis. Revision: yes
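The calibration step the authors commit to could start with something as simple as a two-sample Kolmogorov–Smirnov distance between simulated and observed cascade-size samples. The sketch below uses only the standard library; both samples are synthetic and purely illustrative:

```python
import bisect

# Two-sample KS statistic: the largest gap between the two empirical CDFs.
def ks_statistic(a, b):
    a, b = sorted(a), sorted(b)

    def ecdf(sample, x):
        # Fraction of the sample that is <= x.
        return bisect.bisect_right(sample, x) / len(sample)

    values = sorted(set(a) | set(b))
    return max(abs(ecdf(a, v) - ecdf(b, v)) for v in values)

# Hypothetical cascade-size samples (number of shares per fake-news item).
simulated = [1, 2, 2, 3, 5, 8, 13, 21, 34]
empirical = [1, 1, 2, 3, 4, 6, 10, 18, 40]
d = ks_statistic(simulated, empirical)
print(f"KS distance: {d:.3f}")
```

In practice one would compare against real trace data (e.g., Twitter cascade datasets) and use a proper test with significance levels, such as `scipy.stats.ks_2samp`, rather than the raw distance alone.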
Circularity Check
No circularity: ABM generates independent trajectories; DRL optimizes policies without self-referential reduction
Full rationale
The paper describes an integration of an agent-based model (ABM) to simulate misinformation dynamics and deep reinforcement learning (DRL) to optimize mitigation strategies. No equations, fitted parameters, or predictions are presented that reduce by construction to the model's inputs. The ABM produces simulation trajectories, and the RL component learns policies from those trajectories without the outputs being fed back to redefine the ABM or its parameters. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The preliminary experimental cues arise directly from running the described simulation-optimization loop, which is self-contained and does not exhibit any of the enumerated circular patterns. Lack of external calibration is a validity concern, not a circularity issue.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The agent-based model faithfully represents real-world information disorder dynamics.
Reference graph
Works this paper leans on
[1] Roberto Abbruzzese, Angelo Gaeta, Vincenzo Loia, Luigi Lomasto, and Francesco Orciuoli. Detecting influential news in online communities: an approach based on hexagons of opposition generated by three-way decisions and probabilistic rough sets. Information Sciences, 578:364–377, 2021.
[2] Mustafa Alassad, Muhammad Nihal Hussain, and Nitin Agarwal. Developing an agent-based model to minimize spreading of malicious information in dynamic social networks. Computational and Mathematical Organization Theory, pages 1–16, 2023.
[3] Husam M Alawadh, Amerah Alabrah, Talha Meraj, and Hafiz Tayyab Rauf. Attention-enriched mini-BERT fake news analyzer using the Arabic language. Future Internet, 15(2):44, 2023.
[4] Hunt Allcott and Matthew Gentzkow. Social media and fake news in the 2016 election. Journal of Economic Perspectives, 31(2):211–236, 2017.
[5] Miguel A Alonso, David Vilares, Carlos Gómez-Rodríguez, and Jesús Vilares. Sentiment analysis for fake news detection. Electronics, 10(11):1348, 2021.
[6] Nicola Capuano, Giuseppe Fenza, Vincenzo Loia, and Francesco David Nota. Content-based fake news detection with machine and deep learning: a systematic review. Neurocomputing, 2023.
[7] Jesse Clifton and Eric Laber. Q-learning: theory and applications. Annual Review of Statistics and Its Application, 7:279–301, 2020.
[8] Rosaria Conte and Mario Paolucci. On agent-based modeling and computational social science. Frontiers in Psychology, 5:668, 2014.
[9] Federico Cozza, Alfonso Guarino, Francesco Isernia, Delfina Malandrino, Antonio Rapuano, Raffaele Schiavone, and Rocco Zaccagnino. Hybrid and lightweight detection of third party tracking: design, implementation, and evaluation. Computer Networks, 167:106993, 2020.
[10] Yiqi Dong, Dongxiao He, Xiaobao Wang, Yawen Li, Xiaowen Su, and Di Jin. A generalized deep Markov random fields framework for fake news detection. IJCAI International Joint Conference on Artificial Intelligence, 2023-August:4758–4765, 2023.
[11] P. Erdős and A. Rényi. On random graphs I. Publ. Math. Debrecen, 6(290–297):18, 1959.
[12] Seth Flaxman, Sharad Goel, and Justin M Rao. Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly, 80(S1):298–320, 2016.
[13] Linton C Freeman. Centrality in social networks: conceptual clarification. Social Networks, 1(3):215–239, 1978.
[14] Angelo Gaeta, Vincenzo Loia, Luigi Lomasto, and Francesco Orciuoli. A novel approach based on rough set theory for analyzing information disorder. Applied Intelligence, 53(12):15993–16014, 2023.
[15] Anna Gausen, Wayne Luk, and Ce Guo. Can we stop fake news? Using agent-based modelling to evaluate countermeasures for misinformation on social media. 2021.
[16] Marc Jaxa-Rozen and Jan H. Kwakkel. PyNetLogo: linking NetLogo with Python. Journal of Artificial Societies and Social Simulation, 21(2):4, 2018.
[17] Nicola Lettieri, Alfonso Guarino, Delfina Malandrino, and Rocco Zaccagnino. Knowledge mining and social dangerousness assessment in criminal justice: metaheuristic integration of machine learning and graph-based inference. Artificial Intelligence and Law, 31(4):653–702, 2023.
[18] Nicola Lettieri. Computational social science, the evolution of policy design and rule making in smart societies. Future Internet, 8(2):19, 2016.
[19] Krzysztof Małecki and Sergiusz Puścian. Multi-agent surveillance system of fake news spreading in scale-free networks. Procedia Computer Science, 207:2232–2241, 2022.
[20] Virginia Morini, Laura Pollacci, and Giulio Rossetti. Toward a standard approach for echo chamber detection: Reddit case study. Applied Sciences, 11(12):5390, 2021.
[21] Ahmadreza Mosallanezhad, Mansooreh Karami, Kai Shu, Michelle V Mancenido, and Huan Liu. Domain adaptive fake news detection via reinforcement learning. In Proceedings of the ACM Web Conference 2022, pages 3632–3640, 2022.
[22] Gordon Pennycook and David G Rand. The psychology of fake news. Trends in Cognitive Sciences, 25(5):388–402, 2021.
[23] Giancarlo Ruffo, Alfonso Semeraro, Anastasia Giachanou, and Paolo Rosso. Studying fake news spreading, polarisation dynamics, and manipulation by bots: a tale of networks and language. Computer Science Review, 47:100531, 2023.
[24] Petter Törnberg. Echo chambers and viral misinformation: modeling fake news as complex contagion. PLoS ONE, 13(9):e0203958, 2018.
[25] Shuxin Yang, Quanming Du, Guixiang Zhu, Jie Cao, Lei Chen, Weiping Qin, and Youquan Wang. Balanced influence maximization in social networks based on deep reinforcement learning. Neural Networks, 2023.
[26] Rocco Zaccagnino, Nicola Lettieri, Delfina Malandrino, Luigi Lomasto, Andrea Camoia, and Alfonso Guarino. Turning AI into a regulatory sandbox: exploring information disorder mitigation strategies with ABM and deep reinforcement learning. Neural Computing and Applications, 37(22):18679–18720, 2025.