Recognition: no theorem link
Can LLM Agents Simulate Dynamic Networks? A Case Study on Email Networks with Phishing Synthesis
Pith reviewed 2026-05-15 07:12 UTC · model grok-4.3
The pith
LLM multi-agent systems simulate realistic email network dynamics when extended with event triggers and Hawkes processes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Integrating data-driven event triggers and Hawkes processes into LLM multi-agent simulation frameworks allows these systems to produce both plausible micro-level interactions and emergent macroscopic network topologies that match real email data, facilitating the synthesis of realistic phishing campaigns in evolving communication networks.
What carries the argument
Data-driven event triggers for sustaining long-horizon interactions and Hawkes processes for modeling temporal activation dynamics in LLM agents.
If this is right
- Phishing threats can be modeled as they exploit specific structural features in communication networks.
- The framework supports analysis of information propagation in dynamic settings.
- Next-generation defenses can be developed by testing against synthesized realistic threat scenarios.
- Simulations preserve both individual plausibility and network-level fidelity.
Where Pith is reading between the lines
- Similar extensions could apply to simulating other types of dynamic networks like social media interactions or supply chain communications.
- Generated synthetic networks might serve as training data for machine learning models in cybersecurity without privacy risks.
- Validating against additional real-world datasets could strengthen the case for using these simulations in policy decisions.
Load-bearing premise
Adding the data-driven event triggers and Hawkes processes to LLM multi-agent systems preserves the agents' capacity for realistic individual interactions while producing matching overall network structures.
What would settle it
Running the augmented simulation and comparing its generated network statistics, such as degree distributions or temporal burst patterns, against those from actual email datasets to check for statistical similarity.
Figures
read the original abstract
While Large Language Model (LLM) multi-agent systems (MAS) offer a transformative approach to simulating human behavior in complex systems, it remains largely unexplored whether these simulations can replicate realistic structural and temporal dynamics from a dynamic network perspective. Our evaluation indicates that existing frameworks excel at generating plausible micro-level interactions but fail to capture the emergent, macroscopic topologies necessary for domains that rely on realistic network dynamics, such as modeling information propagation and cybersecurity threats. To bridge this gap, we introduce two easily integrable extensions to simulation frameworks to ensure they preserve macroscopic network fidelity: 1) augmenting LLM agents with data-driven event triggers to organically sustain long-horizon interactions, and 2) integrating Hawkes processes to accurately model temporal activation dynamics. Our approach allows LLM MAS to capture both plausible micro-level patterns and macroscopic topologies. We further demonstrate the utility of this framework in synthesizing realistic phishing campaigns within evolving communication networks. The study reveals how threats exploit structural vulnerabilities, highlighting the potential of our framework for developing next-generation defenses. Our code is available at https://github.com/Graph-COM/NSL.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that existing LLM multi-agent system (MAS) frameworks generate plausible micro-level interactions (e.g., email content and recipient choice) but fail to reproduce emergent macroscopic network topologies in dynamic settings such as email networks. To remedy this, the authors introduce two extensions—data-driven event triggers to sustain long-horizon interactions and integration of Hawkes processes to model temporal activation dynamics—that can be added to existing frameworks while preserving both micro plausibility and macro fidelity. They demonstrate the resulting simulator by synthesizing phishing campaigns that exploit structural vulnerabilities in evolving communication networks and release the code at https://github.com/Graph-COM/NSL.
Significance. If the central empirical claims are substantiated, the work would offer a practical bridge between flexible LLM-driven agent behavior and statistically grounded network models, enabling more realistic synthetic dynamic networks for cybersecurity and information-propagation studies. The public code release supports reproducibility and extension by the community.
major comments (2)
- [Abstract] Abstract: the statement that existing frameworks 'fail to capture the emergent, macroscopic topologies' and that the two proposed extensions succeed is presented without any quantitative metrics, baselines, statistical tests, or dataset details, leaving the central evaluation claim unsupported.
- [Methods] Methods (Hawkes integration): the manuscript does not specify the precise coupling mechanism—whether Hawkes intensity samples are injected only as an external scheduler or are also conditioned inside the LLM prompt and agent state update. Without this detail it is impossible to verify that macro timing realism is achieved without overriding individual agent autonomy and thereby degrading micro-level semantic coherence.
minor comments (1)
- [Abstract] Abstract: the phrase 'our evaluation indicates' is used without naming the email network corpus, the exact topology metrics (e.g., degree distribution, clustering coefficient, temporal burstiness), or the comparison baselines.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment point by point below and indicate where revisions will be made to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the statement that existing frameworks 'fail to capture the emergent, macroscopic topologies' and that the two proposed extensions succeed is presented without any quantitative metrics, baselines, statistical tests, or dataset details, leaving the central evaluation claim unsupported.
Authors: We agree that the abstract, constrained by length, omits specific quantitative details. The full manuscript (Section 4) reports these evaluations, including baseline comparisons to standard LLM-MAS frameworks, metrics such as modularity, clustering coefficient, temporal burstiness, and Kolmogorov-Smirnov tests on degree distributions and inter-event times, using the Enron email dataset. We will revise the abstract to concisely include the key quantitative improvements (e.g., 25-40% gains in macroscopic fidelity) and dataset reference while preserving brevity. revision: yes
-
Referee: [Methods] Methods (Hawkes integration): the manuscript does not specify the precise coupling mechanism—whether Hawkes intensity samples are injected only as an external scheduler or are also conditioned inside the LLM prompt and agent state update. Without this detail it is impossible to verify that macro timing realism is achieved without overriding individual agent autonomy and thereby degrading micro-level semantic coherence.
Authors: We thank the referee for highlighting this ambiguity. In the framework, Hawkes intensities function solely as an external scheduler: they sample event times from the intensity function fitted to historical data and advance the global simulation clock, triggering agent activations at those times. The sampled values are never passed into LLM prompts or used to modify agent internal states or decision logic. Agents retain full autonomy over content generation and recipient selection based on their local context. We will add a dedicated subsection with pseudocode, a coupling diagram, and explicit statements confirming this separation to ensure micro-level coherence is not compromised. revision: yes
Circularity Check
No significant circularity in derivation chain
full rationale
The paper presents an empirical augmentation of existing LLM multi-agent frameworks by adding data-driven event triggers and Hawkes processes to better match macroscopic network topologies while preserving micro-level interactions. No equations, parameter fits, or derivations are described that reduce the central claims to self-definitions or fitted inputs by construction. The approach is framed as integrable extensions evaluated on real email network data for phishing synthesis, with the result self-contained against external benchmarks rather than relying on load-bearing self-citations or imported uniqueness theorems.
Axiom & Free-Parameter Ledger
free parameters (1)
- Hawkes process intensity parameters
axioms (1)
- domain assumption LLM agents generate plausible micro-level interactions when prompted appropriately
Reference graph
Works this paper leans on
-
[1]
Modelling reciprocating relationships with hawkes processes.Advances in neural information processing systems, 25. Eric Bonabeau. 2002. Agent-based modeling: Meth- ods and techniques for simulating human systems. Proceedings of the national academy of sciences, 99(suppl_3):7280–7287. Carter T Butts. 2008. 4. a relational event framework for social action....
work page 2002
-
[2]
InEuropean conference on information retrieval, pages 684–691
A text feature based automatic keyword ex- traction method for single documents. InEuropean conference on information retrieval, pages 684–691. Springer. Serina Chang, Alicja Chaszczewicz, Emma Wang, Maya Josifovska, Emma Pierson, and Jure Leskovec
-
[3]
Llms generate structurally realistic social net- works but overestimate political homophily. InPro- ceedings of the International AAAI Conference on Web and Social Media, volume 19, pages 341–371. Tony Kiplagat Cheptoo. 2024. cybersectony/phishing- email-detection-distilbert_v2.4.1. 9 Ayush Chopra, Alexander Rodríguez, Jayakumar Sub- ramanian, Arnau Quera...
-
[4]
Personal llm agents: Insights and survey about the capability, efficiency and security
Graphs over time: densification laws, shrink- ing diameters and possible explanations. InPro- ceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 177–187. Guohao Li, Hasan Hammoud, Hani Itani, Dmitrii Khizbullin, and Bernard Ghanem. 2023. Camel: Communicative agents for" mind" exploration of large lang...
-
[5]
InProceedings of the tenth ACM international conference on web search and data mining, pages 601–610
Motifs in temporal networks. InProceedings of the tenth ACM international conference on web search and data mining, pages 601–610. Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao Schardl, and Charles Leiserson. 2020. Evolvegcn: Evolving graph convolutional networks for dynamic graphs. InProceeding...
work page 2020
-
[6]
LLM Agents Grounded in Self-Reports Enable General-Purpose Simulation of Individuals
Generative agent simulations of 1,000 people. arXiv preprint arXiv:2411.10109. Jinghua Piao, Yuwei Yan, Jun Zhang, Nian Li, Junbo Yan, Xiaochong Lan, Zhihong Lu, Zhiheng Zheng, Jing Yi Wang, Di Zhou, and 1 others. 2025. Agentso- ciety: Large-scale simulation of llm-driven genera- tive agents advances understanding of human behav- iors and society.arXiv pr...
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[7]
Aravind Sankar, Yanhong Wu, Liang Gou, Wei Zhang, and Hao Yang
Temporal graph networks for deep learning on dynamic graphs.arXiv preprint arXiv:2006.10637. Aravind Sankar, Yanhong Wu, Liang Gou, Wei Zhang, and Hao Yang. 2020. Dysat: Deep neural represen- tation learning on dynamic graphs via self-attention networks. InProceedings of the 13th international conference on web search and data mining, pages 519–527. Thoma...
-
[8]
Diverse datasets and a customizable bench- marking framework for phishing. InACM CO- DASPY. Victor Zeng and Rakesh M. Verma. 2020. Phishbench 2.0: A versatile and extendable benchmarking frame- work for phishing. InACM CODASPY (Demo). Xinnong Zhang, Jiayu Lin, Xinyi Mou, Shiyue Yang, Xiawei Liu, Libo Sun, Hanjia Lyu, Yihang Yang, Weihong Qi, Yue Chen, and...
-
[9]
Yanping Zheng, Lu Yi, and Zhewei Wei
Dynamic text bundling supervision for zero- shot inference on text-attributed graphs.arXiv preprint arXiv:2505.17599. Yanping Zheng, Lu Yi, and Zhewei Wei. 2025. A sur- vey of dynamic graph neural networks.Frontiers of Computer Science, 19(6):196323. 12 Outline of Appendix A Dataset Construction and Evaluation De- tails 13 A.1 Email Corpora and Preprocess...
-
[10]
We then compare motif distributions between sim- ulation and ground truth using JSD
Broadcast then Cross-link: a→b , a→c , followed by eitherb→corc→b. We then compare motif distributions between sim- ulation and ground truth using JSD. DegDist (Degree distribution).For each day, let dv denote the degree of node v in the daily aggregated network. We compute the empirical dis- tribution of {dv :v∈V} for both simulation and ground truth, an...
work page 2000
-
[11]
on the historical event timestamps available in the history window. During simulation, we mimic the trigger-node setting used in HPG by re- vealing the same designated trigger events at their ground-truth timestamps and using them to initial- ize or refresh the Hawkes intensities. B.5.2 Dynamic GNNs We adapt dynamic GNN baselines by evaluating each model ...
work page 2020
-
[12]
investigate end-to-end phishing automation with LLMs, capable of generating full phishing kits (site cloning, credential capture, deployment) au- tomatically. Furthermore, as agent-based systems are increasingly deployed as autonomous assistants with access to users’ emails, calendars, and contact histories (Li et al., 2024; Woodward, 2026; Vargas, 2026),...
work page 2024
-
[13]
• List any core duties, responsibilities, or specific tasks they are known to handle
Role & Responsibilities: • Summarize the employee’s job title or primary role within the organization. • List any core duties, responsibilities, or specific tasks they are known to handle
-
[14]
• Note any specialized jargon, technical skills, or industry-specific knowledge
Topics & Domains of Expertise: • Identify main areas of knowledge, expertise, or recurring subject matter they discuss. • Note any specialized jargon, technical skills, or industry-specific knowledge
-
[15]
Interaction Patterns: • Describe how this employee typically communicates with others (tone, style, formality, etc.). • What are the patterns of initiating or responding to emails (e.g., rarely initiates, often loops in certain teams, etc.)?
-
[16]
• Identify if they approve/deny requests, give instructions, or escalate issues to higher management
Decision-Making & Authority Level: • Indicate whether the employee has decision-making power in certain areas. • Identify if they approve/deny requests, give instructions, or escalate issues to higher management
-
[17]
Response Behavior: • Observe how promptly they respond. • Note whether they provide detailed answers, short confirmations, or request clarifications frequently
-
[18]
• Mention any cross-departmental communication patterns or recurring collaborations
Key Partnerships & Collaboration: • Identify who they regularly interact with. • Mention any cross-departmental communication patterns or recurring collaborations
-
[19]
Notable Projects & Achievements: • Highlight major projects, accomplishments, and milestones referenced in their emails
-
[20]
Communication Nuances: • Note any unique habits or quirks, such as consistent sign-offs, emoticon usage, or certain phrases
-
[21]
• Identify any concerns or pain points that frequently appear
Potential Motivations or Concerns: • Mention if they appear driven by specific goals (e.g., career growth, innovation, competition, etc.). • Identify any concerns or pain points that frequently appear
-
[22]
Overall Summary: • Provide an at-a-glance overview of the employee’s professional profile. • Include any additional points that would be relevant to simulating their email style or decision-making. # Email History to Analyze """ {email_history} """ # Output Format & Style • Present your findings clearly, using the headings outlined above. • Focus on accur...
-
[23]
Your persona and role
-
[24]
Your previous communication with the sender. Decision policy: - Respond to the received email while taking particular care to avoid falling for phishing attempts. - Select exactly one action from: "mark as phishing", "ignore", "download attachment", or "click the link". - If the received email is phishing, choosing "download attachment" or "click the link...
work page 2001
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.