
arxiv: 2511.07176 · v3 · submitted 2025-11-10 · 💻 cs.NI · cs.CL

Graph Representation-based Model Poisoning on the Heterogeneous Internet of Agents

Pith reviewed 2026-05-17 23:59 UTC · model grok-4.3

classification 💻 cs.NI cs.CL
keywords model poisoning · federated fine-tuning · Internet of Agents · graph representation · variational graph autoencoder · LLM agents · adversarial attack · distributed learning

The pith

A graph representation attack generates stealthy malicious updates that degrade LLM accuracy in federated Internet of Agents training while evading detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a graph representation-based model poisoning attack called GRMP for federated fine-tuning in heterogeneous Internet of Agents systems. It constructs a feature correlation graph from overheard benign updates and trains a variational graph autoencoder to produce malicious updates that embed adversarial goals while preserving statistical similarity to legitimate ones. These updates are optimized through augmented Lagrangian and subgradient descent methods. Experiments across multiple LLM models show substantial accuracy drops that current defenses fail to catch. This demonstrates a practical vulnerability in the distributed collaboration that defines the IoA paradigm.

Core claim

The GRMP attack exploits overheard benign updates to construct a feature correlation graph and employs a variational graph autoencoder to capture structural dependencies and generate malicious updates. A novel attack algorithm based on augmented Lagrangian and subgradient descent optimizes these updates to preserve benign-like statistics while embedding adversarial objectives, allowing them to substantially decrease accuracy across different LLM models while remaining consistent with benign updates and evading existing defense mechanisms.
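For readers unfamiliar with the optimization machinery named above, the generic textbook form of an augmented Lagrangian with a subgradient primal step is sketched below. This is not the paper's formulation: the objective $f$, the constraint function $h$, the penalty weight $\rho$, and the step size $\eta_k$ are placeholder symbols assumed here for illustration, and the summary does not state which constraints the authors actually use.

```latex
% Generic equality-constrained problem (placeholder symbols, not from the paper):
%   minimize f(w)  subject to  h(w) = 0
\begin{align}
  \mathcal{L}_\rho(w, \lambda)
    &= f(w) + \lambda^\top h(w) + \frac{\rho}{2}\,\lVert h(w) \rVert_2^2, \\
  w^{k+1}
    &= w^{k} - \eta_k\, s^{k},
    \qquad s^{k} \in \partial_w \mathcal{L}_\rho(w^{k}, \lambda^{k}), \\
  \lambda^{k+1}
    &= \lambda^{k} + \rho\, h(w^{k+1}).
\end{align}
```

The first line adds a quadratic penalty to the ordinary Lagrangian, the second takes a subgradient step in the primal variable, and the third is the standard multiplier update.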

What carries the argument

Feature correlation graph built from overheard benign updates, processed by a variational graph autoencoder to model and replicate structural dependencies for stealthy malicious model updates.
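As a rough illustration of what a correlation graph over a set of updates could look like, the sketch below stacks update vectors into a feature matrix and derives an adjacency matrix from pairwise cosine similarity. The similarity measure, the threshold, and the function name `feature_correlation_graph` are all assumptions made for this example; the summary does not specify how the paper builds its graph.

```python
import numpy as np

def feature_correlation_graph(updates, threshold=0.5):
    """Return (F, A): a feature matrix and a thresholded similarity adjacency.

    updates: array-like of shape (n_agents, n_features).
    The cosine measure and the threshold are illustrative assumptions.
    """
    F = np.asarray(updates, dtype=float)
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    norms = np.where(norms == 0, 1.0, norms)   # avoid division by zero
    unit = F / norms
    sim = unit @ unit.T                        # pairwise cosine similarity
    A = (sim >= threshold).astype(float)       # keep only strong correlations
    np.fill_diagonal(A, 0.0)                   # no self-loops
    return F, A

# Hypothetical toy updates: two aligned vectors and one orthogonal vector.
F, A = feature_correlation_graph([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
print(A)
```

The pair `(F, A)` is the kind of input a graph autoencoder typically consumes: node features plus an adjacency structure.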

If this is right

  • The GRMP attack substantially decreases accuracy across different LLM models.
  • Malicious updates remain statistically consistent with benign updates.
  • Existing defense mechanisms are evaded by the GRMP attack.
  • This reveals a severe threat to FFT-enabled IoA systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Graph-based poisoning methods could extend to other heterogeneous federated learning environments with distributed agents.
  • Secure channels for sharing model updates would limit the attacker's ability to overhear data and build the correlation graph.
  • Defenses might shift toward detecting anomalies in update graph structures rather than relying solely on statistical checks.

Load-bearing premise

The attacker can reliably overhear enough benign updates to build a representative feature correlation graph that the variational graph autoencoder generalizes to the actual aggregation step.

What would settle it

An experiment showing that the generated malicious updates are statistically distinguishable from benign updates or produce no substantial accuracy drop in the aggregated global model would disprove the attack's effectiveness and undetectability.
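One simple way to probe the distinguishability half of this test is a cosine-similarity screen against a robust reference update. The sketch below is a baseline check assumed for illustration, not a defense evaluated in the paper; the coordinate-wise median reference, the threshold, and the function names are hypothetical choices.

```python
import numpy as np

def cosine_to_reference(updates):
    """Cosine similarity of each update to the coordinate-wise median update."""
    U = np.asarray(updates, dtype=float)
    ref = np.median(U, axis=0)
    denom = np.linalg.norm(U, axis=1) * np.linalg.norm(ref)
    denom = np.where(denom == 0, 1.0, denom)   # guard against zero vectors
    return (U @ ref) / denom

def flag_dissimilar(updates, threshold=0.5):
    """Indices of updates whose similarity to the reference falls below threshold."""
    sims = cosine_to_reference(updates)
    return [int(i) for i in np.where(sims < threshold)[0]]

# Hypothetical round: five similar updates and one pointing the opposite way.
round_updates = [[1.0, 1.0], [1.0, 1.1], [1.1, 1.0],
                 [0.9, 1.0], [1.0, 0.9], [-1.0, -1.0]]
print(flag_dissimilar(round_updates))
```

A crude screen like this only catches updates that differ in direction. The paper's claim is precisely that GRMP updates stay consistent with benign ones, so a settling experiment would need to show whether any such statistic separates them.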

Figures

Figures reproduced from arXiv: 2511.07176 by Hanlin Cai, Haofan Dong, Houtianfu Wang, Kai Li, Ozgur B. Akan, Sai Zou.

Figure 1: (a) Training process of the FL-enabled IoA system, and (b) impact of the GRMP attack on the IoA training cycle.
Figure 2: Framework of the proposed GRMP attack. Given F(t) and A(t), the topological structure of the graph G can be constructed. The VGAE comprises a graph convolutional network (GCN) encoder and decoder. The encoder then maps G into a lower-dimensional representation. We design the encoder based on an M-layer GCN architecture, which learns a representation that captures the internal features of G. The encoded re…
Figure 5: Learning accuracy of local LLM agents with no attack over 20 …
Figure 4: Temporal evolution of cosine similarity for each LLM agent with …
Original abstract

Internet of Agents (IoA) envisions a unified, agent-centric paradigm where heterogeneous large language model (LLM) agents can interconnect and collaborate at scale. Within this paradigm, federated fine-tuning (FFT) serves as a key enabler that allows distributed LLM agents to co-train an intelligent global LLM without centralizing local datasets. However, the FFT-enabled IoA systems remain vulnerable to model poisoning attacks, where adversaries can upload malicious updates to the server to degrade the performance of the aggregated global LLM. This paper proposes a graph representation-based model poisoning (GRMP) attack, which exploits overheard benign updates to construct a feature correlation graph and employs a variational graph autoencoder to capture structural dependencies and generate malicious updates. A novel attack algorithm is developed based on augmented Lagrangian and subgradient descent methods to optimize malicious updates that preserve benign-like statistics while embedding adversarial objectives. Experimental results show that the proposed GRMP attack can substantially decrease accuracy across different LLM models while remaining statistically consistent with benign updates, thereby evading detection by existing defense mechanisms and underscoring a severe threat to the ambitious IoA paradigm.
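To make the aggregation step in the abstract concrete, the sketch below shows a weighted average of client updates in the style of FedAvg. The summary does not state which aggregation rule the paper uses, so the rule, the function name `fedavg`, and the toy numbers are assumptions for illustration only.

```python
import numpy as np

def fedavg(updates, weights=None):
    """Weighted average of client updates (FedAvg-style; an assumed rule).

    updates: array-like of shape (n_clients, n_params).
    weights: optional per-client weights, e.g. local dataset sizes.
    """
    U = np.asarray(updates, dtype=float)
    if weights is None:
        weights = np.ones(U.shape[0])
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                    # normalize so the weights sum to 1
    return w @ U

# Hypothetical example: a client with three times the data dominates the mean.
print(fedavg([[1.0, 1.0], [3.0, 3.0]], weights=[1, 3]))
```

Because every submitted update enters the average, a single uploaded update shifts the global model in proportion to its weight, which is why the server-side screening discussed in the paper matters.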

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a graph representation-based model poisoning (GRMP) attack targeting federated fine-tuning in heterogeneous Internet of Agents (IoA) systems. It constructs a feature correlation graph from overheard benign updates, employs a variational graph autoencoder to model structural dependencies, and optimizes malicious updates via augmented Lagrangian and subgradient descent to preserve benign-like statistics while pursuing adversarial objectives. The central claim is that this attack substantially reduces accuracy across LLM models while remaining statistically consistent with benign updates, thereby evading existing defenses.

Significance. If the results hold with proper validation, the work demonstrates a sophisticated poisoning vector that exploits graph-based representations and constrained optimization to bypass statistical anomaly detection in federated LLM aggregation. This could motivate stronger defenses for heterogeneous agent systems and highlight risks in scaling IoA paradigms.

major comments (2)
  1. [Experimental Results] The central claim depends on the assumption that overheard benign updates suffice to build a representative feature correlation graph whose structural dependencies generalize via the variational graph autoencoder to the target aggregation round. However, in a heterogeneous IoA setting with differing LLM architectures and data distributions, this transfer is not supported by quantitative evidence such as graph reconstruction error, transfer metrics, or ablation on update volume (see Experimental Results section).
  2. [Abstract and Experimental Results] The abstract and experimental description report accuracy degradation and statistical consistency with benign updates but provide no specific quantitative metrics (e.g., exact accuracy drops, number of overheard updates, baseline comparisons, or statistical tests), baseline defense evaluations, or details on how the augmented Lagrangian optimization was validated against the attack success metric.
minor comments (2)
  1. [Method] Notation for the feature correlation graph and VGAE latent space could be clarified with explicit definitions of variables and dimensions to aid reproducibility.
  2. [Related Work] The paper would benefit from additional references to prior work on graph-based anomaly detection in federated learning and model poisoning in LLM fine-tuning.
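The graph reconstruction error requested in the first major comment could be reported along the following lines. The sketch assumes an inner-product decoder with a sigmoid, which is the common choice for variational graph autoencoders but may differ from the GCN decoder described in the Figure 2 caption; the function names and toy embeddings are hypothetical.

```python
import numpy as np

def reconstruct_adjacency(Z):
    """Inner-product decoder: edge probability = sigmoid(z_i . z_j). Assumed decoder."""
    logits = np.asarray(Z, dtype=float) @ np.asarray(Z, dtype=float).T
    return 1.0 / (1.0 + np.exp(-logits))

def reconstruction_bce(A, A_hat, eps=1e-12):
    """Mean binary cross-entropy between a 0/1 adjacency A and probabilities A_hat."""
    A = np.asarray(A, dtype=float)
    A_hat = np.clip(A_hat, eps, 1.0 - eps)     # keep log() finite
    return float(-np.mean(A * np.log(A_hat) + (1.0 - A) * np.log(1.0 - A_hat)))

# Toy graph with self-loops: nodes 0 and 1 are linked, node 2 stands alone.
A = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1]])
Z_good = np.array([[3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])   # matches the structure
Z_bad = np.array([[3.0, 0.0], [0.0, 3.0], [3.0, 0.0]])    # groups the wrong nodes
print(reconstruction_bce(A, reconstruct_adjacency(Z_good)))
print(reconstruction_bce(A, reconstruct_adjacency(Z_bad)))
```

Reporting this loss on held-out rounds, and as a function of how many updates were used to build the graph, would give the kind of transfer evidence the referee asks for.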

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review of our manuscript. We address each major comment point by point below and indicate the revisions planned for the next version.

  1. Referee: [Experimental Results] The central claim depends on the assumption that overheard benign updates suffice to build a representative feature correlation graph whose structural dependencies generalize via the variational graph autoencoder to the target aggregation round. However, in a heterogeneous IoA setting with differing LLM architectures and data distributions, this transfer is not supported by quantitative evidence such as graph reconstruction error, transfer metrics, or ablation on update volume (see Experimental Results section).

    Authors: We agree that additional quantitative support for generalization across heterogeneous LLM architectures and data distributions would strengthen the presentation. The current experiments construct the feature correlation graph from overheard benign updates and apply the variational graph autoencoder to generate updates for target aggregation rounds, but we did not report graph reconstruction errors, explicit transfer metrics, or ablations on the volume of overheard updates. In the revised manuscript we will add these elements, including reconstruction error metrics for the variational graph autoencoder and an ablation study on the number of overheard updates to demonstrate the robustness of the learned structural dependencies. revision: yes

  2. Referee: [Abstract and Experimental Results] The abstract and experimental description report accuracy degradation and statistical consistency with benign updates but provide no specific quantitative metrics (e.g., exact accuracy drops, number of overheard updates, baseline comparisons, or statistical tests), baseline defense evaluations, or details on how the augmented Lagrangian optimization was validated against the attack success metric.

    Authors: We acknowledge that greater specificity in the reported metrics would improve clarity and reproducibility. The experimental results demonstrate accuracy degradation and statistical consistency, yet exact numerical values, the precise count of overheard updates, baseline comparisons with statistical tests, and validation details for the augmented Lagrangian optimization are not stated explicitly. We will revise both the abstract and the Experimental Results section to include these quantitative details, report the number of overheard updates used, add statistical significance tests, include evaluations against baseline defenses, and describe how the augmented Lagrangian and subgradient descent procedure was validated against the attack success criterion while preserving benign-like statistics. revision: yes
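The statistical significance tests promised in this response could take many forms. As one hedged possibility, the sketch below runs a two-sided permutation test on the difference in mean between two groups of per-update statistics, such as cosine similarities. The choice of test, the function name `permutation_pvalue`, and the sample values are assumptions, not details from the paper.

```python
import numpy as np

def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in means of a and b."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)             # shuffle the group labels
        diff = abs(perm[:a.size].mean() - perm[a.size:].mean())
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimate strictly positive.
    return (count + 1) / (n_perm + 1)

# Hypothetical similarity scores for two groups of updates.
group_a = [0.99, 0.98, 0.97, 0.99, 0.98]
group_b = [0.10, 0.20, 0.15, 0.12, 0.18]
print(permutation_pvalue(group_a, group_b))
```

A small p-value indicates that the two groups are separable on that statistic; a large one is consistent with the claimed statistical consistency but does not prove it.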

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external optimization and empirical validation


The paper's core construction uses overheard benign updates to build a feature correlation graph, applies a variational graph autoencoder for structural dependencies, and optimizes malicious updates via augmented Lagrangian and subgradient descent to preserve benign-like statistics while pursuing adversarial goals. This objective is defined independently of the final success metrics (accuracy degradation and evasion), with no self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations evident. Experimental results are presented as separate validation rather than tautological outcomes. The approach is self-contained against the described threat model without reducing the central claim to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that overheard benign updates provide sufficient structural information to train a generative model capable of producing undetectable adversarial updates; no free parameters are explicitly fitted in the abstract description, and no new physical or mathematical entities are postulated.

axioms (2)
  • Domain assumption: Overheard benign updates contain representative feature correlations that can be modeled by a variational graph autoencoder.
    Invoked in the construction of the feature correlation graph and subsequent generation of malicious updates.
  • Domain assumption: Existing statistical defenses rely on simple distributional tests that can be evaded by preserving benign-like moments.
    Used to claim evasion of detection mechanisms.

pith-pipeline@v0.9.0 · 5505 in / 1412 out tokens · 21122 ms · 2026-05-17T23:59:47.837844+00:00 · methodology



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Graph Representation Learning Augmented Model Manipulation on Federated Fine-Tuning of LLMs

    cs.LG · 2026-05 · unverdicted · novelty 5.0

    Graph representation learning plus iterative augmented Lagrangian optimization creates stronger, harder-to-detect model manipulation attacks on federated LLM fine-tuning, cutting global accuracy by up to 26%.
