From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperability

Bijaya Dangol

arXiv: 2606.07150 · v3 · pith:2W5EI7QOnew · submitted 2026-06-05 · cs.CR · cs.AI · cs.MA · cs.NI

Pith reviewed 2026-06-27 21:40 UTC · model grok-4.3

keywords: agent interoperability · communication graph metadata · workflow integrity · A2A protocol · metadata leakage · privacy properties · indistinguishability games · autonomous agents

The pith

Communication graph metadata in agent systems allows recovery of task classes at six times chance level from passive observation and even from workflow openings alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that agent-interoperability protocols expose communication graphs (who contacts whom, when, and how often), creating risks to workflow integrity that exceed standard privacy concerns. Endpoints carry capability labels and interactions tie directly to autonomous actions, so an observer can identify a recurring workflow from its start and potentially intervene before completion. Experiments on real A2A traffic show that a label-blind classifier recovers task class from metadata at six times chance, including from the opening sequence alone. A defense-aware adversary does not overturn this result, and only the full set of defined privacy properties brings recovery near chance. Recovering a workflow's class and being able to act on it under a budget limit are shown to be distinct.

Core claim

The communication graph in agent systems is exposed across independent trust domains and coupled to autonomous actions, enabling an observer to recover a task's class from passive metadata at 6x chance level, including from only its opening. A defense-aware adversary does not overturn this recovery, and only the complete set of transport- and bootstrap-layer privacy properties reduces it toward chance. Under a fixed budget an adversary captures 0.63 of a clairvoyant attacker's advantage on the corpus, with 0.41 from a workflow's opening alone, governed by top-ranked precision rather than overall accuracy.
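To make the budgeted-capture figures concrete, the sketch below computes a normalized capture ratio for an adversary that can act on only a fixed number of workflows. The normalization (a blind, random adversary scores 0 in expectation and a clairvoyant one scores 1) is an assumption chosen to match the description in the Figure 5 caption; the function name `capture_ratio`, its parameters, and the example scores are hypothetical and not taken from the paper.

```python
def capture_ratio(scores, is_target, budget):
    """Normalized advantage of a budgeted adversary (illustrative definition).

    scores    -- adversary's confidence that each workflow is the target class
    is_target -- ground-truth booleans, in the same order as `scores`
    budget    -- number of workflows the adversary can afford to act on
    A random (blind) adversary gets 0.0 in expectation, a clairvoyant one 1.0.
    """
    n = len(scores)
    if n == 0 or budget <= 0:
        return 0.0
    budget = min(budget, n)
    n_target = sum(is_target)

    # Act on the `budget` highest-scored workflows; only these ranks matter.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(is_target[i] for i in ranked[:budget])

    blind = budget * n_target / n        # expected hits when acting at random
    clairvoyant = min(budget, n_target)  # hits with perfect knowledge
    if clairvoyant == blind:             # no room for any advantage
        return 0.0
    return (hits - blind) / (clairvoyant - blind)


# Hypothetical example: 8 workflows, 2 of the target class, budget of 2.
example_scores = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.4, 0.05]
example_truth = [True, False, True, False, False, False, False, False]
kappa = capture_ratio(example_scores, example_truth, budget=2)
```

Because only the top `budget` ranks enter the computation, a classifier with modest overall accuracy can still capture a large share of the advantage if its highest-confidence predictions are correct, which is the sense in which capture is governed by top-ranked precision.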

What carries the argument

The communication-graph metadata together with transport- and bootstrap-layer privacy properties equipped with indistinguishability-game semantics, tested via label-blind classification on an A2A traffic corpus and generative models.
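For readers unfamiliar with the term, an indistinguishability game is usually stated in roughly the following generic form. This is the textbook shape, not the paper's own definition; the symbols $\mathcal{A}$ (adversary), $w_0, w_1$ (two candidate workflows), $\Pi$ (the transport under test), and $\mathsf{View}_{\Pi}$ (the metadata the observer sees) are assumptions introduced here for illustration.

```latex
% Generic indistinguishability experiment (illustrative; not taken from the paper).
% The adversary picks two workflows w_0, w_1. The challenger samples a uniform
% bit b, runs w_b over transport \Pi, and gives the adversary the observable
% metadata. The adversary outputs a guess b'.
\[
  \mathsf{Adv}_{\Pi}(\mathcal{A})
  \;=\;
  \left|\, \Pr\bigl[\, b' = b \,\bigr] - \tfrac{1}{2} \,\right|,
  \qquad
  b \leftarrow_{\$} \{0,1\},
  \quad
  b' \leftarrow \mathcal{A}\bigl(\mathsf{View}_{\Pi}(w_b)\bigr).
\]
% A property is said to hold when this advantage is negligible for every
% efficient adversary; a classifier well above chance is evidence that it fails.
```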

If this is right

Only the complete set of privacy properties prevents high-accuracy recovery of workflow classes from metadata.
Acting on recovered information under a fixed budget is distinct from mere recoverability and is limited by top-ranked precision.
A2A case studies reveal that metadata-protecting bindings must address implicit identity assumptions in the protocol.
Passive metadata alone suffices for task classification without access to message content.
Workflow integrity requires protecting the graph even when content is encrypted by address-based transports.
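As a rough illustration of what passive, label-blind classification from a workflow's opening can mean in practice, the sketch below derives a few graph and timing features from the first part of an event trace and expresses accuracy as a multiple of chance. The event format, the feature set, and the function names are hypothetical; the paper's actual features are not listed in this summary. The default `fraction=0.2` echoes the early deadline f = 0.2 mentioned in the Figure 5 caption, but reading f as a fraction of events is an assumption.

```python
from collections import Counter


def opening_features(events, fraction=0.2):
    """Label-blind features from the first `fraction` of a workflow's events.

    `events` is a time-ordered list of (timestamp_seconds, src, dst) tuples.
    Endpoint identifiers are treated as opaque: only graph shape and timing
    are used, never capability labels or message content.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not events:
        return {"n_events": 0, "n_nodes": 0, "n_edges": 0,
                "max_fanout": 0, "mean_gap": 0.0}

    k = max(1, int(len(events) * fraction))  # always keep at least one event
    prefix = events[:k]

    edges = {(src, dst) for _, src, dst in prefix}
    nodes = {endpoint for edge in edges for endpoint in edge}
    fanout = Counter(src for src, _ in edges)  # distinct callees per caller
    times = [t for t, _, _ in prefix]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    return {
        "n_events": len(prefix),
        "n_nodes": len(nodes),
        "n_edges": len(edges),
        "max_fanout": max(fanout.values()),
        "mean_gap": sum(gaps) / len(gaps) if gaps else 0.0,
    }


def lift_over_chance(accuracy, n_classes):
    """Accuracy as a multiple of uniform guessing, where chance is 1 / n_classes."""
    return accuracy * n_classes


# Hypothetical trace: agent "a" fans out to "b" and "c", which both call "d".
trace = [(0.0, "a", "b"), (2.0, "a", "c"), (3.0, "b", "d"), (7.0, "c", "d")]
feats = opening_features(trace, fraction=0.5)
```

A feature vector like `feats` would then be fed to any standard classifier; the point of the sketch is only that nothing in it requires decrypting content or knowing what an endpoint does.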

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This framing suggests agent protocols may need to adopt content-protecting bindings with metadata hiding from the design stage rather than retrofitting.
The separation of recoverability from actionable advantage under budget could extend to other autonomous chained systems such as sensor networks or robotic workflows.
Generative models used as controlled instruments in the evaluation open a path to simulate leaks and test defenses before deployment.
If the graph leak persists across domains, it may require new standards that treat workflow recognition as a first-class integrity property.

Load-bearing premise

Exposure of the communication graph across independent trust domains, when coupled to autonomous action, makes the metadata distinctively consequential for workflow integrity rather than privacy alone.

What would settle it

A measurement on real A2A traffic showing that, with the full set of privacy properties applied, a classifier still recovers task classes significantly above chance would falsify the claim that the complete property set drives recovery toward chance.

Figures

Figures reproduced from arXiv: 2606.07150 by Bijaya Dangol.

**Figure 2.** Task class recovered from communication …

**Figure 4.** Accuracy under each property (rows) for each adversary view (columns) in the controlled model; red is leaking, green is protected. The network observer falls only when unlinkability and metadata minimization are combined (“both”); the registry observer falls only to discovery privacy. No single property suffices. The real-corpus ladder is in §9.12.

**Figure 5.** Actuation. Capture ratio κ by privacy property, at an early decision deadline (f = 0.2) and a budget equal to one task class’s mass, averaged over targets (error bars span ±1.96 standard errors across target classes; chance, the blind baseline, is 0). The integrity analogue of …

**Figure 6.** Actuation is the product of inference and budget, vanishing on either edge. Left: capture ratio …

**Figure 7.** General actuation. Left: capture ratio against the class-determined share of value …

Original abstract

Agent-interoperability protocols such as A2A and MCP standardize what agents say to one another but assume address-based transport. Whether over HTTP(S) or a content-protecting binding such as MLS-based SLIM, these transports protect message content yet leave the communication graph exposed: which agent contacts which, when, and how often. In agent systems this graph is more consequential than a privacy framing suggests. Endpoints are capability-labeled, workflows are structured and chained, and interactions are coupled to actions, so an observer recovers more than past relationships: it can recognize a recurring pending workflow from its opening and, at machine speed, act on it before it completes. The threat is one of workflow integrity, not privacy alone. We give a threat model for the communication graph and locate what makes its metadata distinctively consequential: not stronger fingerprinting but exposure across independent trust domains, coupled to autonomous action. We define transport- and bootstrap-layer privacy properties, give them an indistinguishability-game semantics, evaluate transports, and give an A2A case study where a metadata-protecting binding surfaces its implicit identity assumptions. On a corpus of real multi-agent A2A traffic from the official reference agents, on a live A2A binding, and with a generative model as a controlled instrument, a label-blind classifier recovers a task's class from passive metadata at 6x chance, and from only its opening; a defense-aware adversary does not overturn this, and only the full set of properties drives recovery toward chance. Acting on the leak is distinct from recoverability: under a fixed budget an adversary captures 0.63 of a clairvoyant attacker's advantage on the corpus (0.41 from a workflow's opening), governed by top-ranked precision rather than overall accuracy, so integrity and privacy come apart under defense.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows metadata from A2A traffic lets a classifier recover task class at 6x chance even from openings, but the workflow-integrity framing over privacy is asserted more than measured.

The main takeaway is the empirical result: on real A2A reference traffic plus a generative model, a label-blind classifier pulls task class from passive metadata at 6x chance, and still does well from the opening alone. A budget-limited adversary captures 0.63 of a clairvoyant attacker's edge, driven by top-ranked precision rather than overall accuracy. That part is concrete and new in the agent-interoperability setting.

The paper does a few things cleanly. It gives transport and bootstrap privacy properties an indistinguishability-game semantics, evaluates them on actual bindings, and runs the case study on live A2A traffic. The separation of recoverability from actionable advantage under fixed budget is a useful distinction, and the claim that only the full property set drives recovery to chance is worth checking.

The softer part is the central reframing. The author locates the threat in exposure across independent trust domains plus coupling to autonomous action, and argues that this makes it a matter of workflow integrity rather than privacy. The experiments measure recovery and precision well, but they do not directly test whether the leaked metadata enables integrity violations (preemption, injection) that would not already be covered by a privacy analysis. The assumption that autonomous chaining across domains makes the metadata distinctively consequential is plausible but stays at the level of the threat model rather than a quantified demonstration.

Classifier details are thin in the abstract—no feature list, error bars, or exclusion rules—so the 6x figure is hard to stress-test without the full methods. No obvious circularity or heavy self-citation.

This is for people working on agent protocols, secure multi-agent systems, and applied crypto for AI. It flags a gap in current standards that deserves referee time even if the integrity angle needs more evidence.

Referee Report

2 major / 2 minor

Summary. The paper claims that agent-interoperability protocols (A2A, MCP) protect message content but expose communication-graph metadata (who contacts whom, timing, frequency), which is more consequential than a privacy framing suggests because endpoints are capability-labeled, workflows are structured and chained, and interactions couple to autonomous actions across trust domains. This enables an observer to recognize a pending workflow from its opening and act at machine speed, creating a workflow-integrity threat. The author supplies a threat model, defines transport- and bootstrap-layer privacy properties via indistinguishability games, evaluates bindings, presents an A2A case study, and reports experiments on a real multi-agent A2A corpus, a live binding, and a generative model. These show that a label-blind classifier recovers task class at 6× chance, including from the opening alone; that a defense-aware adversary does not overturn the result; that only the full property set drives recovery to chance; and that a budgeted adversary captures 0.63 of a clairvoyant attacker’s advantage (0.41 from a workflow’s opening), governed by top-ranked precision.

Significance. If the results hold, the work is significant for distinguishing workflow integrity from privacy in autonomous agent systems and for supplying concrete recoverability numbers plus an adversary budget analysis on real A2A traffic. Credit is due for the use of an official reference-agent corpus, live A2A binding, and generative model as a controlled instrument, which together provide quantitative, reproducible evidence on metadata leakage.

major comments (2)

[§ Threat Model] The central claim that exposure across independent trust domains coupled to autonomous action makes the metadata distinctively consequential for workflow integrity (rather than privacy alone) is asserted via the threat model but is not quantitatively demonstrated; the reported experiments measure only recoverability (6× chance, including from the opening alone) and budgeted capture governed by top-ranked precision (0.63, or 0.41 from the opening), with no direct test of whether recovered metadata enables integrity violations (preemption, injection) that would be impossible under a pure privacy framing.
[§ Evaluation] (classifier and adversary experiments) The manuscript reports classifier performance and the 0.63 budget-capture figure but omits details on classifier features, error bars, and corpus exclusion criteria; these omissions are load-bearing for assessing whether the 6× chance result and the defense-aware adversary conclusion are robust.

minor comments (2)

[§ Privacy Properties] Notation for the privacy properties and indistinguishability games could be clarified with an explicit table mapping each property to its game definition.
[Figures 4–6] Figure captions for the corpus and generative-model results should state the number of runs and whether error bars represent standard deviation or confidence intervals.
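On the error-bar question, the Figure 5 caption describes bars spanning ±1.96 standard errors across target classes. A minimal sketch of that computation follows; the function name and the per-class values are hypothetical placeholders, not numbers from the paper.

```python
import math
import statistics


def mean_with_interval(values, z=1.96):
    """Return (mean, half_width), where half_width = z * standard error."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, z * standard_error


# Hypothetical per-target-class capture ratios.
per_class_kappa = [0.5, 0.7, 0.6, 0.6]
mean_kappa, half_width = mean_with_interval(per_class_kappa)
```

With z = 1.96 the interval corresponds to an approximate 95% confidence interval under a normal approximation, which is weak when the number of target classes is small; stating the number of classes alongside the bars would address the referee's point.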

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and for recognizing the significance of the work, including the use of the official reference-agent corpus and live binding. We address the two major comments point by point below.

Referee: [§ Threat Model] The central claim that exposure across independent trust domains coupled to autonomous action makes the metadata distinctively consequential for workflow integrity (rather than privacy alone) is asserted via the threat model but is not quantitatively demonstrated; the reported experiments measure only recoverability (6× chance, including from the opening alone) and budgeted capture governed by top-ranked precision (0.63, or 0.41 from the opening), with no direct test of whether recovered metadata enables integrity violations (preemption, injection) that would be impossible under a pure privacy framing.

Authors: The threat model distinguishes workflow integrity from privacy by noting that metadata exposure spans independent trust domains and couples directly to autonomous actions, enabling an observer to recognize a pending workflow from its opening and intervene at machine speed. The experiments establish that task class is recoverable at 6× chance, including from the opening alone, and that a budgeted adversary captures 0.63 of clairvoyant advantage (0.41 from the opening), governed by top-ranked precision. These metrics indicate that an observer can identify workflows early enough to enable preemption or injection before completion, which a pure privacy framing does not capture. We agree that the paper does not include a direct empirical test (e.g., simulated preemption success rates using recovered labels). We will revise the discussion to explicitly connect the recoverability and budget-capture results to the feasibility of integrity violations and will add a limitations paragraph noting the absence of such direct violation experiments as future work. revision: partial

Referee: [§ Evaluation] (classifier and adversary experiments) The manuscript reports classifier performance and the 0.63 budget-capture figure but omits details on classifier features, error bars, and corpus exclusion criteria; these omissions are load-bearing for assessing whether the 6× chance result and the defense-aware adversary conclusion are robust.

Authors: We accept that these details were omitted and are load-bearing. The revised manuscript will add: a complete list of features used by the label-blind classifier, error bars or confidence intervals on all performance numbers (including the 0.63 budget-capture figure and its 0.41 opening-only counterpart), and the precise exclusion criteria applied to the official reference-agent A2A corpus. These additions will allow readers to evaluate the robustness of the 6× chance result and the defense-aware adversary findings. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results on external corpus are independent of inputs.

full rationale

The paper reports measured classifier performance (6x chance recovery, 0.63 budget capture) on a corpus of real multi-agent A2A traffic from official reference agents plus a live binding and generative model. No equations or self-citations reduce the central claims (threat model distinguishing workflow integrity from privacy via exposure across trust domains plus autonomous action) to fitted parameters or definitional equivalence. The recoverability results are presented as direct measurements rather than predictions forced by construction. This is the normal case of a self-contained empirical study against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on domain assumptions about agent systems stated in the abstract; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: Endpoints are capability-labeled, workflows are structured and chained, and interactions are coupled to actions.
Invoked to explain why communication graph metadata is more consequential than in general privacy settings.

pith-pipeline@v0.9.1-grok · 5866 in / 1244 out tokens · 27890 ms · 2026-06-27T21:40:23.158740+00:00 · methodology


[1] [1]

Agent2agent (a2a) protocol specification, 2026.https://a2a- protocol.org/

A2A Project (Linux Foundation). Agent2agent (a2a) protocol specification, 2026.https://a2a- protocol.org/

2026

[2] [2]

SLIM: Secure low-latency interactive messaging, 2026.https://github.com/agntcy/ slim; IETF draft draft-mpsb-agntcy-slim

AGNTCY. SLIM: Secure low-latency interactive messaging, 2026.https://github.com/agntcy/ slim; IETF draft draft-mpsb-agntcy-slim

2026

[3] [3]

Security Threat Modeling for Emerging AI-Agent Protocols: A Comparative Analysis of MCP, A2A, Agora, and ANP

Zeynab Anbiaee, Mahdi Rabbani, Mansur Mi- rani, GunjanPiya, IgorOpushnyev, AliGhorbani, and Sajjad Dadkhah. Security threat modeling for emerging AI-agent protocols: A comparative analysis of MCP, A2A, Agora, and ANP, 2026. arXiv:2602.11327

work page internal anchor Pith review Pith/arXiv arXiv 2026

[4] [4]

Model context protocol, 2025.https: //modelcontextprotocol.io/

Anthropic. Model context protocol, 2025.https: //modelcontextprotocol.io/

2025

[5] [5]

RFC 9420: The messaging layer security (mls) protocol, 2023.https://www.rfc- editor.org/rfc/rfc9420

Richard Barnes, Benjamin Beurdouche, Raphael Robert, Jon Millican, Emad Omara, and Katriel Cohn-Gordon. RFC 9420: The messaging layer security (mls) protocol, 2023.https://www.rfc- editor.org/rfc/rfc9420

2023

[6] [6]

Var-CNN: A data-efficient web- site fingerprinting attack based on deep learning

Sanjit Bhat, David Lu, Albert Kwon, and Srini- vas Devadas. Var-CNN: A data-efficient web- site fingerprinting attack based on deep learning. Proceedings on Privacy Enhancing Technologies (PoPETs), 2019(4), 2019

2019

[7] [7]

A systematic ap- proach to developing and evaluating website fin- gerprinting defenses

Xiang Cai, Rishab Nithyanand, Tao Wang, Rob Johnson, and Ian Goldberg. A systematic ap- proach to developing and evaluating website fin- gerprinting defenses. InACM Conference on Computer and Communications Security (CCS), 2014

2014

[8] [8]

David L. Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms.Com- munications of the ACM, 24(2):84–90, 1981

1981

[9] [9]

Morley Mao, and Paramvir Bahl

Xu Chen, Ming Zhang, Z. Morley Mao, and Paramvir Bahl. Automating network application dependency discovery: Experiences, limitations, and new solutions. In8th USENIX Symposium on Operating Systems Design and Implementa- tion (OSDI), 2008

2008

[10] [10]

Flash Boys 2.0: Fron- trunning in decentralized exchanges, miner ex- tractable value, and consensus instability

PhilipDaian, StevenGoldfeder, TylerKell, Yunqi Li, Xueyuan Zhao, Iddo Bentov, Lorenz Brei- denbach, and Ari Juels. Flash Boys 2.0: Fron- trunning in decentralized exchanges, miner ex- tractable value, and consensus instability. In IEEE Symposium on Security and Privacy (S&P), 2020

2020

[11] [11]

Alex Davidson, Jana Iyengar, and Christopher A. Wood. RFC 9576: The privacy pass archi- tecture, 2024. https://www.rfc-editor.org/ rfc/rfc9576. 21

2024

[12] [12]

Towards measuring anonymity

Claudia Díaz, Stefaan Seys, Joris Claessens, and Bart Preneel. Towards measuring anonymity. In Privacy Enhancing Technologies (PET), 2002

2002

[13] [13]

Tor: The second-generation onion router

Roger Dingledine, Nick Mathewson, and Paul Syverson. Tor: The second-generation onion router. InUSENIX Security Symposium, 2004

2004

[14] [14]

Abul Ehtesham, Aditi Singh, Gaurav Kumar Gupta, and Saket Kumar. A survey of agent in- teroperability protocols: Model context protocol (MCP), agent communication protocol (ACP), agent-to-agent protocol (A2A), and agent net- work protocol (ANP), 2025. arXiv:2505.02279

work page arXiv 2025

[15] [15]

AgentLeak: A full-stack bench- mark for privacy leakage in multi-agent LLM systems, 2026

Faouzi El Yagoubi, Godwin Badu-Marfo, and Ranwa AlMallah. AgentLeak: A full-stack bench- mark for privacy leakage in multi-agent LLM systems, 2026. arXiv:2602.11510

work page arXiv 2026

[16] [16]

Zero-delay lightweight defenses against website fingerprint- ing

Jiajun Gong and Tao Wang. Zero-delay lightweight defenses against website fingerprint- ing. InUSENIX Security Symposium, 2020

2020

[17] [17]

Surakav: Generating realistic traces for a strong website fingerprinting defense

Jiajun Gong, Wuqi Zhang, Charles Zhang, and Tao Wang. Surakav: Generating realistic traces for a strong website fingerprinting defense. In IEEE Symposium on Security and Privacy (S&P), 2022

2022

[18] [18]

k- fingerprinting: A robust scalable website finger- printing technique

Jamie Hayes and George Danezis. k- fingerprinting: A robust scalable website finger- printing technique. InUSENIX Security Sympo- sium, 2016

2016

[19] [19]

Ronald A. Howard. Information value theory. IEEE Transactions on Systems Science and Cy- bernetics, 2(1), 1966

1966

[20] [20]

Network-level prompt and trait leakage in local research agents, 2025

Hyejun Jeong, Mohammadreza Teymoorianfard, Abhinav Kumar, Amir Houmansadr, and Eu- gene Bagdasarian. Network-level prompt and trait leakage in local research agents, 2025. arXiv:2508.20282

work page arXiv 2025

[21] Marc Juarez, Mohsen Imani, Mike Perry, Claudia Diaz, and Matthew Wright. Toward an efficient website fingerprinting defense. In European Symposium on Research in Computer Security (ESORICS), 2016

[22] Yedidel Louck, Ariel Stulman, and Amit Dvir. Improving Google A2A protocol: Protecting sensitive data and mitigating unintended harms in multi-agent systems, 2025. arXiv:2505.12490

[23] Yedidel Louck, Ariel Stulman, and Amit Dvir. Security analysis of agentic AI communication protocols: A comparative evaluation, 2025. arXiv:2511.03841

[24] Shutian Luo, Huanle Xu, Chengzhi Lu, Kejiang Ye, Guoyao Xu, Liping Zhang, Yu Ding, Jian He, and Chengzhong Xu. Characterizing microservice dependency and performance: Alibaba trace analysis. In ACM Symposium on Cloud Computing (SoCC), 2021

[25] Milad Nasr, Alireza Bahramali, and Amir Houmansadr. DeepCorr: Strong flow correlation attacks on Tor using deep learning. In ACM Conference on Computer and Communications Security (CCS), 2018

[26] Nym Technologies. The Nym network: The next generation of privacy infrastructure, 2021. Whitepaper. https://nym.com/nym-whitepaper.pdf

[27] Hamid Ould-Brahim, Tommy Pauly, and Benjamin M. Schwartz. Chunked Oblivious HTTP messages. Internet-Draft draft-ietf-ohai-chunked-ohttp, IETF OHAI WG (RFC Ed. Queue), https://datatracker.ietf.org/doc/draft-ietf-ohai-chunked-ohttp/

[29] Andriy Panchenko, Lukas Niessen, Andreas Zinnen, and Thomas Engel. Website fingerprinting in onion routing based anonymization networks. In ACM Workshop on Privacy in the Electronic Society (WPES), 2011

[30] Evgeny Poberezkin. SimpleX messaging protocol (SMP), 2024. https://github.com/simplex-chat/simplexmq/blob/stable/protocol/simplex-messaging.md

[31] David Schinazi. RFC 9298: Proxying UDP in HTTP, 2022. https://www.rfc-editor.org/rfc/rfc9298

[32] Christian Schroeder de Witt, Klaudia Krawiecka, et al. Open challenges in multi-agent security: Towards secure systems of interacting AI agents,

[33] Andrei Serjantov and George Danezis. Towards an information theoretic metric for anonymity. In Privacy Enhancing Technologies (PET), 2002

[34] Mohsen Shirali, Tobias Tefke, Ralf C. Staudemeyer, and Henrich C. Poehls. A survey on anonymous communication systems with a focus on dining cryptographers networks, 2022. arXiv:2212.08275

[35] Payap Sirinam, Mohsen Imani, Marc Juarez, and Matthew Wright. Deep fingerprinting: Undermining website fingerprinting defenses with deep learning. In ACM Conference on Computer and Communications Security (CCS), 2018

[36] Martin Thomson and Christopher A. Wood. RFC 9458: Oblivious HTTP, 2024. https://www.rfc-editor.org/rfc/rfc9458

[37] W3C. Threat model for decentralized credentials, 2026. W3C, 20 January 2026. https://www.w3.org/TR/threat-model-decentralized-credentials/

[38] Yixiang Zhang, Xinhao Deng, Zhongyi Gu, Yihao Chen, Ke Xu, Qi Li, and Jianping Wu. Exposing LLM user privacy via traffic fingerprint analysis: A study of privacy risks in LLM agent interactions, 2025. arXiv:2510.07176