pith. machine review for the scientific record.

arxiv: 2605.03213 · v2 · submitted 2026-05-04 · 💻 cs.CR · cs.AI

Recognition: 1 theorem link

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 17:52 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords agentic AI · confidential computing · trusted execution environments · LLM agents · security survey · TEE platforms · remote attestation · multi-agent security

The pith

Agentic AI lacks a unified end-to-end confidential computing framework despite mature hardware primitives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Agentic AI systems that plan, invoke tools, maintain memory, and coordinate with peer agents create new security risks because they accumulate sensitive context and credentials across pipelines no single party controls. Standard software defenses can be bypassed by privileged attackers such as compromised cloud operators. Confidential computing supplies hardware-rooted isolation through Trusted Execution Environments and remote attestation for verifiable trust. The survey organizes six TEE platforms into a taxonomy, builds an agent-centric threat model spanning the perception through coordination layers, compares transferable versus newly required defenses, and lists open problems including compound attestation for agent chains and GPU performance at scale. It concludes that targeted deployments are feasible now but no coherent production-grade security substrate exists yet.

Core claim

The paper establishes that agentic AI introduces a distinct threat surface involving prompt injection, context exfiltration, credential theft, and inter-agent message poisoning that software-only defenses cannot reliably address. It synthesizes a taxonomy of six TEE platforms covering deployment roles and tradeoffs, maps nine security goals to an agent-centric threat model spanning five layers, distinguishes CC techniques that transfer from single-call inference from those requiring new agentic designs, and identifies six open challenges such as compound attestation for multi-hop chains and LLM-scale GPU-TEE performance. The central claim is that while several hardware trust primitives are mature enough for targeted deployments, no broadly established end-to-end framework yet binds them into a coherent security substrate for production agentic AI.

What carries the argument

Unified taxonomy of six TEE platforms (Intel SGX, Intel TDX, AMD SEV-SNP, ARM TrustZone, ARM CCA, NVIDIA H100 CC) paired with an agent-centric threat model across perception, planning, memory, action, and coordination layers.
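As a reading aid only (this structure is Pith's, not the paper's), that pairing can be sketched as a mapping from each agent layer to representative threats and the CC mechanisms that might counter them. The layer names and threat terms come from the abstract; the specific pairings and every identifier below are illustrative assumptions.

```python
from dataclasses import dataclass

# Illustrative only: layer names follow the survey's threat model; the
# threat/mechanism pairings are a plausible reading, not the paper's own table.

@dataclass(frozen=True)
class LayerProfile:
    threats: tuple[str, ...]        # representative attacks at this layer
    cc_mechanisms: tuple[str, ...]  # hardware-rooted defenses that could apply

AGENT_THREAT_MODEL = {
    "perception":   LayerProfile(("prompt injection",),
                                 ("attested input pipeline inside a TEE",)),
    "planning":     LayerProfile(("context exfiltration",),
                                 ("LLM inference inside a GPU-TEE such as H100 CC",)),
    "memory":       LayerProfile(("context exfiltration", "credential theft"),
                                 ("sealed storage bound to enclave identity",)),
    "action":       LayerProfile(("credential theft",),
                                 ("tool-policy checks enforced inside the TEE",)),
    "coordination": LayerProfile(("inter-agent message poisoning",),
                                 ("remote attestation plus authenticated channels",)),
}

if __name__ == "__main__":
    for layer, profile in AGENT_THREAT_MODEL.items():
        print(f"{layer:>12}: {', '.join(profile.threats)}"
              f"  ->  {', '.join(profile.cc_mechanisms)}")
```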

If this is right

  • TEEs can isolate agent code and data from privileged system software and operators.
  • Remote attestation enables verifiable trust across distributed agent deployments.
  • New agent-specific designs are required for persistent memory and multi-hop coordination beyond single-call inference techniques.
  • Compound attestation mechanisms are needed to secure chains of delegated agent tasks (a toy sketch of chained evidence follows this list).
  • GPU-TEE implementations must reach practical performance at LLM scale for widespread use.
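To make the compound-attestation item concrete, here is a minimal sketch, assuming each hop in a delegation chain can emit evidence bound to the previous hop's digest. The MAC-based quotes, the verifier policy, and all names are illustrative stand-ins for real platform attestation reports verified against vendor roots of trust.

```python
import hashlib
import hmac
import json

# Toy sketch of "compound attestation" over a multi-hop agent chain.
# Assumption (not from the paper): each hop's TEE can MAC its own measurement
# together with the previous hop's digest, so a verifier can check linkage.

def quote(hop_key: bytes, measurement: str, prev_digest: str) -> dict:
    """Produce one hop's evidence, bound to the previous hop's digest."""
    payload = {"measurement": measurement, "prev": prev_digest}
    body = json.dumps(payload, sort_keys=True).encode()
    return {**payload, "mac": hmac.new(hop_key, body, hashlib.sha256).hexdigest()}

def digest(evidence: dict) -> str:
    return hashlib.sha256(json.dumps(evidence, sort_keys=True).encode()).hexdigest()

def verify_chain(chain: list, hop_keys: list, trusted: set) -> bool:
    """Check every hop's MAC, its linkage to the previous hop, and its measurement."""
    prev = ""
    for ev, key in zip(chain, hop_keys):
        body = json.dumps({"measurement": ev["measurement"], "prev": ev["prev"]},
                          sort_keys=True).encode()
        if not hmac.compare_digest(ev["mac"],
                                   hmac.new(key, body, hashlib.sha256).hexdigest()):
            return False                      # evidence was tampered with
        if ev["prev"] != prev or ev["measurement"] not in trusted:
            return False                      # broken linkage or unknown code
        prev = digest(ev)
    return True

if __name__ == "__main__":
    keys = [b"planner-key", b"retriever-key", b"tool-runner-key"]
    chain, prev = [], ""
    for key, meas in zip(keys, ["planner-v1", "retriever-v1", "tool-runner-v1"]):
        ev = quote(key, meas, prev)
        chain.append(ev)
        prev = digest(ev)
    print(verify_chain(chain, keys, {"planner-v1", "retriever-v1", "tool-runner-v1"}))
```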

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • CC primitives could support safer credential handling during agent delegation without exposing secrets to coordinating parties (see the sketch after this list).
  • Standardization of security goals across emerging protocols such as MCP and A2A might accelerate adoption of TEE-based protections.
  • Agent frameworks could incorporate TEE support by default to reduce reliance on cloud provider trust.
  • Real-world testing of multi-agent workflows with compound attestation would clarify remaining integration gaps.
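As one way to picture the first bullet above (an editorial sketch, not a design from the paper), a coordinator could forward an opaque handle instead of a raw credential, with resolution gated on an attestation check inside a vault. The class and method names, and the allow-list standing in for real attestation, are assumptions.

```python
import secrets

# Editorial sketch: a coordinating agent delegates a task using an opaque
# handle; only an attested worker enclave can resolve it to the credential.
# "Attestation" is faked with an allow-list here; a real system would verify
# hardware attestation evidence instead.

class CredentialVault:
    """Stands in for a vault service running inside (or fronted by) a TEE."""

    def __init__(self, trusted_measurements: set):
        self._secrets = {}
        self._trusted = trusted_measurements

    def store(self, credential: str) -> str:
        handle = secrets.token_hex(16)       # opaque reference, safe to share
        self._secrets[handle] = credential
        return handle

    def resolve(self, handle: str, claimed_measurement: str) -> str:
        if claimed_measurement not in self._trusted:
            raise PermissionError("caller is not an attested, trusted enclave")
        return self._secrets[handle]

if __name__ == "__main__":
    vault = CredentialVault(trusted_measurements={"worker-enclave-v1"})
    handle = vault.store("api-key-123")      # coordinator only ever sees the handle
    print("coordinator forwards handle:", handle)
    print("worker resolves credential:", vault.resolve(handle, "worker-enclave-v1"))
```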

Load-bearing premise

Current defenses operate entirely within the software stack and can be silently bypassed by a sufficiently privileged adversary such as a compromised cloud operator.

What would settle it

Successful deployment of a production-scale agentic AI system using an integrated CC framework that resists attacks from a compromised cloud operator, or discovery of a practical bypass in existing TEE protections for persistent agent memory and multi-agent coordination.

Figures

Figures reproduced from arXiv: 2605.03213 by Hamed Haddadi, Javad Forough, Marios Kogias.

Figure 1
Figure 1: Analytical lens of the survey. This figure organizes the paper through five connected abstractions: agent components, adversaries, security goals, confidential computing mechanisms, and unresolved research gaps. It explains both the organization of the survey and the logic behind its comparative analysis. Much of the current evidence base comes from confidential LLM inference, which informs perception and …
Figure 2
Figure 2: Illustrative single-agent trust-boundary model. One attested TEE / confidential VM encloses the LLM-based agent core, local memory, credentials, retrieval mediation, and tool-policy checks before data or actions leave the runtime. Retrieval backends, external storage, APIs / tools, and other I/O paths remain outside or only partially covered by the trust boundary. Agent-to-agent delegation across organizat…
Figure 3
Figure 3: Illustrative multi-agent trust-boundary model. Different LLM-based agents execute inside separate attested TEEs / confidential VMs, so security depends on remote attestation, authenticated delegation channels, and platform-specific shared-state mechanisms. Provenance, freshness, and cross-agent policy enforcement are conditional properties rather than universally guaranteed. External APIs / tools remain ou…
Figure 4
Figure 4: Platform landscape for confidential computing in agentic AI. The surveyed platforms occupy different regions of the deployment space rather than forming a simple best-to-worst ranking. Fine-grained TEEs such as Intel SGX are often attractive for small trusted components, confidential-runtime platforms such as Intel TDX and AMD SEV-SNP are better suited to deployable cloud runtimes, and accelerator-backed c…
read the original abstract

Agentic AI systems, specifically LLM-driven agents that plan, invoke tools, maintain persistent memory, and delegate tasks to peer agents via protocols such as MCP and A2A, introduce a threat surface that differs materially from standalone model inference. Agents accumulate sensitive context, hold credentials, and operate across pipelines no single party fully controls, enabling prompt injection, context exfiltration, credential theft, and inter-agent message poisoning. Current defenses operate entirely within the software stack and can be silently bypassed by a sufficiently privileged adversary such as a compromised cloud operator. Confidential computing (CC) offers a hardware-rooted alternative: Trusted Execution Environments (TEEs) isolate agent code and data from privileged system software, while remote attestation enables verifiable trust across distributed deployments. This survey synthesizes the design space in four parts: (i) a unified taxonomy of six TEE platforms (Intel SGX, Intel TDX, AMD SEV-SNP, ARM TrustZone, ARM CCA, and NVIDIA H100 CC) covering deployment roles and performance tradeoffs; (ii) an agent-centric threat model spanning perception, planning, memory, action, and coordination layers mapped to nine security goals; (iii) a comparative survey of CC-based defenses distinguishing findings that transfer from single-call inference versus what requires new agentic designs; and (iv) six open challenges including compound attestation for multi-hop agent chains and GPU-TEE performance at LLM scale. While several hardware trust primitives appear mature enough for targeted deployments, no broadly established end-to-end framework yet binds them into a coherent security substrate for production agentic AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript is a survey synthesizing confidential computing (CC) for agentic AI systems driven by LLMs that plan, use tools, maintain memory, and coordinate via protocols like MCP and A2A. It highlights that these systems create a distinct threat surface (prompt injection, context exfiltration, credential theft, message poisoning) not covered by software-stack defenses, which can be bypassed by privileged adversaries such as compromised cloud operators. The work structures its contribution in four parts: (i) a unified taxonomy of six TEE platforms (Intel SGX, Intel TDX, AMD SEV-SNP, ARM TrustZone, ARM CCA, NVIDIA H100 CC) with deployment roles and performance tradeoffs; (ii) an agent-centric threat model across perception, planning, memory, action, and coordination layers mapped to nine security goals; (iii) a comparative survey of CC-based defenses, separating transferable results from single-call inference versus those needing new agentic designs; and (iv) six open challenges including compound attestation for multi-hop chains and GPU-TEE performance at LLM scale. The central claim is that hardware trust primitives are mature enough for targeted deployments but no broadly established end-to-end CC framework yet exists for production agentic AI.

Significance. If the synthesis holds, the paper offers a timely, structured reference that bridges confidential computing and the emerging agentic AI paradigm. By providing a TEE taxonomy, a tailored threat model, and an explicit enumeration of gaps (e.g., compound attestation, scale issues), it can direct future research toward integrated security substrates rather than piecemeal defenses. The observational nature of the central claim is strengthened by the four-part organization that systematically identifies what transfers from existing CC work and what does not.

major comments (1)
  1. Abstract and part (iv): The claim that 'no broadly established end-to-end framework yet binds [the primitives] into a coherent security substrate' is load-bearing for the paper's contribution. The manuscript should explicitly define the criteria for an 'end-to-end framework' (e.g., coverage of all five agent layers, support for multi-party attestation, and production-scale performance) so that the negative conclusion can be evaluated against the gaps enumerated in the open challenges.
minor comments (3)
  1. Part (ii): The agent-centric threat model is described as spanning five layers and mapping to nine security goals, but the text does not enumerate the nine goals or provide an explicit mapping table. Adding such a table would make the subsequent transfer analysis in part (iii) easier to follow and verify.
  2. Part (i): The taxonomy of the six TEE platforms includes deployment roles and performance tradeoffs, yet the manuscript provides no quantitative benchmarks or overhead figures (e.g., latency or throughput impact on LLM inference). Including even summary metrics with citations would strengthen the comparative value of the taxonomy.
  3. Overall: The abstract states that 'current defenses operate entirely within the software stack,' but the comparative survey in part (iii) should include a short subsection confirming that no hardware-rooted or hybrid defenses for agentic workflows were overlooked in the reviewed literature.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for this constructive comment, which helps strengthen the clarity of our central claim. We address it directly below and will incorporate the requested definition into the revised manuscript.

read point-by-point responses
  1. Referee: Abstract and part (iv): The claim that 'no broadly established end-to-end framework yet binds [the primitives] into a coherent security substrate' is load-bearing for the paper's contribution. The manuscript should explicitly define the criteria for an 'end-to-end framework' (e.g., coverage of all five agent layers, support for multi-party attestation, and production-scale performance) so that the negative conclusion can be evaluated against the gaps enumerated in the open challenges.

    Authors: We agree that an explicit definition of 'end-to-end framework' is needed to make the claim evaluable. In the revised abstract and section (iv), we will define an end-to-end CC framework for agentic AI as one that (1) supplies hardware-rooted isolation and attestation across all five agent layers (perception, planning, memory, action, coordination), (2) supports compound and multi-party attestation for multi-hop interactions under protocols such as A2A, and (3) achieves production-scale performance for LLM-driven workloads without prohibitive overheads. With this definition, the six open challenges (including compound attestation and GPU-TEE scaling at LLM size) demonstrate that existing TEE applications remain piecemeal or limited to single-call inference, confirming that no such integrated substrate yet exists. We will update the text accordingly. revision: yes
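One way to make that definition mechanically checkable (an editorial paraphrase of the rebuttal, not text from the manuscript) is as a three-part predicate over a candidate framework's properties. The field names and the example overhead threshold below are assumptions.

```python
# Editorial paraphrase of the rebuttal's proposed definition of an
# "end-to-end CC framework" as three checkable criteria. Field names and the
# example threshold are illustrative assumptions, not from the paper.

AGENT_LAYERS = {"perception", "planning", "memory", "action", "coordination"}

def is_end_to_end(framework: dict, max_overhead: float = 0.30) -> bool:
    """Return True only if all three rebuttal criteria are met."""
    covers_all_layers = AGENT_LAYERS <= set(framework.get("attested_layers", []))
    compound_attest = framework.get("supports_compound_attestation", False)
    acceptable_perf = framework.get("llm_overhead", 1.0) <= max_overhead
    return covers_all_layers and compound_attest and acceptable_perf

if __name__ == "__main__":
    candidate = {
        "attested_layers": ["perception", "planning", "memory", "action"],
        "supports_compound_attestation": True,
        "llm_overhead": 0.15,
    }
    print(is_end_to_end(candidate))  # False: the coordination layer is not covered
```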

Circularity Check

0 steps flagged

No significant circularity in this literature survey

full rationale

This paper is a survey synthesizing prior work on TEE platforms, threat models, and defenses for agentic AI. It contains no equations, derivations, fitted parameters, predictions, or self-referential definitions. The central claim (absence of a mature end-to-end CC framework) is an observational synthesis of gaps identified from external citations, not a reduction to the paper's own inputs. All content is externally grounded; no load-bearing step reduces by construction to a self-citation chain or internal fit.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey, the paper introduces no new free parameters, axioms, or invented entities; it relies entirely on established concepts and cited literature from confidential computing and AI security.

pith-pipeline@v0.9.0 · 5590 in / 1027 out tokens · 70036 ms · 2026-05-08T17:52:01.800698+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

114 extracted references · 64 canonical work pages · 9 internal anchors
