AgentCrypt: Advancing Privacy and Secure Computation in AI Agent Collaboration
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-16 23:53 UTC · model grok-4.3
The pith
AgentCrypt ensures that tagged data in AI agent collaborations remains protected even when models err.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce AgentCrypt, a three-tiered framework for secure agent communication that adds a deterministic protection layer atop any AI platform. AgentCrypt spans the full spectrum of privacy needs: from unrestricted data exchange (Level 1), to context-aware masking (Level 2), up to fully encrypted computation using Homomorphic Encryption (Level 3). Unlike prompt-based defenses, our approach guarantees that tagged data privacy is strictly preserved even when the underlying model errs. Security is decoupled from the agent's probabilistic reasoning, ensuring sensitive data remains protected throughout the computational lifecycle.
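The level semantics in the claim above can be sketched as a deterministic dispatch that rewrites tagged spans before any model sees them. The tag syntax, level names, and the masking/encryption stubs below are illustrative assumptions, not the paper's actual API:

```python
import re

# Illustrative privacy levels; in this sketch tagged fields look like
# <priv level=2>...</priv>. None of this is the paper's concrete syntax.
LEVEL_OPEN, LEVEL_MASK, LEVEL_ENCRYPT = 1, 2, 3
TAG = re.compile(r"<priv level=(\d)>(.*?)</priv>")

def enforce(message: str) -> str:
    """Deterministically rewrite tagged spans before any model sees them."""
    def rewrite(m: re.Match) -> str:
        level, payload = int(m.group(1)), m.group(2)
        if level == LEVEL_OPEN:
            return payload                        # Level 1: unrestricted exchange
        if level == LEVEL_MASK:
            return "[MASKED]"                     # Level 2: masking (stubbed)
        return f"[ENC:{payload.encode().hex()}]"  # Level 3: stand-in for an HE ciphertext
    return TAG.sub(rewrite, message)

msg = "Patient <priv level=2>Alice</priv> has balance <priv level=3>4200</priv>."
print(enforce(msg))  # → Patient [MASKED] has balance [ENC:34323030].
```

Because the rewrite runs before the LLM is invoked, its guarantees do not depend on the model's reasoning, which is the decoupling the abstract claims.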
What carries the argument
The three-tiered privacy levels, incorporating homomorphic encryption at the top tier, enable secure computation without data exposure.
If this is right
- Collaborative computation becomes possible on data previously locked in silos.
- Security guarantees hold independently of the AI model's accuracy or reasoning quality.
- A new benchmark dataset supports evaluation of privacy in critical tasks.
- Implementation works across different agent architectures such as LangGraph and Google ADK.
- Context-aware privacy supports better regulatory compliance for AI systems.
Where Pith is reading between the lines
- This approach could be adapted to other probabilistic AI systems beyond agents to enforce privacy.
- Performance impacts of higher encryption levels may need balancing in real deployments.
- Agents might use the framework to dynamically choose privacy levels based on detected risks.
- Long-term, it could enable broader adoption of AI in privacy-sensitive domains like finance or medicine.
Load-bearing premise
The three-tiered controls with homomorphic encryption and masking can be added to current agent platforms with reasonable performance costs while ensuring agents properly use the privacy tags.
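The homomorphic-encryption half of this premise is that arithmetic can run on ciphertexts alone. A toy additively homomorphic Paillier instance shows the shape of such a computation; the tiny fixed primes make it purely illustrative and insecure, and the paper's actual scheme is not specified in the text reviewed here:

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic). Tiny fixed primes,
# for illustration only -- nowhere near secure, and not the paper's scheme.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1                              # standard simplification for Paillier
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                   # valid because g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    L = (pow(c, lam, n2) - 1) // n     # L(x) = (x - 1) / n
    return (L * mu) % n

# Homomorphic addition: multiplying ciphertexts adds the plaintexts.
a, b = 1500, 2700
c_sum = (encrypt(a) * encrypt(b)) % n2
print(decrypt(c_sum))  # → 4200, computed without ever decrypting a or b
```

Real deployments would use a lattice-based library such as OpenFHE (cited as [2]) rather than Paillier, and the "reasonable performance cost" part of the premise is exactly what such ciphertext arithmetic puts under pressure.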
What would settle it
Observe whether sensitive data remains unexposed in a scenario where an agent is instructed to perform a private computation but the model incorrectly reasons about the privacy level or leaks context anyway.
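That experiment can be sketched as a harness in which the model errs by construction and a deterministic boundary check decides what is released. All names here are hypothetical, and the redaction rule stands in for whatever enforcement the framework actually applies:

```python
# Hypothetical settling experiment: the "model" deliberately ignores the
# privacy level and leaks a tagged secret; a deterministic wrapper checks
# every outgoing message, independent of the model's reasoning.
SECRET = "SSN 123-45-6789"
TAGGED = {SECRET}              # registry of tagged sensitive values

def faulty_model(prompt: str) -> str:
    # Worst case: the model reasons incorrectly and leaks verbatim.
    return f"Sure, the value is {SECRET}."

def release(text: str) -> str:
    # Deterministic check at the boundary -- no LLM involved.
    for value in TAGGED:
        text = text.replace(value, "[REDACTED]")
    return text

out = release(faulty_model("compute privately"))
assert SECRET not in out       # privacy holds even though the model erred
print(out)                     # → Sure, the value is [REDACTED].
```

If the claim holds, this assertion passes for every adversarial model behavior; a single released plaintext occurrence of a tagged value would falsify it.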
Original abstract
As AI agents increasingly operate in complex environments, ensuring reliable, context-aware privacy is critical for regulatory compliance. Traditional access controls are insufficient because privacy risks often arise after access is granted; agents may inadvertently compromise privacy during reasoning by messaging humans, leaking context to peers, or executing unsafe tool calls. Existing approaches typically treat privacy as a binary constraint, overlooking nuanced, computation-dependent requirements. Furthermore, Large Language Model (LLM) agents are inherently probabilistic, lacking formal guarantees for security-critical operations. To address this, we introduce AgentCrypt, a three-tiered framework for secure agent communication that adds a deterministic protection layer atop any AI platform. AgentCrypt spans the full spectrum of privacy needs: from unrestricted data exchange (Level 1), to context-aware masking (Level 2), up to fully encrypted computation using Homomorphic Encryption (Level 3). Unlike prompt-based defenses, our approach guarantees that tagged data privacy is strictly preserved even when the underlying model errs. Security is decoupled from the agent's probabilistic reasoning, ensuring sensitive data remains protected throughout the computational lifecycle. AgentCrypt enables collaborative computation on otherwise inaccessible data, overcoming barriers like data silos. We implemented and validated it using LangGraph and Google ADK, demonstrating versatility across architectures. Finally, we introduce a benchmark dataset simulating privacy-critical tasks to enable systematic evaluation and foster the development of trustworthy, regulatable machine learning systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AgentCrypt, a three-tiered framework for privacy-preserving AI agent collaboration. It defines three privacy levels: Level 1 for unrestricted exchange, Level 2 for context-aware masking, and Level 3 for homomorphic encryption-based secure computation. The central claim is that the framework adds a deterministic protection layer to any AI platform, guaranteeing that tagged data privacy is strictly preserved even if the underlying LLM makes errors, by decoupling security from probabilistic reasoning. The work includes implementations on LangGraph and Google ADK, and introduces a benchmark dataset for privacy-critical tasks.
Significance. If the deterministic enforcement mechanism can be shown to intercept all potential leakage paths, this framework could enable secure collaborative computations across data silos in multi-agent systems, which is a significant advancement for privacy in AI. The benchmark dataset is a valuable contribution for future evaluations. However, the lack of formal security analysis and empirical results currently limits its demonstrated impact.
major comments (2)
- Abstract: The claim that 'tagged data privacy is strictly preserved even when the underlying model errs' and that 'security is decoupled from the agent's probabilistic reasoning' lacks any explicit security model, invariant, or proof demonstrating that the three-tier controls intercept all data flows and cannot be bypassed via agent-generated code, unmediated tool calls, or LangGraph/ADK workflows.
- Implementation and Evaluation: No security proofs, performance measurements, overhead analysis, or detailed results from the introduced benchmark dataset are provided to support the claims of strict privacy preservation and practical integration.
minor comments (1)
- Abstract: A table or diagram summarizing the three privacy levels, their mechanisms, and enforcement points would improve clarity of the framework description.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below and outline revisions to strengthen the formal security description and empirical evaluation.
Point-by-point responses
Referee: Abstract: The claim that 'tagged data privacy is strictly preserved even when the underlying model errs' and that 'security is decoupled from the agent's probabilistic reasoning' lacks any explicit security model, invariant, or proof demonstrating that the three-tier controls intercept all data flows and cannot be bypassed via agent-generated code, unmediated tool calls, or LangGraph/ADK workflows.
Authors: We agree the abstract would benefit from an explicit reference to the security model. The three-tier framework enforces deterministic controls at the messaging and tool-call layers: Level 2 applies context-aware masking before any LLM reasoning occurs, and Level 3 routes sensitive data through homomorphic encryption primitives that never expose plaintext to the agent. In the revision we will add a concise statement of the core invariants to the abstract and expand Section 3 with a formal description of the interception points, including mediation of all outgoing messages and tool invocations in both LangGraph and ADK runtimes. We will also discuss residual risks from arbitrary code generation as a limitation. revision: partial
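The mediation described in this response could look roughly like a decorator that policy-checks every tool invocation before it executes. The tag registry, exception type, and tool names below are assumptions for illustration, not the paper's implementation:

```python
from functools import wraps

# Sketch of a deterministic mediation layer: every tool invocation passes
# through a policy check before execution. Illustrative names throughout.
TAGGED = {"acct-9912"}

class PolicyViolation(Exception):
    pass

def mediated(tool):
    @wraps(tool)
    def wrapper(*args, **kwargs):
        for arg in list(args) + list(kwargs.values()):
            if any(t in str(arg) for t in TAGGED):
                raise PolicyViolation(
                    f"{tool.__name__}: tagged data in plaintext argument")
        return tool(*args, **kwargs)
    return wrapper

@mediated
def send_email(to: str, body: str) -> str:
    return f"sent to {to}"

print(send_email("bob@example.com", "quarterly report"))   # allowed
try:
    send_email("bob@example.com", "balance for acct-9912")
except PolicyViolation as e:
    print("blocked:", e)                                   # intercepted
```

The residual-risk point in the response is visible here too: a wrapper like this only covers invocations routed through it, which is why agent-generated code and unmediated calls need a separate argument.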
Referee: Implementation and Evaluation: No security proofs, performance measurements, overhead analysis, or detailed results from the introduced benchmark dataset are provided to support the claims of strict privacy preservation and practical integration.
Authors: The present version prioritizes framework design and architectural integration. We accept that quantitative support is needed. The revised manuscript will include a new evaluation section reporting (i) latency and memory overhead measured on the LangGraph and Google ADK prototypes, (ii) results from the benchmark dataset showing successful privacy preservation across the three tiers, and (iii) a high-level security argument based on the deterministic invariants. Full cryptographic proofs remain future work, but the invariants will be stated formally. revision: yes
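The promised latency numbers could be gathered with a micro-benchmark of this shape, timing the pipeline with and without the deterministic protection pass. The masking function is a stand-in for whatever the prototypes actually run:

```python
import time
import statistics

# Stand-in for the deterministic masking pass (not the paper's code).
def mask(text: str) -> str:
    return text.replace("4200", "[MASKED]")

def timed(fn, payload, runs=1000):
    """Median wall-clock time of fn(payload) over several runs."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn(payload)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

payload = "balance 4200 " * 100
base = timed(lambda s: s, payload)     # identity baseline
masked = timed(mask, payload)
print(f"masking overhead: {masked - base:.2e} s per message")
```

The same harness, pointed at a Level 3 path, would surface the much larger homomorphic-encryption overhead that the load-bearing premise assumes is acceptable.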
Circularity Check
No significant circularity; framework is a new construction on standard primitives
Full rationale
The paper introduces AgentCrypt as an external deterministic layer using homomorphic encryption (Level 3) and context-aware masking (Level 2) atop existing agent platforms. No equations, fitted parameters, or predictions appear in the provided text. The decoupling claim is presented as an architectural guarantee rather than a self-referential definition or renamed empirical pattern. No self-citations load-bear uniqueness theorems or ansatzes; the approach cites standard cryptographic primitives and demonstrates implementation on LangGraph/ADK without reducing the central security invariant to its own inputs. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Homomorphic encryption schemes can be integrated into agent collaboration workflows with acceptable computational overhead.
- domain assumption Data items can be reliably tagged and the privacy level enforcement can be maintained independently of the agent's reasoning process.
invented entities (1)
- AgentCrypt three-tier framework (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear): the relation between the paper passage and the cited Recognition theorem is unclear. Passage: "guarantees that tagged data privacy is strictly preserved even when the underlying model errs. Security is decoupled from the agent's probabilistic reasoning"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability (unclear): the relation between the paper passage and the cited Recognition theorem is unclear. Passage: "deterministic security wrapper to enforce the policies... every agent output is encrypted and policy-checked prior to release"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Security Considerations for Multi-agent Systems: no existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with the OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
Reference graph
Works this paper leans on
- [1] LangGraph: Building language agents as graphs. https://github.com/langchain-ai/langgraph, 2024.
- [2] Ahmad Al Badawi, Jack Bates, Flavio Bergamaschi, David Bruce Cousins, Saroja Erabelli, Nicholas Genise, Shai Halevi, Hamish Hunt, Andrey Kim, Yongwoo Lee, Zeyu Liu, Daniele Micciancio, Ian Quah, Yuriy Polyakov, Saraswathy R.V., Kurt Rohloff, Jonathan Saylor, Dmitriy Suponitsky, Matthew Triplett, Vinod Vaikuntanathan, and Vincent Zucca. OpenFHE: Open-source fully homomorphic encryption library. 2022.
- [3] Dan Boneh and Matthew Franklin. Identity-based encryption from the Weil pairing. SIAM Journal on Computing, 32(3):586–615, 2003.
- [4] Hannah Brown, Katherine Lee, Fatemehsadat Mireshghallah, Reza Shokri, and Florian Tramèr. What does it mean for a language model to preserve privacy? In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT '22, pages 2280–2292, New York, NY, USA, 2022. Association for Computing Machinery.
- [5] Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. In 30th USENIX Security Symposium (USENIX Security 21), pages 2633–2650. USENIX Association, August 2021.
- [6] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. Homomorphic encryption for arithmetic of approximate numbers. In Tsuyoshi Takagi and Thomas Peyrin, editors, Advances in Cryptology – ASIACRYPT 2017, pages 409–437, Cham, 2017. Springer International Publishing.
- [7] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. TFHE: Fast fully homomorphic encryption over the torus. Journal of Cryptology, 33, 2019.
- [8] A2A Project Contributors. A2A: Account-to-account payments protocol. https://github.com/a2aproject/A2A, 2025. Accessed 2025-12-02.
- [9] Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Sam Stevens, Boshi Wang, Huan Sun, and Yu Su. Mind2Web: Towards a generalist agent for the web. Advances in Neural Information Processing Systems, 36:28091–28114, 2023.
- [10] Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, and Hannaneh Hajishirzi. Do membership inference attacks work on large language models? 2024.
- [11] Avia Efrat and Omer Levy. The Turking Test: Can language models understand instructions? arXiv preprint arXiv:2010.11982, 2020.
- [12] Kanishk Gandhi, Jan-Philipp Fränken, Tobias Gerstenberg, and Noah Goodman. Understanding social reasoning in language models with language models. Advances in Neural Information Processing Systems, 36:13518–13529, 2023.
- [13] Craig Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, STOC '09, pages 169–178, New York, NY, USA, 2009. Association for Computing Machinery.
- [14] Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, and Nanyun Peng. ACCENT: An automatic event commonsense evaluation metric for open-domain dialogue systems. arXiv preprint arXiv:2305.07797, 2023.
- [15] Oded Goldreich. The Foundations of Cryptography – Volume 1: Basic Techniques. Cambridge University Press, 2001.
- [16] Google. adk-python. https://github.com/google/adk-python, 2025. Accessed 2025-12-02.
- [17] Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, and Ece Kamar. ToxiGen: A large-scale machine-generated dataset for adversarial and implicit hate speech detection. arXiv preprint arXiv:2203.09509, 2022.
- [18] Qian Huang, Jian Vora, Percy Liang, and Jure Leskovec. Benchmarking large language models as AI research agents. In NeurIPS 2023 Foundation Models for Decision Making Workshop, 2023.
- [19] Matthew Jagielski, Om Thakkar, Florian Tramer, Daphne Ippolito, Katherine Lee, Nicholas Carlini, Eric Wallace, Shuang Song, Abhradeep Guha Thakurta, Nicolas Papernot, and Chiyuan Zhang. Measuring forgetting of memorized training examples. In The Eleventh International Conference on Learning Representations, 2023.
- [20] Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik Narasimhan. SWE-bench: Can language models resolve real-world GitHub issues? arXiv preprint arXiv:2310.06770, 2023.
- [21] Siwon Kim, Sangdoo Yun, Hwaran Lee, Martin Gubri, Sungroh Yoon, and Seong Joon Oh. ProPILE: Probing privacy leakage in large language models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NIPS '23, Red Hook, NY, USA, 2023. Curran Associates Inc.
- [22] Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, et al. DevBench: A comprehensive benchmark for software development. CoRR, 2024.
- [23] Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, et al. AgentBench: Evaluating LLMs as agents. arXiv preprint arXiv:2308.03688, 2023.
- [24] Gianclaudio Malgieri and Bart Custers. Pricing privacy – the right to know the value of your personal data. Computer Law & Security Review, 34(2):289–303, 2018.
- [25] Federico Mazzone, Maarten Everts, Florian Hahn, and Andreas Peter. Efficient ranking, order statistics, and sorting under CKKS. In 34th USENIX Security Symposium (USENIX Security '25), Seattle, WA, August 2025. USENIX Association.
- [26] Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, and David Bau. Testing language model agents safely in the wild. arXiv preprint arXiv:2311.10538, 2023.
- [27] Helen Nissenbaum. Privacy as contextual integrity. Washington Law Review, 79(1):119–157, February 2004.
- [28] Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. Red teaming language models with language models. arXiv preprint arXiv:2202.03286, 2022.
- [29] Ethan Perez, Sam Ringer, Kamile Lukosiute, Karina Nguyen, Edwin Chen, Scott Heiner, Craig Pettit, Catherine Olsson, Sandipan Kundu, Saurav Kadavath, et al. Discovering language model behaviors with model-written evaluations. In Findings of the Association for Computational Linguistics: ACL 2023, pages 13387–13434, 2023.
- [30] Ronald L. Rivest, Len Adleman, Michael L. Dertouzos, et al. On data banks and privacy homomorphisms. Foundations of Secure Computation, (11):169–180, 1978.
- [31] Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, and Tatsunori Hashimoto. Identifying the risks of LM agents with an LM-emulated sandbox. arXiv preprint arXiv:2309.15817, 2023.
- [32] Amit Sahai and Brent Waters. Fuzzy identity-based encryption. In Ronald Cramer, editor, Advances in Cryptology – EUROCRYPT 2005, pages 457–473, Berlin, Heidelberg, 2005. Springer Berlin Heidelberg.
- [33] Adi Shamir. Identity-based cryptosystems and signature schemes. In George Robert Blakley and David Chaum, editors, Advances in Cryptology, pages 47–53, Berlin, Heidelberg, 1985. Springer Berlin Heidelberg.
- [34] Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, and Monica S. Lam. Assisting in writing Wikipedia-like articles from scratch with large language models. arXiv preprint arXiv:2402.14207, 2024.
- [35] Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, and Diyi Yang. PrivacyLens: Evaluating privacy norm awareness of language models in action. In The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024.
- [36] Quan Shi, Michael Tang, Karthik Narasimhan, and Shunyu Yao. Can language models solve olympiad programming? arXiv preprint arXiv:2404.10952, 2024.
- [37] Congzheng Song and Ananth Raghunathan. Information leakage in embedding models. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, CCS '20, pages 377–390, New York, NY, USA, 2020. Association for Computing Machinery.
- [38] Yue Wu, Xuan Tang, Tom M. Mitchell, and Yuanzhi Li. SmartPlay: A benchmark for LLMs as intelligent agents. arXiv preprint arXiv:2310.01557, 2023.
- [39] Shunyu Yao, Howard Chen, John Yang, and Karthik Narasimhan. WebShop: Towards scalable real-world web interaction with grounded language agents. Advances in Neural Information Processing Systems, 35:20744–20757, 2022.
- [40] Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, et al. WebArena: A realistic web environment for building autonomous agents. arXiv preprint arXiv:2307.13854, 2023.
- [41] Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, et al. Sotopia: Interactive evaluation for social intelligence in language agents. arXiv preprint arXiv:2310.11667, 2023.
- [42] Terry Yue Zhuo, Yujin Huang, Chunyang Chen, and Zhenchang Xing. Red teaming ChatGPT via jailbreaking: Bias, robustness, reliability and toxicity. 2023.