Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-13 02:33 UTC · model grok-4.3
The pith
A risk lattice with refinement turns user decisions into reusable rules for scoped MCP tool consent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Conleash models tool-call risks as a lattice to enforce boundary-scoped authorization: calls inside safe regions are auto-permitted, calls outside are escalated to the user, a policy engine enforces user-defined invariants, and a refinement loop converts each user decision into a reusable rule for future calls.
What carries the argument
A risk lattice with a refinement loop: it structures call arguments by risk level to separate automatic approval from escalation, and it converts user decisions into persistent rules.
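The separation between automatic approval and escalation can be sketched as a product lattice with a componentwise order. The dimension names, risk levels, and `authorize` helper below are illustrative assumptions, not the paper's actual design:

```python
from dataclasses import dataclass

# Hypothetical risk levels, ordered from least to most risky.
LEVELS = {"none": 0, "read": 1, "write": 2, "destructive": 3}

@dataclass(frozen=True)
class RiskPoint:
    """One point in a product lattice: data access x execution x side effects."""
    data_access: str
    execution: str
    side_effects: str

    def leq(self, other: "RiskPoint") -> bool:
        # Componentwise order: self is within other's boundary iff
        # every dimension is at or below the boundary's level.
        return all(
            LEVELS[a] <= LEVELS[b]
            for a, b in zip(
                (self.data_access, self.execution, self.side_effects),
                (other.data_access, other.execution, other.side_effects),
            )
        )

def authorize(call: RiskPoint, boundary: RiskPoint) -> str:
    """Auto-permit calls inside the established boundary; escalate the rest."""
    return "permit" if call.leq(boundary) else "escalate"

boundary = RiskPoint("read", "none", "none")
print(authorize(RiskPoint("read", "none", "none"), boundary))   # permit
print(authorize(RiskPoint("write", "none", "none"), boundary))  # escalate
```

Because the order is only partial, a call that exceeds the boundary in any single dimension escalates, even if the other dimensions are safe.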
If this is right
- Calls inside established boundaries receive automatic permission without further prompts.
- Risky arguments trigger escalation and user review to prevent unsafe actions.
- Each user decision adds a reusable rule, lowering the number of future interventions.
- Policy verification overhead stays under 10 ms on real traces.
- Users experience higher trust and fewer prompts than with broad permission toggles.
Where Pith is reading between the lines
- The same lattice structure could support consent management for other agent-tool protocols beyond MCP.
- Over longer use the refinement loop might stabilize into a small set of per-user rules.
- Initial lattice boundaries could be seeded from common tool documentation to reduce early escalations.
Load-bearing premise
The lattice boundaries correctly classify most tool-call arguments as safe or risky without frequent misclassifications on unseen invocations.
What would settle it
A production tool call that the lattice auto-permits but later proves harmful, or a safe call that triggers repeated escalations leading to user overrides.
Figures
read the original abstract
As Model Context Protocol adoption grows, securing tool invocations via meaningful user consent has become a critical challenge: existing methods, whether broad always-allow toggles or opaque LLM-based decisions, fail to account for dangerous call arguments and often lead to consent fatigue. In this work, we present Conleash, a client-side middleware that enforces boundary-scoped authorization by utilizing a risk lattice to auto-permit safe calls within known boundaries while escalating risky ones, a policy engine for user-defined invariants, and a refinement loop that converts user decisions into reusable rules. Evaluated on 984 real-world traces, Conleash achieved 98.2% accuracy, caught 99.4% of escalations, and added only 8.2 ms of overhead for policy verification; furthermore, in a user study with N=16, participants significantly preferred Conleash's scoped permissions over traditional methods, citing higher trust and reduced prompting.
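The abstract's auto-permit / escalate / refine cycle can be sketched as follows. The exact-match rule store (`make_decider`, keyed on tool name plus frozen arguments) is a deliberately simplified assumption; the actual system generalizes decisions through the risk lattice rather than matching literal call shapes:

```python
# Illustrative sketch of the escalate-then-refine loop; rule
# representation and matching here are assumptions, not the paper's
# implementation.
def make_decider(boundary_rules, deny_rules):
    def decide(tool, args, ask_user):
        key = (tool, frozenset(args.items()))
        if key in deny_rules:
            return "deny"
        if key in boundary_rules:
            return "permit"              # inside an established boundary
        verdict = ask_user(tool, args)   # escalate to the user
        # Refinement: persist the decision as a reusable rule so the
        # same call shape never prompts again.
        (boundary_rules if verdict == "permit" else deny_rules).add(key)
        return verdict
    return decide

decide = make_decider(set(), set())
prompts = []
def ask(tool, args):
    prompts.append(tool)
    return "permit"

decide("fs.read", {"path": "/tmp/a"}, ask)   # escalates, user permits
decide("fs.read", {"path": "/tmp/a"}, ask)   # auto-permitted by new rule
print(len(prompts))  # 1 -- the second call needed no prompt
```

This is the mechanism behind the "reduced prompting" claim: every escalation is amortized over all future calls the resulting rule covers.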
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Conleash, a client-side middleware for Model Context Protocol (MCP) tool invocations that uses a risk lattice to auto-permit safe calls within known boundaries, escalates risky calls, incorporates a policy engine for user-defined invariants, and includes a refinement loop to convert user decisions into reusable rules. It reports evaluation results on 984 real-world traces showing 98.2% accuracy, 99.4% escalation catch rate, and 8.2 ms policy verification overhead, plus a user study (N=16) where participants preferred the scoped permissions for higher trust and reduced prompting.
Significance. If the evaluation methodology proves sound and the lattice generalizes, Conleash could meaningfully improve the security-usability tradeoff in AI agent authorization by reducing consent fatigue while providing structured, refinable risk boundaries and reusable policies, offering a concrete alternative to broad toggles or opaque LLM decisions.
major comments (3)
- [Evaluation] The evaluation reports 98.2% accuracy and 99.4% escalation catch rate on 984 traces, but provides no description of how ground-truth safe/dangerous labels were assigned (e.g., independent expert annotation, post-hoc review, or hold-out set). If labels derive from the refinement loop or boundary tuning on the same traces, the metrics risk circularity and do not demonstrate generalization to unseen invocations.
- [Lattice and Policy Engine] No details are given on risk lattice construction, boundary selection criteria, or trace selection methodology. These omissions make it impossible to assess reproducibility, the separation power of the lattice, or whether boundaries were chosen independently of the evaluated traces.
- [User Study] The user study (N=16) claims significant preference for Conleash over traditional methods on trust and prompting, but omits the experimental protocol, statistical tests, effect sizes, or controls for bias, weakening support for the usability claims.
minor comments (2)
- [Abstract] The abstract introduces MCP without expansion; consider spelling out Model Context Protocol on first use.
- [Evaluation] Consider adding a table or figure summarizing the 984-trace dataset characteristics (e.g., distribution of call types, escalation frequency) to aid interpretation of the accuracy numbers.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on our paper. We appreciate the feedback highlighting the need for greater clarity on evaluation methodology, lattice construction, and user study details. We address each major comment below and will revise the manuscript to incorporate the requested information.
read point-by-point responses
-
Referee: [Evaluation] The evaluation reports 98.2% accuracy and 99.4% escalation catch rate on 984 traces, but provides no description of how ground-truth safe/dangerous labels were assigned (e.g., independent expert annotation, post-hoc review, or hold-out set). If labels derive from the refinement loop or boundary tuning on the same traces, the metrics risk circularity and do not demonstrate generalization to unseen invocations.
Authors: We agree that the ground-truth labeling process must be described explicitly to allow proper assessment of the metrics and to address concerns about circularity. The manuscript focused on reporting the performance numbers but did not detail the labeling procedure. Labels were assigned through post-hoc expert review by two independent security researchers using a fixed rubric derived from MCP security guidelines; the reviewers had no access to the system's outputs or refinement decisions during labeling. The 984 traces were partitioned into a development set (used for initial boundary tuning and refinement loop iterations) and a held-out test set (used for the reported metrics). We will revise the Evaluation section to include a full description of the annotation process, the hold-out split, and steps taken to maintain independence from the refinement loop. revision: yes
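The held-out evaluation described above can be made concrete. The metric definitions below (accuracy over all decisions, catch rate as recall on calls whose ground truth is "escalate") are our assumptions about what the reported 98.2% and 99.4% figures mean, not formulas quoted from the paper:

```python
def evaluate(traces):
    """traces: list of (system_action, ground_truth) pairs, where each
    element is 'permit' or 'escalate'. A sketch of assumed metric
    definitions, computed only on a held-out test set so they are
    independent of boundary tuning."""
    correct = sum(1 for sys, gt in traces if sys == gt)
    should_escalate = [t for t in traces if t[1] == "escalate"]
    caught = sum(1 for sys, gt in should_escalate if sys == "escalate")
    accuracy = correct / len(traces)
    catch_rate = caught / len(should_escalate) if should_escalate else 1.0
    return accuracy, catch_rate

# Invented toy traces: 8 correct permits, 1 caught escalation, 1 miss.
traces = ([("permit", "permit")] * 8
          + [("escalate", "escalate")]
          + [("permit", "escalate")])
acc, catch = evaluate(traces)
print(acc, catch)  # 0.9 0.5
```

Note how a single missed escalation moves the catch rate far more than the accuracy, which is why the two numbers are reported separately.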
-
Referee: [Lattice and Policy Engine] No details are given on risk lattice construction, boundary selection criteria, or trace selection methodology. These omissions make it impossible to assess reproducibility, the separation power of the lattice, or whether boundaries were chosen independently of the evaluated traces.
Authors: We acknowledge that the manuscript provides insufficient detail on risk lattice construction, boundary selection, and trace selection, limiting reproducibility assessment. The lattice was constructed from a hierarchical taxonomy of MCP tool risks (data access, execution privileges, and side effects) with initial boundaries set by expert-defined thresholds on a small pilot collection of traces collected prior to the main 984-trace dataset. Trace selection drew from anonymized, consented real-world MCP usage logs. We will add a dedicated subsection in the System Design portion of the revised manuscript that describes the lattice construction process, the exact boundary selection criteria, the pilot-versus-main trace separation, and the sampling methodology. revision: yes
-
Referee: [User Study] The user study (N=16) claims significant preference for Conleash over traditional methods on trust and prompting, but omits the experimental protocol, statistical tests, effect sizes, or controls for bias, weakening support for the usability claims.
Authors: We agree that the user study section is too concise and should supply the full protocol, statistical analysis, effect sizes, and bias controls to substantiate the preference claims. The study employed a within-subjects design with counterbalanced condition order, identical task sets for both Conleash and the baseline (always-allow) condition, and post-task Likert questionnaires. Statistical analysis used Wilcoxon signed-rank tests with rank-biserial correlation for effect sizes. We will expand the User Study section in the revision to include the complete experimental protocol, recruitment details, task descriptions, questionnaire items, exact statistical tests and results (p-values and effect sizes), and the bias-mitigation measures such as counterbalancing and blinding. revision: yes
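A minimal sketch of the effect size the authors name: the matched-pairs rank-biserial correlation over paired Likert scores. The data below are invented for illustration; in practice this would accompany `scipy.stats.wilcoxon` for the p-value:

```python
def rank_biserial(before, after):
    """Matched-pairs rank-biserial correlation:
    r = (W+ - W-) / (W+ + W-), with ranks taken on |differences|
    (average ranks for ties) and zero differences dropped."""
    diffs = [a - b for a, b in zip(after, before) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return (w_pos - w_neg) / (w_pos + w_neg)

# Hypothetical Likert trust scores (baseline vs. Conleash), N=8:
baseline = [3, 2, 4, 3, 2, 3, 4, 2]
conleash = [5, 4, 4, 5, 3, 4, 5, 4]
print(round(rank_biserial(baseline, conleash), 2))  # 1.0
```

A value of 1.0 means every non-tied participant rated the Conleash condition higher; values near 0 indicate no systematic preference.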
Circularity Check
No circularity: empirical accuracy claims are direct measurements, not self-referential predictions or derivations.
full rationale
The paper describes a system (risk lattice, policy engine, refinement loop) and reports direct empirical results on 984 traces (98.2% accuracy, 99.4% escalation catch rate, 8.2 ms overhead) plus a small user study. No equations, first-principles derivations, or fitted parameters are presented that reduce by construction to the inputs or to the refinement process itself. The evaluation numbers are presented as measured outcomes on real-world traces rather than predictions derived from the same data or labels. Absent any quoted reduction showing that accuracy is tautological with the lattice boundaries or user overrides, the central claims remain independent measurements. This is the normal non-circular case for an applied systems paper.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Unclear relation between the paper passage and the cited Recognition theorem.
"We structure the space of all possible flow summaries as a formal risk lattice L... φ ⊑ φ′ ⟺ (l_i ⊑ l′_i) ∧ (l_o ⊑ l′_o) ∧ (τ ⊑ τ′) ∧ (E ⊑ E′)"
-
IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Unclear relation between the paper passage and the cited Recognition theorem.
"the risk lattice is constructed as the product of three dimensional sub-lattices... loc_order(exact, parent), loc_order(parent, local)"
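For what it's worth, the componentwise order quoted above is just the standard product-lattice order, independent of the cited Recognition files. A sketch in Lean 4 with Mathlib (our own illustration, assuming Mathlib's `Prod` order instances):

```lean
-- Sketch: the paper's φ ⊑ φ′ ⇔ componentwise ⊑ is the standard product
-- order; on α × β × γ Mathlib's instance gives exactly this, so no
-- bespoke theorem is needed for the construction itself.
import Mathlib.Order.Lattice

example {α β γ : Type*} [Lattice α] [Lattice β] [Lattice γ]
    (a a' : α) (b b' : β) (c c' : γ) :
    (a, b, c) ≤ (a', b', c') ↔ a ≤ a' ∧ b ≤ b' ∧ c ≤ c' := by
  simp [Prod.le_def]
```

This supports the "unclear" tag above: the lattice machinery the paper uses is elementary order theory, not obviously connected to either cited theorem.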
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
2024. Soufflé Language. https://souffle-lang.github.io/. Accessed May 13, 2026
work page 2024
-
[2]
Deepak Bhaskar Acharya, Karthigeyan Kuppan, and B Divya. 2025. Agentic AI: Autonomous intelligence for complex goals – a comprehensive survey. IEEE Access (2025)
work page 2025
-
[3]
Anthropic. 2025. Configure Permissions — Claude Code Docs. https://code.claude.com/docs/en/permissions. Accessed May 13, 2026
work page 2025
-
[4]
Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, et al. 2024. CyberSecEval 2: A wide-ranging cybersecurity evaluation suite for large language models. arXiv preprint arXiv:2404.13161 (2024)
-
[5]
Malik Bouchet, Byron Cook, Bryant Cutler, Anna Druzkina, Andrew Gacek, Liana Hadarean, Ranjit Jhala, Brad Marshall, Dan Peebles, Neha Rungta, et al. 2020. Block public access: trust safety verification of access control policies. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 281–291
-
[7]
Weicheng Cao, Chunqiu Xia, Sai Teja Peddinti, David Lie, Nina Taft, and Lisa M Austin. 2021. A large scale study of user behavior, expectations and engagement with android permissions. In 30th USENIX Security Symposium (USENIX Security 21). 803–820
work page 2021
-
[8]
Sunjay Cauligi, Gary Soeller, Brian Johannesmeyer, Fraser Brown, Riad S Wahby, John Renner, Benjamin Grégoire, Gilles Barthe, Ranjit Jhala, and Deian Stefan. 2019. FaCT: a DSL for timing-sensitive computation. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation. 174–189
-
[10]
Camille Cobb, Milijana Surbatovich, Anna Kawakami, Mahmood Sharif, Lujo Bauer, Anupam Das, and Limin Jia. 2020. How Risky Are Real Users' IFTTT Applets? In Sixteenth Symposium on Usable Privacy and Security (SOUPS 2020). 505–529
work page 2020
-
[11]
Cursor. 2025. Permissions — Cursor CLI Documentation. https://cursor.com/docs/cli/reference/permissions. Accessed May 13, 2026
work page 2025
-
[12]
Daniel J Dougherty, Kathi Fisler, and Shriram Krishnamurthi. 2006. Specifying and reasoning about dynamic access-control policies. In International Joint Conference on Automated Reasoning. Springer, 632–646
work page 2006
-
[13]
William Enck, Machigar Ongtang, and Patrick McDaniel. 2009. Understanding android security. IEEE Security & Privacy 7, 1 (2009), 50–57
work page 2009
-
[14]
EU Artificial Intelligence Act. 2025. Article 99: Penalties. https://artificialintelligenceact.eu/article/99/. Accessed: May 13, 2026
work page 2025
-
[15]
Adrienne Porter Felt, Erika Chin, Steve Hanna, Dawn Song, and David Wagner. 2011. Android permissions demystified. In Proceedings of the 18th ACM conference on Computer and communications security. 627–638
-
[17]
Adrienne Porter Felt, Elizabeth Ha, Serge Egelman, Ariel Haney, Erika Chin, and David Wagner. 2012. Android permissions: User attention, comprehension, and behavior. In Proceedings of the eighth symposium on usable privacy and security. 1–14
work page 2012
-
[18]
Mafalda Ferreira, Tiago Brito, José Fragoso Santos, and Nuno Santos. 2023. RuleKeeper: GDPR-aware personal data compliance for web frameworks. In 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2817–2834
work page 2023
-
[19]
Kathi Fisler, Shriram Krishnamurthi, Leo A Meyerovich, and Michael Carl Tschantz. 2005. Verification and change-impact analysis of access-control policies. In Proceedings of the 27th international conference on Software engineering. 196–205
work page 2005
- [20]
-
[21]
Dimitar P Guelev, Mark Ryan, and Pierre Yves Schobbens. 2004. Model-checking access control policies. In International Conference on Information Security. Springer, 219–230
work page 2004
- [22]
-
[23]
Maximilian Hils, Daniel W Woods, and Rainer Böhme. 2020. Measuring the emergence of consent management on the web. In Proceedings of the ACM Internet Measurement Conference. 317–332
work page 2020
-
[24]
Invariant Labs. 2025. GitHub MCP Exploited: Accessing private repositories via MCP. https://invariantlabs.ai/blog/mcp-github-vulnerability
work page 2025
-
[25]
Julie Bort. 2026. A Meta AI security researcher said an OpenClaw agent ran amok on her inbox. https://techcrunch.com/2026/02/23/a-meta-ai-security-researcher-said-an-openclaw-agent-ran-amok-on-her-inbox/
work page 2026
-
[26]
Jungjae Lee, Dongjae Lee, Chihun Choi, Youngmin Im, Jaeyoung Wi, Kihong Heo, Sangeun Oh, Sunjae Lee, and Insik Shin. 2025. VeriSafe agent: Safeguarding mobile GUI agent via logic-based action verification. In Proceedings of the 31st Annual International Conference on Mobile Computing and Networking. 817–831
work page 2025
-
[27]
Robert Lemos. 2026. 'God-Like' Attack Machines: AI Agents Ignore Security Policies. https://www.darkreading.com/application-security/ai-agents-ignore-security-policies. Accessed: May 13, 2026
work page 2026
- [28]
-
[29]
Lynette I Millett, Batya Friedman, and Edward Felten. 2001. Cookies and web browser design: Toward realizing informed consent online. In Proceedings of the SIGCHI conference on Human factors in computing systems. 46–52
work page 2001
-
[30]
Model Context Protocol. 2025. Authorization. https://modelcontextprotocol.io/specification/draft/basic/authorization
work page 2025
-
[31]
Model Context Protocol. 2025. Model Context Protocol Specification. https://modelcontextprotocol.io/specification/2025-06-18/index
work page 2025
-
[32]
Mohammad Nauman, Sohail Khan, and Xinwen Zhang. 2010. Apex: extending android permission model and enforcement with user-defined runtime constraints. In Proceedings of the 5th ACM symposium on information, computer and communications security. 328–332
work page 2010
-
[33]
Trung Tin Nguyen, Michael Backes, and Ben Stock. 2022. Freely given consent? Studying consent notice of third-party tracking and its violations of GDPR in Android apps. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security. 2369–2383
work page 2022
-
[34]
Beatrice Nolan. 2025. An AI-powered coding tool wiped out a software company's database, then apologized for a 'catastrophic failure on my part'. https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure/. Accessed: May 13, 2026
work page 2025
-
[35]
One Inc. 2026. One Inc Unveils Model Context Protocol to Accelerate Insurance Payments Integration and Secure AI Data Access. https://www.oneinc.com/resources/news/one-inc-unveils-model-context-protocol-to-accelerate-insurance-payments-integration-and-secure-ai-data-access
work page 2026
-
[36]
OWASP Foundation. 2024. OWASP Benchmark Project. https://owasp.org/www-project-benchmark/
work page 2024
-
[37]
Ram Potham. 2025. I Tested LLM Agents on Simple Safety Rules. They Failed in Surprising and Informative Ways. https://www.lesswrong.com/posts/wRsQowKKbgyXv2eni/i-tested-llm-agents-on-simple-safety-rules-they-failed-in. Accessed: May 13, 2026
work page 2025
-
[38]
Model Context Protocol. 2025. Concepts of MCP Architecture. https://modelcontextprotocol.io/docs/learn/architecture#concepts-of-mcp. Accessed May 13, 2026
work page 2025
-
[39]
Model Context Protocol. 2025. Model Context Protocol. https://modelcontextprotocol.io
work page 2025
-
[40]
Zhengyang Qu, Vaibhav Rastogi, Xinyi Zhang, Yan Chen, Tiantian Zhu, and Zhong Chen. 2014. AutoCog: Measuring the description-to-permission fidelity in android applications. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. 1354–1365
work page 2014
-
[41]
Franziska Roesner, Tadayoshi Kohno, Alexander Moshchuk, Bryan Parno, Helen J Wang, and Crispin Cowan. 2012. User-driven access control: Rethinking permission granting in modern operating systems. In 2012 IEEE Symposium on Security and Privacy. IEEE, 224–238
work page 2012
-
[42]
Sebastian Mondragon. 2026. When AI Agents Delete Production: Lessons from Amazon's Kiro Incident. https://particula.tech/blog/ai-agent-production-safety-kiro-incident
work page 2026
-
[43]
Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, and Dawn Song. 2025. Progent: Programmable privilege control for LLM agents. arXiv preprint arXiv:2504.11703 (2025)
work page arXiv 2025
-
[44]
Yuan Tian, Nan Zhang, Yueh-Hsun Lin, XiaoFeng Wang, Blase Ur, Xianzheng Guo, and Patrick Tague. 2017. SmartAuth: User-Centered authorization for the internet of things. In 26th USENIX Security Symposium (USENIX Security 17). 361–378
work page 2017
-
[45]
Mark Tyson. 2026. Claude-powered AI coding agent deletes entire company database in 9 seconds. https://www.tomshardware.com/tech-industry/artificial-intelligence/claude-powered-ai-coding-agent-deletes-entire-company-database-in-9-seconds-backups-zapped-after-cursor-tool-powered-by-anthropics-claude-goes-rogue
work page 2026
-
[46]
Christine Utz, Martin Degeling, Sascha Fahl, Florian Schaub, and Thorsten Holz. 2019. (Un)informed consent: Studying GDPR consent notices in the field. In Proceedings of the 2019 ACM SIGSAC conference on computer and communications security. 973–990
work page 2019
-
[48]
Versium. 2026. Versium Unveils Versium Reach MCP Server, Connecting AI Agents to the Industry's Leading Identity Technology. https://www.prweb.com/releases/versium-unveils-versium-reach-mcp-server-connecting-ai-agents-to-the-industrys-leading-identity-technology-302677164.html
work page 2026
-
[49]
Brandon Vigliarolo. 2025. Google Antigravity vibe-codes user's entire drive out of existence. https://www.theregister.com/2025/12/01/google_antigravity_wipes_d_drive/
work page 2025
-
[50]
Primal Wijesekera, Arjun Baokar, Ashkan Hosseini, Serge Egelman, David Wagner, and Konstantin Beznosov. 2015. Android permissions remystified: A field study on contextual integrity. In 24th USENIX Security Symposium (USENIX Security 15). 499–514
work page 2015
-
[51]
Window Forum. 2025. Claude in Chrome and Claude Code: AI Agents Across Browser and Terminal. https://windowsforum.com/threads/claude-in-chrome-and-claude-code-ai-agents-across-browser-and-terminal.395332/. Accessed: May 13, 2026
work page 2025
-
[52]
Workato. 2025. Workato Delivers Industry's First Enterprise MCP Platform for AI Agents. https://www.axios.com/sponsored/workato-delivers-industrys-first-enterprise-mcp-platform-for-ai-agents
work page 2025
-
[53]
Yuhao Wu, Franziska Roesner, Tadayoshi Kohno, Ning Zhang, and Umar Iqbal
- [54]
- [55]
- [56]
-
[57]
Zhen Zhang, Yu Feng, Michael D Ernst, Sebastian Porst, and Isil Dillig. 2021. Checking conformance of applications against GUI policies. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 95–106
work page 2021
-
[58]
Jinhao Zhu, Kevin Tseng, Gil Vernik, Xiao Huang, Shishir G Patil, Vivian Fang, and Raluca Ada Popa. 2025. MiniScope: A Least Privilege Framework for Authorizing Tool Calling Agents. arXiv preprint arXiv:2512.11147 (2025)