PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems

Shafizur Rahman Seeam; Yidan Hu; Yimin (Ian) Chen; Zhengxiong Li; Zhiyuan Yu

arxiv: 2605.16630 · v2 · pith:EEO673QNnew · submitted 2026-05-15 · 💻 cs.CR · cs.AI

Pith reviewed 2026-05-20 16:12 UTC · model grok-4.3

classification 💻 cs.CR cs.AI

keywords privacy · disclosure control · hybrid agents · cloud language models · task scoping · data leakage prevention · information abstraction · agentic systems

The pith

Task-scoped disclosure control on device can prevent over-disclosure to cloud models in hybrid agents while preserving task performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that a local trusted governor can enforce task-scoped disclosure for hybrid agents that delegate to cloud language models. By extracting disclosure units, keeping sensitive identifiers local, and abstracting only the necessary minimal information, it reduces unnecessary exposure from persistent state and prior workflows. A reader would care if this holds because over-disclosure leads to profile leakage and higher re-identification success by attackers. If the approach works, agents can use rich local context without sending excess sensitive data to the cloud.

Core claim

PrivScope presents a trusted on-device payload governor that enforces task-scoped disclosure at the local-cloud language model boundary without requiring changes to the cloud models. The key idea is that sensitive information should reach the cloud only when required for the delegated subtask, and then only in the least revealing form that preserves utility. It extracts disclosure units from the assembled payload, keeps direct identifiers and account-linked values on device, and routes the rest through a cloud-necessity control that determines actual needs and abstracts to least-specific representations.
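The first two steps of this flow (unit extraction and keeping direct identifiers on device) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the regex patterns, the `DisclosureUnit` type, the `govern` function, and the placeholder format are all hypothetical, and the cloud-necessity control is deliberately left out.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for direct identifiers; a real extractor would be richer.
DIRECT_ID_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

@dataclass
class DisclosureUnit:
    kind: str
    value: str

def extract_units(payload: str) -> list[DisclosureUnit]:
    """Find candidate disclosure units in the assembled payload."""
    units = []
    for kind, pattern in DIRECT_ID_PATTERNS.items():
        units += [DisclosureUnit(kind, m) for m in pattern.findall(payload)]
    return units

def govern(payload: str) -> tuple[str, dict[str, str]]:
    """Swap direct identifiers for placeholders; the mapping never leaves the device."""
    local_binding: dict[str, str] = {}
    cloud_payload = payload
    for i, unit in enumerate(extract_units(payload)):
        placeholder = f"<{unit.kind.upper()}_{i}>"
        local_binding[placeholder] = unit.value
        cloud_payload = cloud_payload.replace(unit.value, placeholder)
    return cloud_payload, local_binding

payload = "Book a dermatologist for jane.doe@example.com, call 555-123-4567."
cloud_payload, binding = govern(payload)
print(cloud_payload)
```

Keeping the placeholder-to-value mapping in a local dictionary is one plausible way to re-bind identifiers when the cloud response comes back; the paper may handle this differently.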

What carries the argument

cloud-necessity control, which determines the minimal information required for each subtask and abstracts it to the least-specific representation sufficient for the task
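One way to picture "least-specific representation sufficient for the task" is a generalization ladder per attribute, where the governor sends the coarsest rung that still meets the subtask's need. The sketch below assumes a hand-written table in place of the paper's necessity control; the subtask names, the `REQUIRED_LEVEL` table, and the level names are hypothetical.

```python
from datetime import date
from typing import Optional

def age_on(birth: date, today: date) -> int:
    """Whole years elapsed between birth and today."""
    return today.year - birth.year - ((today.month, today.day) < (birth.month, birth.day))

def abstract_birth_date(birth: date, level: str, today: date) -> Optional[str]:
    """Return the birth date at the requested specificity, or None to withhold it."""
    age = age_on(birth, today)
    if level == "withhold":
        return None
    if level == "age_band":
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    if level == "age":
        return str(age)
    if level == "exact":
        return birth.isoformat()
    raise ValueError(f"unknown level: {level}")

# Hypothetical necessity table: the coarsest level each subtask can work with.
REQUIRED_LEVEL = {
    "find_dermatologist": "withhold",  # age assumed irrelevant to this search
    "find_pediatrician": "age_band",   # only the rough age group is assumed to matter
}

def scoped_birth_date(subtask: str, birth: date, today: date) -> Optional[str]:
    # Unknown subtasks default to withholding, the most conservative choice.
    level = REQUIRED_LEVEL.get(subtask, "withhold")
    return abstract_birth_date(birth, level, today)
```

Defaulting to "withhold" for unknown subtasks favors privacy over utility; a deployed system would need a path for escalating when the cloud model cannot proceed.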

If this is right

Profile leakage drops to zero in the tested workflows compared to 17.7 percent without control
Attacker re-identification success is more than halved from 64.3 percent to 23.1 percent
Highest candidate recall is reached on every cloud language model tested
Task success stays close to the unprotected baseline on GPT-4o-mini and Gemini 2.5 Flash
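The "more than halved" and "drops to zero" statements can be checked with simple arithmetic on the reported percentages. The helper name below is illustrative; the numbers are the ones quoted above.

```python
def relative_reduction(baseline: float, treated: float) -> float:
    """Fraction of the baseline rate removed by the treatment."""
    return (baseline - treated) / baseline

reid = relative_reduction(64.3, 23.1)  # attacker re-identification, percent
leak = relative_reduction(17.7, 0.0)   # profile leakage, percent
print(f"re-identification: {reid:.1%} relative reduction")
print(f"profile leakage:   {leak:.1%} relative reduction")
```

Re-identification falls by 41.2 percentage points, a relative reduction of roughly 64 percent, which is consistent with "more than halved".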

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same boundary control could apply to other hybrid setups where local state is enriched before delegation to external services.
Abstraction rules might extend beyond text to handle structured records or other data formats in agent payloads.
Long-running agents with accumulating context could benefit from repeated scoping to limit cumulative exposure over multiple workflows.

Load-bearing premise

The cloud-necessity control can reliably determine the minimal information required for each delegated subtask, and abstraction to the least-specific representation still allows the cloud model to complete the task without needing additional context.

What would settle it

Running the same medical-booking workflows with only the abstracted data and checking whether task completion rates stay close to the unprotected baseline or whether the models request extra context to succeed.
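A check of this kind could be summarized as a gap in success rates with an uncertainty band. The sketch below uses a normal-approximation interval and treats the two runs as independent samples, which is a simplification because the same workflows appear in both conditions. The counts are placeholders chosen for illustration, not results from the paper.

```python
import math

def success_gap(base_ok: int, prot_ok: int, n: int, z: float = 1.96):
    """Protected-minus-baseline success rate with a normal-approximation interval."""
    p_base, p_prot = base_ok / n, prot_ok / n
    gap = p_prot - p_base
    se = math.sqrt(p_base * (1 - p_base) / n + p_prot * (1 - p_prot) / n)
    return gap, (gap - z * se, gap + z * se)

# Placeholder counts (hypothetical): successes out of 100 workflows per condition.
gap, (low, high) = success_gap(base_ok=90, prot_ok=86, n=100)
print(f"gap = {gap:+.2f}, interval = ({low:+.3f}, {high:+.3f})")
```

With only 100 workflows the interval is wide, so "close to the baseline" is hard to distinguish from a moderate loss without more runs or a paired analysis.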

Figures

Figures reproduced from arXiv: 2605.16630 by Shafizur Rahman Seeam, Yidan Hu, Yimin (Ian) Chen, Zhengxiong Li, Zhiyuan Yu.

**Figure 1.** High-level overview of PRIVSCOPE. PRIVSCOPE mediates an over-inclusive LC→CLM payload, producing a task-sufficient cloud-visible version while keeping private context on device.

**Figure 2.** Hybrid local–cloud agent architecture. A trusted on-device …

**Figure 3.** Overview of P…

**Figure 4.** The extractor combines profile matching, structured-pattern …

**Figure 5.** Role assignment partitions extracted disclosure units into …

**Figure 6.** Task-sufficient abstraction over cloud-needed units. The b…

**Figure 7.** Sensitivity to the local model backbone. For each backbone, the same local model serves as both the LC controller and the sanitizer.

**Figure 8.** On-device sanitization latency of PRIVSCOPE across five local backbones, decomposed by pipeline stage and ordered by total runtime. Cloud-necessity analysis dominates latency; unit extraction contributes < 2% across all backbones.

**Figure 9.** Cloud API cost per 1,000 tasks across three commercial …

Original abstract

Hybrid local–cloud agents enrich user requests with context from persistent working state before delegating capability-intensive subtasks to a cloud language model (CLM). While this enrichment can improve task success, it also exposes unnecessary information in the cloud-bound payload, including task-irrelevant context, carryover from prior workflows, and overly specific sensitive details, resulting in over-disclosure. Existing solutions either isolate workflows to limit cross-workflow leakage or apply general-purpose sanitization that does not reason over LC-assembled payload scope. We present PrivScope, a trusted on-device payload governor that enforces task-scoped disclosure at the local–CLM boundary, without requiring cloud-side changes. Its key idea: sensitive information should reach the cloud only when required for the delegated subtask, and then only in the least revealing form preserving utility. PrivScope extracts disclosure units from the assembled payload and keeps direct identifiers and account-linked values on device. The remaining units pass through cloud-necessity control, which determines what is actually needed; units that must reach the cloud are abstracted to the least-specific representation sufficient for the task. On 100 medical-booking workflows across three commercial CLMs, PrivScope eliminates profile leakage (0.0% vs. 17.7%), more than halves attacker re-identification (23.1% vs. 64.3%), and achieves the highest candidate recall on every CLM tested while preserving task success close to the unprotected baseline on GPT-4o-mini and Gemini 2.5 Flash. Gains hold across five local backbones and add only seconds of on-device latency on commodity hardware.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PrivScope adds an on-device necessity check and abstraction layer for hybrid agent payloads, with reported privacy gains on medical workflows that still rest on untested control accuracy.

PrivScope puts an on-device governor between local agent state and cloud models to enforce task-scoped disclosure. It pulls out disclosure units from the assembled payload, holds direct identifiers locally, runs a necessity check for the delegated subtask, and abstracts the rest to the least-specific form that still lets the cloud model finish the work. That combination is the concrete step beyond general sanitization or workflow isolation.

On the 100 medical-booking workflows the abstract reports zero profile leakage against a 17.7 percent baseline, re-identification cut from 64.3 to 23.1 percent, top candidate recall across the three CLMs, and task success staying close to the unprotected case on GPT-4o-mini and Gemini 2.5 Flash, all with only seconds of added latency on commodity hardware. Those numbers are the clearest evidence the paper supplies.

The central assumption is that the necessity control can reliably pick the minimal required units and that the abstraction step will not force the cloud model to ask for more context or fail. The abstract gives no accuracy figures for that control, no error bars, no exclusion rules, and no stress tests on ambiguous information needs, so the privacy-utility tradeoff shown here could shrink or disappear once the control is measured directly. The evaluation is also limited to one workflow family and commercial CLMs, which leaves open how the same governor behaves on other domains or open models.

This is useful reading for anyone building or securing local-cloud agents who wants a scoped alternative to blanket isolation. The idea and the reported deltas are worth a referee's time even if the methods section needs expansion on the control's own performance.

Referee Report

2 major / 2 minor

Summary. PrivScope is an on-device payload governor for hybrid local-cloud agentic systems that enforces task-scoped disclosure: it extracts disclosure units from the assembled payload, retains direct identifiers and account-linked values locally, routes remaining units through a cloud-necessity control to decide what must reach the CLM, and abstracts those units to the least-specific representation that still permits task completion. The paper evaluates the system on 100 medical-booking workflows across three commercial CLMs (and five local backbones), reporting elimination of profile leakage (0.0% vs. 17.7%), more than halved attacker re-identification (23.1% vs. 64.3%), highest candidate recall on every CLM, and task success close to the unprotected baseline on GPT-4o-mini and Gemini 2.5 Flash, with only seconds of added on-device latency.

Significance. If the cloud-necessity control and abstraction steps are shown to be reliable, PrivScope would provide a concrete, deployable mechanism for reducing over-disclosure at the local-CLM boundary without requiring changes to cloud models. The empirical results on concrete leakage and re-identification metrics, together with the preservation of task utility, would constitute a useful data point for privacy engineering in agentic workflows.

major comments (2)

[Cloud-necessity control and evaluation sections] The central privacy-utility claims rest on the accuracy of the cloud-necessity control and the utility of the subsequent abstraction step. The manuscript should report independent accuracy metrics for the control (e.g., precision/recall against ground-truth necessity labels) and ablation results showing how false-positive or false-negative decisions affect both leakage and task success; without these, the reported 0.0% leakage and halved re-identification cannot be confidently attributed to the mechanism rather than to the specific 100-workflow test set.
[Evaluation] The evaluation protocol (data exclusion rules, exact definition of profile leakage and attacker re-identification, prompt templates for the three CLMs, and how task success is scored) is not described with sufficient detail to allow reproduction or to assess whether the 100 medical-booking workflows contain edge cases that would stress the necessity control.
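The accuracy metrics requested in the first major comment could be computed per workflow along the following lines. This is a sketch under assumptions: the unit labels and the set-based representation are hypothetical, and the paper does not describe its ground-truth labeling.

```python
def precision_recall(predicted: set[str], needed: set[str]) -> tuple[float, float]:
    """Precision and recall of the units the control marked as cloud-necessary."""
    true_pos = len(predicted & needed)
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / len(needed) if needed else 1.0
    return precision, recall

# Hypothetical workflow: unit labels are illustrative only.
predicted = {"specialty", "city", "insurance_plan"}  # what the control would send
needed = {"specialty", "city"}                       # annotated ground truth
precision, recall = precision_recall(predicted, needed)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this framing, low precision corresponds to over-disclosure (a privacy cost) and low recall to withheld but necessary information (a utility cost), which is why the two need to be reported separately.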

minor comments (2)

[Threat model] Clarify the exact threat model for the attacker re-identification metric (e.g., what auxiliary information the attacker is assumed to possess).
[Discussion] Add a short discussion of failure modes when the abstraction step produces a representation that is still insufficient for the CLM.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate the requested additions where they strengthen the work.

Referee: [Cloud-necessity control and evaluation sections] The central privacy-utility claims rest on the accuracy of the cloud-necessity control and the utility of the subsequent abstraction step. The manuscript should report independent accuracy metrics for the control (e.g., precision/recall against ground-truth necessity labels) and ablation results showing how false-positive or false-negative decisions affect both leakage and task success; without these, the reported 0.0% leakage and halved re-identification cannot be confidently attributed to the mechanism rather than to the specific 100-workflow test set.

Authors: We agree that independent accuracy metrics and ablations would improve attribution of the observed privacy gains. In the revised manuscript we will add a new subsection reporting precision and recall of the cloud-necessity control against manually annotated ground-truth necessity labels on a held-out portion of the workflows. We will also include ablation results that inject controlled false-positive and false-negative decisions into the control and quantify the resulting changes in leakage and task-success metrics. These additions will make the causal link between the mechanism and the reported outcomes more explicit.
Revision: yes

Referee: [Evaluation] The evaluation protocol (data exclusion rules, exact definition of profile leakage and attacker re-identification, prompt templates for the three CLMs, and how task success is scored) is not described with sufficient detail to allow reproduction or to assess whether the 100 medical-booking workflows contain edge cases that would stress the necessity control.

Authors: We accept that the current Evaluation section omits several details required for reproducibility. The revised version will expand this section to specify the exact data exclusion rules, provide formal definitions of profile leakage and attacker re-identification, reproduce the prompt templates used with each CLM, and describe the task-success scoring procedure. We will also add a short analysis of workflow characteristics, highlighting any edge cases that could challenge the necessity control.
Revision: yes
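The error-injection ablation mentioned in the first response could be set up roughly as below. This is a hypothetical sketch, not the authors' protocol: it assumes necessity decisions are a mapping from unit name to a send/withhold flag, and the function and parameter names are invented for illustration.

```python
import random

def inject_errors(decisions: dict[str, bool], fp_rate: float,
                  fn_rate: float, seed: int = 0) -> dict[str, bool]:
    """Perturb necessity decisions with controlled error rates.

    A false positive sends a unit that was marked unnecessary (privacy cost);
    a false negative withholds a unit that was marked necessary (utility cost).
    """
    rng = random.Random(seed)  # seeded so each ablation run is repeatable
    noisy = {}
    for unit, send in decisions.items():
        rate = fn_rate if send else fp_rate
        noisy[unit] = (not send) if rng.random() < rate else send
    return noisy

# Illustrative decisions: True means the unit is sent to the cloud.
clean = {"specialty": True, "city": True, "home_address": False}
only_fp = inject_errors(clean, fp_rate=1.0, fn_rate=0.0)
only_fn = inject_errors(clean, fp_rate=0.0, fn_rate=1.0)
```

Sweeping `fp_rate` and `fn_rate` independently and re-measuring leakage and task success at each setting would show how sensitive each reported metric is to each error type.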

Circularity Check

0 steps flagged

No circularity: empirical results rest on external CLM evaluations

The paper's central claims consist of measured performance differences (0.0% vs 17.7% profile leakage, 23.1% vs 64.3% re-identification) obtained by running PrivScope on 100 medical-booking workflows against three commercial CLMs and five local backbones. These outcomes are direct experimental observations rather than quantities derived from fitted parameters, self-citations, or equations that reduce to the inputs by construction. The cloud-necessity control is presented as an implemented component whose accuracy is assessed via the same external-task-success and leakage metrics; no load-bearing uniqueness theorem, ansatz, or renaming of known results is invoked. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Abstract-only review limits visibility into exact assumptions; the design implicitly relies on accurate unit extraction and necessity determination being feasible on-device.

axioms (1)

domain assumption: Local device can extract disclosure units and apply necessity control without external cloud assistance
Core to the on-device governor design described in the abstract.

invented entities (1)

cloud-necessity control (no independent evidence)
purpose: Determines minimal information needed for delegated subtask
New component introduced to decide what reaches the cloud

pith-pipeline@v0.9.0 · 5857 in / 1297 out tokens · 48935 ms · 2026-05-20T16:12:10.125974+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "PRIVSCOPE realizes this through a lightweight on-device pipeline that extracts disclosure units from the LC-assembled payload and keeps direct identifiers and account-linked values on device... Units that must reach the cloud are abstracted to the least-specific representation sufficient for the task."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: "We evaluate PRIVSCOPE on 100 medical-booking information-seeking workflows... eliminates profile leakage (0.0% vs. 17.7%)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

52 extracted references · 52 canonical work pages · 6 internal anchors

[1] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in neural information processing systems, vol. 33, pp. 1877–1901, 2020
[2] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever et al., “Language models are unsupervised multitask learners,” OpenAI blog, vol. 1, no. 8, p. 9, 2019
[3] OpenAI, “Introducing Operator,” https://openai.com/index/introducing-operator/, 2025, [Online; accessed 25-Sep-2025]
[4] Autogpt, “Empower your digital tasks with AutoGPT,” https://agpt.co/, 2025, [Online; accessed 25-Sep-2025]
[5] AGENTGPT, “Automate Your Business with AgentGPT,” https://agentgpt.io/, 2024, [Online; accessed 25-Sep-2025]
[6] Y. Wu, K. Yang, F. Roesner, T. Kohno, N. Zhang, and U. Iqbal, “Towards automating data access permissions in ai agents,” arXiv preprint arXiv:2511.17959, 2025
[7] Kaggle, “Agents,” https://www.kaggle.com/whitepaper-agents, 2025, [Online; accessed 25-Sep-2025]
[8] N. Malkin, D. Wagner, and S. Egelman, “Runtime permissions for privacy in proactive intelligent assistants,” in Eighteenth Symposium on Usable Privacy and Security (SOUPS 2022), 2022, pp. 633–651
[9] Q. Zhang, M. Wornow, and K. Olukotun, “Agentic plan caching: Test-time memory for fast and cost-efficient llm agents,” in The Thirty-ninth Annual Conference on Neural Information Processing Systems
[10] S. Li, H. Wang, W. Xu, R. Zhang, S. Guo, J. Yuan, X. Zhong, T. Zhang, and R. Li, “Collaborative inference and learning between edge slms and cloud llms: A survey of algorithms, execution, and open challenges,” arXiv preprint arXiv:2507.16731, 2025
[11] R. Staab, M. Vero, M. Balunovic, and M. Vechev, “Beyond memorization: Violating privacy via inference with large language models,” in The Twelfth International Conference on Learning Representations, 2023
[12] X. Sun, G. Liu, Z. He, H. Li, and X. Li, “Deprompt: Desensitization and evaluation of personal identifiable information in large language model prompts,” arXiv preprint arXiv:2408.08930, 2024
[13] N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson et al., “Extracting training data from large language models,” in 30th USENIX security symposium (USENIX Security 21), 2021, pp. 2633–2650
[14] C.-J. Wu, R. Raghavendra, U. Gupta, B. Acun, N. Ardalani, K. Maeng, G. Chang, F. Aga, J. Huang, C. Bai et al., “Sustainable ai: Environmental implications, challenges and opportunities,” Proceedings of machine learning and systems, vol. 4, pp. 795–813, 2022
[15] Y. Akhauri, A. Fei, C.-C. Chang, A. F. AbouElhamayed, Y. Li, and M. S. Abdelfattah, “Splitreason: Learning to offload reasoning,” arXiv preprint arXiv:2504.16379, 2025
[16] K. Zhang, J. Wang, E. Hua, B. Qi, N. Ding, and B. Zhou, “Cogenesis: A framework collaborating large and small language models for secure context-aware instruction following,” arXiv preprint arXiv:2403.03129, 2024
[17] A. S. Engineering and A. (SEAR), “Private Cloud Compute: A new frontier for AI privacy in the cloud,” https://security.apple.com/documentation/private-cloud-compute, 2024, [Online; accessed 28-Oct-2025]
[18] Z. Liu, C. Zhao, F. Iandola, C. Lai, Y. Tian, I. Fedorov, Y. Xiong, E. Chang, Y. Shi, R. Krishnamoorthi et al., “Mobilellm: Optimizing sub-billion parameter language models for on-device use cases,” in Forty-first International Conference on Machine Learning, 2024
[19] OpenAI, “Pricing Flagship Model,” https://developers.openai.com/api/docs/pricing, 2025, [Online; accessed 25-Sep-2025]
[20] M. Presidio, “Presidio: Data Protection and De-identification SDK,” https://microsoft.github.io/presidio/, 2025, [Online; accessed 21-April-2026]
[21] O. Feyisetan, B. Balle, T. Drake, and T. Diethe, “Privacy- and utility-preserving textual analysis via calibrated multivariate perturbations,” in Proceedings of the 13th international conference on web search and data mining, 2020, pp. 178–186
[22] Y. Chen, T. Li, H. Liu, and Y. Yu, “Hide and seek (has): A lightweight framework for prompt privacy protection,” arXiv preprint arXiv:2309.03057, 2023
[23] S. Zhou, F. F. Xu, H. Zhu, X. Zhou, R. Lo, A. Sridhar, X. Cheng, T. Ou, Y. Bisk, D. Fried et al., “Webarena: A realistic web environment for building autonomous agents,” arXiv preprint arXiv:2307.13854, 2023
[24] A. Zharmagambetov, C. Guo, I. Evtimov, M. Pavlova, R. Salakhutdinov, and K. Chaudhuri, “Agentdam: Privacy leakage evaluation for autonomous web agents,” arXiv preprint arXiv:2503.09780, 2025
[25] Y. Shao, T. Li, W. Shi, Y. Liu, and D. Yang, “Privacylens: Evaluating privacy norm awareness of language models in action,” Advances in Neural Information Processing Systems, vol. 37, pp. 89 373–89 407, 2024
[26] S. Ghalebikesabi, E. Bagdasaryan, R. Yi, I. Yona, I. Shumailov, A. Pappu, C. Shi, L. Weidinger, R. Stanforth, L. Berrada et al., “Operationalizing contextual integrity in privacy-conscious assistants,” arXiv preprint arXiv:2408.02373, 2024
[27] Y. Wu, F. Roesner, T. Kohno, N. Zhang, and U. Iqbal, “Isolategpt: An execution isolation architecture for llm-based agentic systems,” arXiv preprint arXiv:2403.04960, 2024
[28] E. Bagdasarian, R. Yi, S. Ghalebikesabi, P. Kairouz, M. Gruteser, S. Oh, B. Balle, and D. Ramage, “Airgapagent: Protecting privacy-conscious conversational agents,” in Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, 2024, pp. 3868–3882
[29] H. Ma, W. Lu, Y. Liang, T. Wang, Q. Zhang, Y. Zhu, and J. Si, “Alsa: Context-sensitive prompt privacy preservation in large language models,” in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 2, 2025, pp. 2042–2053
[30] Z. Shen, Z. Xi, Y. He, W. Tong, J. Hua, and S. Zhong, “The fire thief is also the keeper: Balancing usability and privacy in prompts,” arXiv preprint arXiv:2406.14318, 2024
[31] Z. Zeng, J. Wang, J. Yang, Z. Lu, H. Li, H. Zhuang, and C. Chen, “Privacyrestore: Privacy-preserving inference in large language models via privacy removal and restoration,” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025, pp. 10 821–10 855
[32] S. Kim, S. Yun, H. Lee, M. Gubri, S. Yoon, and S. J. Oh, “Propile: Probing privacy leakage in large language models,” Advances in Neural Information Processing Systems, vol. 36, pp. 20 750–20 762, 2023
[33] X. Hou, J. Liu, J. Li, Y. Li, W.-j. Lu, C. Hong, and K. Ren, “Ciphergpt: Secure two-party gpt inference,” Cryptology ePrint Archive, 2023
[34] M. Hao, H. Li, H. Chen, P. Xing, G. Xu, and T. Zhang, “Iron: Private inference on transformers,” Advances in neural information processing systems, vol. 35, pp. 15 718–15 731, 2022
[35] X. Li, Z. Yin, X. Gu, and B. Shen, “Anti-adversarial learning: Desensitizing prompts for large language models,” arXiv preprint arXiv:2505.01273, 2025
[36] N. Mireshghallah, H. Kim, X. Zhou, Y. Tsvetkov, M. Sap, R. Shokri, and Y. Choi, “Can llms keep a secret? testing privacy implications of language models via contextual integrity theory,” arXiv preprint arXiv:2310.17884, 2023
[37] H. Nissenbaum, “Privacy as contextual integrity,” Wash. L. Rev., vol. 79, p. 119, 2004
[38] Z. Cheng, D. Wan, M. Abueg, S. Ghalebikesabi, R. Yi, E. Bagdasarian, B. Balle, S. Mellem, and S. O’Banion, “Ci-bench: Benchmarking contextual integrity of ai assistants on synthetic data,” arXiv preprint arXiv:2409.13903, 2024
[39] SpaCy, “Industrial-Strength Natural Language Processing,” https://spacy.io/, 2025, [Online; accessed 25-Sep-2025]
[40] Q. Dong, L. Li, D. Dai, C. Zheng, J. Ma, R. Li, H. Xia, J. Xu, Z. Wu, B. Chang et al., “A survey on in-context learning,” in Proceedings of the 2024 conference on empirical methods in natural language processing, 2024, pp. 1107–1128
[41]

Can generalist foundation models outcompete special-purpose tuning? case study in medicine,

H. Nori, Y . T. Lee, S. Zhang, D. Carignan, R. Edgar, N. Fusi, N. King, J. Larson, Y . Li, W. Liuet al., “Can generalist foundation models outcompete special-purpose tuning? case study in medicine,”arXiv preprint arXiv:2311.16452, 2023

work page arXiv 2023
[42]

Confidence in the reasoning of large language models,

Y . Pawitan and C. Holmes, “Confidence in the reasoning of large language models,”Harvard Data Science Review, vol. 7, no. 1, pp. 2644–2353, 2025

work page 2025
[43]

Ship agents that wow,

LangChain, “Ship agents that wow,” https://www.langchain.com/, 2025, [Online; accessed 21-April-2026]

work page 2025
[44]

The easiest way to build with open models,

Ollama, “The easiest way to build with open models,” https://ollama. com/, 2025, [Online; accessed 25-Sep-2025]

work page 2025
[45]

The Llama 3 Herd of Models

A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Vaughanet al., “The llama 3 herd of models,”arXiv preprint arXiv:2407.21783, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[46]

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

M. Abdin, S. A. Jacobs, A. A. Awan, J. Aneja, A. Awadallah, H. Awadalla, N. Bach, A. Bahree, A. Bakhtiari, H. Behlet al., “Phi- 3 technical report: A highly capable language model locally on your phone,”arXiv preprint arXiv:2404.14219, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[47]

Mistral 7B

A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnieret al., “Mistral 7b,”arXiv preprint arXiv:2310.06825, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[48]

Qwen3 Technical Report

A. Yang, A. Li, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Gao, C. Huang, C. Lvet al., “Qwen3 technical report,”arXiv preprint arXiv:2505.09388, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[49]

GPT-4o mini: advancing cost-efficient intelligence,

OpenAi, “GPT-4o mini: advancing cost-efficient intelligence,” https: //openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/, 2025, [Online; accessed 6-May-2026]

[50] Claude, “Claude Haiku 4.5,” https://www.anthropic.com/claude/haiku, 2025, [Online; accessed 6-May-2026]

[51] G. Comanici, E. Bieber, M. Schaekermann, I. Pasupat, N. Sachdeva, I. Dhillon, M. Blistein, O. Ram, D. Zhang, E. Rosen et al., “Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities,” arXiv preprint arXiv:2507.06261, 2025

[52] Claude, “Pricing,” https://platform.claude.com/docs/en/about-claude/pricing, 2025, [Online; accessed 5-May-2026]

APPENDIX

A. Methodology details

Algorithm 1 summarizes the end-to-end flow, Algorithm 2 expands the payload-mediation procedure, and Table III summarizes the notation used throughout this section.

TABLE III: Notation used in the PRIVSCOPE de...
