
arxiv: 2605.26542 · v1 · pith:2DBH3L6Wnew · submitted 2026-05-26 · 💻 cs.CR · cs.AI

ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation

Pith reviewed 2026-06-29 17:30 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords tool-using agents · permission laundering · capability attenuation · composition safety · information flow control · MCP proxy

The pith

ChainCaps stops permission laundering by intersecting sink-specific capability budgets during tool composition so authority never increases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Tool-using agents can chain permitted tools to produce unsafe end results such as leaking summaries of private files. ChainCaps assigns each value a budget of allowed sinks and propagates budgets by intersection, so any composed result can only lose or keep authority. The mechanism runs as a transparent proxy that needs no changes to the agent or the tool servers. On 82 tasks across five frontier models the system lowers attack success from 25-68 percent to 0-4.8 percent while keeping 96-100 percent of benign tasks intact. Manifest quality determines how much of the protection is realized.

Core claim

ChainCaps enforces monotonic capability attenuation: every value carries a sink-specific budget, composition takes the intersection of budgets, and therefore a value can preserve or lose authority but cannot gain new authority through any sequence of tools.
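
A minimal sketch of the intersection rule, assuming budgets can be modelled as plain sets of sink names. The sink names and the helpers `compose` and `allowed` are hypothetical illustrations, not the paper's API.

```python
from functools import reduce

# Hypothetical sink vocabulary; the paper's actual sink names are not given here.
ALL_SINKS = frozenset({"display", "http_send", "file_write"})

def compose(*budgets):
    """Budget of a value derived from several inputs: the intersection of their budgets."""
    return reduce(frozenset.intersection, budgets, ALL_SINKS)

def allowed(budget, sink):
    """A value may reach a sink only if that sink is still in its budget."""
    return sink in budget

salary = frozenset({"display"})  # confidential: may only be shown to the user
news = ALL_SINKS                 # public: may go to any sink

summary = compose(salary, news)

print(sorted(summary))                # ['display']
print(allowed(summary, "display"))    # True
print(allowed(summary, "http_send"))  # False
```

Because set intersection can only remove elements, `summary` is a subset of every input budget, which is the "preserve or lose, never gain" property in miniature.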

What carries the argument

Sink-specific capability budget attached to each value and reduced by intersection on every tool step.

If this is right

  • Attack success falls from 25-68 percent to 0-4.8 percent on the tested tasks.
  • Benign completion stays between 96 and 100 percent.
  • The method outperforms scalar-IFC and per-function isolation baselines in replay experiments.
  • Expert manifests reach 100 percent attack blocking while naive manifests reach only 27.3 percent.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Better automated manifest generation would reduce the main deployment bottleneck.
  • The same intersection rule could be tested on implicit flows if the proxy gains visibility into internal agent state.
  • Attenuation budgets could be applied to other composition points such as multi-agent hand-offs.

Load-bearing premise

Manifests correctly list the permissions of every tool and the proxy can see every data movement between tools.
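
To see why the premise is load-bearing, the sketch below runs one chain under two hypothetical manifests. It assumes a "naive" manifest is one that over-declares where a tool's output may go; the source does not define the term, and the tool names and manifest shape are invented for illustration.

```python
ALL_SINKS = frozenset({"display", "http_send"})

# Hypothetical manifests: tool name -> sinks that the tool's output may reach.
expert_manifest = {"read_hr_file": frozenset({"display"}), "summarize": ALL_SINKS}
naive_manifest = {"read_hr_file": ALL_SINKS, "summarize": ALL_SINKS}

def budget_after_chain(manifest, chain):
    """Intersect the declared budgets of every tool a value passed through."""
    budget = ALL_SINKS
    for tool in chain:
        budget &= manifest[tool]
    return budget

chain = ["read_hr_file", "summarize"]
expert_budget = budget_after_chain(expert_manifest, chain)
naive_budget = budget_after_chain(naive_manifest, chain)

print("http_send" in expert_budget)  # False: the outbound send would be blocked
print("http_send" in naive_budget)   # True: intersection has nothing to remove
```

The intersection rule is the same in both runs; only the declared permissions differ, so the rule can protect no more than the manifest states.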

What would settle it

An agent completing a laundering sequence such as reading a confidential file, summarizing it, and sending the summary to an external sink despite the proxy, or a legitimate task failing solely because of budget intersection.

Figures

Figures reproduced from arXiv: 2605.26542 by Haoran Yu, Lifei Liu, Shiqi Yang, Xiaochong Jiang, Yichen Liu, Ziwei Li.

Figure 1: Budget propagation example. A summary combining salary data (display-only) and public news inherits the most restrictive budget via intersection. Because the resulting budget permits display but not HTTP sending, the outbound call is blocked while user display is allowed. This monotonic attenuation is the core runtime property of ChainCaps. where op names an effectful operation such as http send, file wri…
Figure 2: ChainCaps proxy architecture. The proxy intercepts every tools/call between the LLM agent and tool server. Steps 1–2 resolve argument dependencies and compute $B_{\mathrm{agg}} = B_{\mathrm{ctx}} \cap \bigcap_{x \in D} B(x)$. Step 3 checks whether $\mathrm{Req}(t, a) \in B_{\mathrm{agg}}$; if not, it verifies a lineage-bound declassification token before either forwarding (step 4) or blocking (step 4', dashed). On the response path, step 5 propagates B(y) = Pass(t) ∩ Bag…
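
The steps in the Figure 2 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the class `Proxy`, its fields, and the tool names are hypothetical, Req(t, a) is simplified to one sink per tool, the declassification token is reduced to a set lookup, and the truncated step-5 formula is assumed to intersect Pass(t) with the aggregate budget.

```python
from dataclasses import dataclass, field

@dataclass
class Proxy:
    pass_budget: dict          # tool -> Pass(t): sinks its output may reach (from the manifest)
    required_sink: dict        # tool -> Req(t, a), simplified to a single sink per tool
    context_budget: frozenset  # B_ctx
    budgets: dict = field(default_factory=dict)  # value id -> B(x)
    tokens: set = field(default_factory=set)     # (value id, sink) pairs holding a token

    def call(self, tool, dep_ids, out_id):
        # Steps 1-2: intersect the context budget with every dependency's budget.
        b_agg = self.context_budget
        for x in dep_ids:
            b_agg &= self.budgets[x]
        # Step 3: the sink this call needs must be in the aggregate budget,
        # unless every dependency holds a declassification token for that sink.
        req = self.required_sink.get(tool)
        if req is not None and req not in b_agg:
            if not dep_ids or not all((x, req) in self.tokens for x in dep_ids):
                return "blocked"  # step 4'
        # Step 5 (assumed form): the result inherits Pass(t) intersected with B_agg.
        self.budgets[out_id] = self.pass_budget[tool] & b_agg
        return "forwarded"  # step 4

ALL = frozenset({"display", "http_send"})
proxy = Proxy(
    pass_budget={"read_salary": frozenset({"display"}), "summarize": ALL,
                 "http_post": ALL, "show": ALL},
    required_sink={"http_post": "http_send", "show": "display"},
    context_budget=ALL,
)
r1 = proxy.call("read_salary", [], "doc")
r2 = proxy.call("summarize", ["doc"], "summary")
r3 = proxy.call("http_post", ["summary"], "resp")
r4 = proxy.call("show", ["summary"], "shown")
print(r1, r2, r3, r4)  # forwarded forwarded blocked forwarded
```

The read and summarize calls pass because they require no sink; the outbound post is blocked because `http_send` was dropped from the summary's budget when the salary file was read.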
Figure 3: Attack success rate across five frontier models. Without defense, ASR ranges from 25% to 68%. With ChainCaps, all tested models fall to ≤5% ASR (Qwen 3.5 reaches 0%), corresponding to an 86–100% relative reduction on this stress-test suite.
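
The relative reduction quoted in the Figure 3 caption is presumably computed per model in the usual way; the formula below is the standard definition, not one quoted from the paper, and the subscript names are illustrative.

```latex
\[
\text{relative reduction}
  = \frac{\mathrm{ASR}_{\text{undefended}} - \mathrm{ASR}_{\text{ChainCaps}}}
         {\mathrm{ASR}_{\text{undefended}}}
\]
```

Under this definition, a model whose defended ASR is 0% scores a 100% reduction whatever its undefended rate, which is consistent with the upper end of the reported range.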
Original abstract

Tool-using agents increasingly operate in open-ended deployment environments, where they compose file systems, web APIs, code interpreters, and enterprise services at runtime. This creates a safety gap in tool composition: an agent can satisfy every per-tool permission check and still produce an unsafe end-to-end effect, such as reading a confidential document, summarizing it, and sending the summary to an external endpoint. We call this failure mode permission laundering. ChainCaps addresses it with a runtime rule: every value carries a sink-specific capability budget, and tool composition propagates budgets by intersection. A value can preserve or lose authority as it moves through a tool chain, but it cannot gain new authority through composition. We implement ChainCaps as a transparent MCP proxy that requires no changes to the agent or tool servers. On 82 tasks across five frontier models from three providers, ChainCaps reduces attack success rate from 25-68% to 0-4.8% while preserving 96-100% benign completion. In replay experiments, it also outperforms scalar-IFC and per-function-isolation baselines. Manifest quality is the dominant deployment bottleneck: expert manifests reach 100% attack blocking, while naive manifests fall to 27.3%. Our claims are limited to explicit-flow composition safety under trusted manifests and proxy-visible data movement, a practical gap in deployed tool-using agents today.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper introduces ChainCaps to mitigate permission laundering in tool-composing agents. It assigns sink-specific capability budgets to values and propagates them via intersection during tool chains, ensuring authority cannot increase through composition. Implemented as a transparent MCP proxy, the system is evaluated on 82 tasks across five frontier models, reducing attack success rates from 25-68% to 0-4.8% while retaining 96-100% benign completion rates. It outperforms scalar-IFC and per-function-isolation baselines, with manifest quality identified as the primary practical bottleneck. All claims are explicitly scoped to explicit-flow composition safety under trusted manifests and proxy-visible data movement.

Significance. If the results hold under the stated scope, this provides a practical, deployable mechanism for addressing composition safety gaps in tool-using agents, supported by concrete empirical evaluation across multiple models and tasks plus baseline comparisons. The clear scoping of claims and identification of manifest quality as the dominant bottleneck are strengths that enhance the work's utility for practitioners.

minor comments (1)
  1. The abstract and evaluation summary would benefit from a brief statement on how the 82 tasks were selected and categorized to allow readers to assess coverage of composition patterns.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the detailed and positive review, including the clear summary of our contributions, the assessment of significance, and the recommendation to accept. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

The paper describes a runtime capability propagation rule (intersection on budgets) and evaluates it empirically on 82 tasks across five models, reporting attack success reductions and benign completion rates against external baselines. No equations, predictions, or first-principles derivations are presented that reduce by construction to fitted inputs, self-definitions, or self-citation chains. Manifest quality is identified as an external bottleneck with explicit experimental comparison (expert vs. naive). The work is scoped to explicit-flow safety under trusted manifests and contains no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The approach rests on the domain assumption of trusted manifests; the capability budget is a new invented entity with no independent evidence provided beyond the system design.

axioms (1)
  • domain assumption: Trusted manifests and proxy-visible data movement suffice for safety claims
    Explicitly stated as the scope of the claims in the abstract.
invented entities (1)
  • sink-specific capability budget (no independent evidence)
    purpose: To enforce monotonic attenuation during tool composition
    New construct introduced to track and intersect authority

pith-pipeline@v0.9.1-grok · 6869 in / 1058 out tokens · 59734 ms · 2026-06-29T17:30:42.975451+00:00 · methodology

