Pith Number

pith:K3E4H3J4

pith:2026:K3E4H3J4BAQGCFTOR4TMUGZE3H
not attested · not anchored · not stored · refs resolved

AgentTrap: Measuring Runtime Trust Failures in Third-Party Agent Skills

Hanwen Xing, Haomin Zhuang, Xiangliang Zhang, Yili Shen, Yuchen Ma, Yue Huang, Yufei Han, Yujun Zhou

LLM agents often finish the user's visible request while executing unsafe side effects from third-party skills as if they were normal workflow steps.

arxiv:2605.13940 v1 · 2026-05-13 · cs.CR · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{K3E4H3J4BAQGCFTOR4TMUGZE3H}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Models often complete the visible user task while treating unsafe side effects introduced by the skill as part of the normal workflow.

C2 · weakest assumption

That the 141 hand-crafted tasks and sandboxed execution environment faithfully represent the diversity and stealth of real-world malicious third-party skills without introducing evaluation artifacts.

C3 · one-line summary

AgentTrap shows that current LLM agents typically complete user tasks while silently accepting unsafe side effects from malicious third-party skills rather than refusing them.

References

13 extracted · 13 resolved · 8 Pith anchors

[1] AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents · arXiv:2410.09024
[2] Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study · arXiv:2604.03070
[3] AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents · arXiv:2406.13352
[4] Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis · arXiv:2604.02837
[5] Identifying the Risks of LM Agents with an LM-Emulated Sandbox · arXiv:2309.15817
Receipt and verification
First computed: 2026-05-17T23:39:13.870101Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0

Aliases

arxiv: 2605.13940 · arxiv_version: 2605.13940v1 · doi: 10.48550/arxiv.2605.13940 · pith_short_12: K3E4H3J4BAQG · pith_short_16: K3E4H3J4BAQGCFTO · pith_short_8: K3E4H3J4
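
The aliases above are consistent with a simple derivation. In this record, the short forms are prefixes of the full Pith Number, and the full Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. This is an observation about this one record, not a documented rule of the pith-number/v1.0 schema, and the helper name `pith_number_from_hash` is hypothetical. A minimal Python sketch under that assumption:

```python
import base64

# Canonical hash as published on this record.
CANONICAL_HASH = "56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this rule is inferred from this record only, not from a
    published specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(CANONICAL_HASH)

# The short aliases appear to be plain prefixes of the full identifier.
short_aliases = {n: full[:n] for n in (8, 12, 16)}

print(full)           # K3E4H3J4BAQGCFTOR4TMUGZE3H
print(short_aliases)  # {8: 'K3E4H3J4', 12: 'K3E4H3J4BAQG', 16: 'K3E4H3J4BAQGCFTO'}
```

If the relationship holds in general, a mirror could check that an identifier and a canonical hash belong together without any network call. Treat it as a sanity check rather than a substitute for the signature.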
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/K3E4H3J4BAQGCFTOR4TMUGZE3H \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "052f86cb119bd0f739111a24767ccec66e20007d73197810e54526f02bb15f69",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2026-05-13T17:04:17Z",
    "title_canon_sha256": "5377d80ac6b95d15395f667b4f3bc9fcd7ade0a9bf6f191a2dba0cc9858b33ee"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13940",
    "kind": "arxiv",
    "version": 1
  }
}
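
The verification one-liner can also be run offline against a saved copy of the record. What follows is a minimal Python sketch of the same canonicalization (sorted keys, compact separators, UTF-8, SHA-256). The function name `canonical_sha256` is hypothetical, and the sketch assumes the JSON shown on this page is exactly what the API returns under `canonical_record`, which the page itself does not confirm.

```python
import hashlib
import json

# Copy of the canonical record as displayed on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "052f86cb119bd0f739111a24767ccec66e20007d73197810e54526f02bb15f69",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2026-05-13T17:04:17Z",
    "title_canon_sha256": "5377d80ac6b95d15395f667b4f3bc9fcd7ade0a9bf6f191a2dba0cc9858b33ee"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.13940", "kind": "arxiv", "version": 1}
}
"""

# Published value to compare against.
EXPECTED = "56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and no extra whitespace, then hash the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)

print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because `sort_keys=True` fixes the key order and the separators remove optional whitespace, the digest does not depend on how the saved copy was indented or ordered. A mismatch would suggest that the saved copy differs from the record the service hashed.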