Pith Number

pith:K3E4H3J4

pith:2026:K3E4H3J4BAQGCFTOR4TMUGZE3H

not attested · not anchored · not stored · refs resolved

AgentTrap: Measuring Runtime Trust Failures in Third-Party Agent Skills

Hanwen Xing, Haomin Zhuang, Xiangliang Zhang, Yili Shen, Yuchen Ma, Yue Huang, Yufei Han, Yujun Zhou

LLM agents often finish the user's visible request while executing unsafe side effects from third-party skills as if they were normal workflow steps.

arxiv:2605.13940 v1 · 2026-05-13 · cs.CR · cs.AI


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{K3E4H3J4BAQGCFTOR4TMUGZE3H}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
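The merge algorithm itself is not described on this page. As a rough illustration of what "deterministic" means here, the sketch below folds a set of events into a state in a canonical order, so that two mirrors receiving the same events in different orders arrive at the same result. The event fields (`at`, `field`, `value`), the sample events, and the last-writer-wins rule are hypothetical and chosen only for the example; they are not the Pith event schema.

```python
import hashlib
import json
from typing import Any


def event_id(event: dict[str, Any]) -> str:
    """Content hash of an event, used as a stable tie-breaker."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def merge(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state in a canonical order, ignoring arrival order."""
    # Keying by content hash drops exact duplicates.
    unique = {event_id(e): e for e in events}
    # Canonical order: timestamp first, then content hash to break ties.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state: dict[str, Any] = {}
    for _, event in ordered:
        # Hypothetical rule: the last writer in canonical order wins.
        state[event["field"]] = event["value"]
    return state


# Made-up events, for illustration only.
events = [
    {"at": "2026-05-17T23:39:13Z", "field": "refs", "value": "resolved"},
    {"at": "2026-05-18T08:00:00Z", "field": "anchored", "value": False},
    {"at": "2026-05-18T08:00:00Z", "field": "anchored", "value": True},
]

# Arrival order and duplicates do not change the merged state.
assert merge(events) == merge(list(reversed(events)) + events)
```

The design point is that ordering depends only on the events' own content, never on when or from whom a mirror received them.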

Claims

C1 strongest claim

Models often complete the visible user task while treating unsafe side effects introduced by the skill as part of the normal workflow.

C2 weakest assumption

That the 141 hand-crafted tasks and sandboxed execution environment faithfully represent the diversity and stealth of real-world malicious third-party skills without introducing evaluation artifacts.

C3 one line summary

AgentTrap shows that current LLM agents typically complete user tasks while silently accepting unsafe side effects from malicious third-party skills rather than refusing them.

References

13 extracted · 13 resolved · 8 Pith anchors

[1] AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents · arXiv:2410.09024

[2] Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study · arXiv:2604.03070

[3] AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents · arXiv:2406.13352

[4] Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis · arXiv:2604.02837

[5] Identifying the Risks of LM Agents with an LM-Emulated Sandbox · arXiv:2309.15817

Receipt and verification

First computed	2026-05-17T23:39:13.870101Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0

Aliases

arxiv: 2605.13940 · arxiv_version: 2605.13940v1 · doi: 10.48550/arxiv.2605.13940 · pith_short_12: K3E4H3J4BAQG · pith_short_16: K3E4H3J4BAQGCFTO · pith_short_8: K3E4H3J4
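The short aliases above are prefixes of the full 26-character identifier. The identifier on this page also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The page does not document this as the official derivation, so treat the sketch below as an observation about this record rather than a specification. The function name `pith_number` is made up for the example.

```python
import base64

# Canonical hash as printed on this page.
CANONICAL_HASH = "56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0"


def pith_number(sha256_hex: str, length: int = 26) -> str:
    """First `length` characters of the base32 (RFC 4648) encoding of the digest."""
    encoded = base64.b32encode(bytes.fromhex(sha256_hex)).decode("ascii")
    return encoded[:length]


full = pith_number(CANONICAL_HASH)
# The short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)     # K3E4H3J4BAQGCFTOR4TMUGZE3H
print(aliases)
```

For this record the result matches the identifier and the `pith_short_8`, `pith_short_12`, and `pith_short_16` aliases listed above.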

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

# Fetch the record, extract canonical_record, re-serialize it as compact sorted-key JSON, and hash the bytes.
curl -sH 'Accept: application/ld+json' https://pith.science/pith/K3E4H3J4BAQGCFTOR4TMUGZE3H \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "052f86cb119bd0f739111a24767ccec66e20007d73197810e54526f02bb15f69",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2026-05-13T17:04:17Z",
    "title_canon_sha256": "5377d80ac6b95d15395f667b4f3bc9fcd7ade0a9bf6f191a2dba0cc9858b33ee"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13940",
    "kind": "arxiv",
    "version": 1
  }
}
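
The same check can be run offline against the record printed above. The sketch below is a minimal version that reuses the serialization from the curl one-liner (sorted keys, compact separators, UTF-8). It assumes the JSON shown on this page is the complete canonical record; if the builder serializes any byte differently, the comparison will print `False`. The helper name `canonical_sha256` is made up for the example.

```python
import hashlib
import json

# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "052f86cb119bd0f739111a24767ccec66e20007d73197810e54526f02bb15f69",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CR",
        "submitted_at": "2026-05-13T17:04:17Z",
        "title_canon_sha256": "5377d80ac6b95d15395f667b4f3bc9fcd7ade0a9bf6f191a2dba0cc9858b33ee",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13940", "kind": "arxiv", "version": 1},
}

EXPECTED = "56c9c3ed3c082061166e8f26ca1b24d9e7302d7a13ccaf4f4a59d20496829aa0"


def canonical_sha256(obj) -> str:
    """SHA-256 of the compact, key-sorted, UTF-8 JSON serialization."""
    blob = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches canonical hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the order in which the record's fields are written does not affect the digest.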