pith:G5ZA4AWU
Counterfactual Trace Auditing of LLM Agent Skills
Counterfactual Trace Auditing shows skills reshape LLM agent behavior in 522 specific ways even when pass rates change by less than one percent.
arxiv:2605.11946 v2 · 2026-05-12 · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{G5ZA4AWUYVUVZ6ZQ6TQ423M67R}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
CTA identifies 522 Skill Influence Pattern (SIP) instances across the same paired traces, showing that the skills substantially reshape agent behavior even when pass rate is nearly unchanged.
The approach assumes that manual or automated segmentation of traces into goal-directed phases, together with the alignment of those phases, produces unbiased, reproducible Skill Influence Pattern annotations that faithfully capture the causal effect of the skill.
CTA framework detects 522 skill influence patterns in LLM agent traces across 49 tasks where average pass rate shifts only +0.3%, exposing evaluation gaps in behavioral effects like template copying and excess planning.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-06-01T01:02:42.443559Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
37720e02d4c5695cfb30f4e1cd6d9efc5c0c816b60eed6886361888cf4eb114e
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/G5ZA4AWUYVUVZ6ZQ6TQ423M67R \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 37720e02d4c5695cfb30f4e1cd6d9efc5c0c816b60eed6886361888cf4eb114e
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ca48521c969a2f9f7610fcbb3d91ae89a343431b732d80679848ab6bd2397466",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-12T10:56:18Z",
    "title_canon_sha256": "11dc2659358d72a662e51468a6ff5521fee2d46ef44f79680d985008eaed3799"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.11946",
    "kind": "arxiv",
    "version": 2
  }
}
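
The verification one-liner above can also be run offline against a saved copy of the canonical record. The following is a minimal sketch, assuming the canonicalization is exactly what the one-liner does (sorted keys, compact separators, UTF-8 without ASCII escaping, then SHA-256). The function names `canonical_bytes`, `canonical_hash`, and `matches` are hypothetical helpers introduced for illustration and are not part of any Pith tooling.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record the same way the one-liner does:
    sorted keys, no whitespace between tokens, raw UTF-8."""
    return json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()


def canonical_hash(record: dict) -> str:
    """Return the lowercase hex SHA-256 of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def matches(record_json: str, expected_hex: str) -> bool:
    """Compare a JSON document's canonical hash with a published hash (hypothetical helper)."""
    return canonical_hash(json.loads(record_json)) == expected_hex.lower()


# Key order and whitespace in the input do not affect the result,
# because both are normalized before hashing.
a = '{"source": {"id": "2605.11946", "kind": "arxiv", "version": 2}, "schema_version": "1.0"}'
b = '{ "schema_version": "1.0", "source": {"version": 2, "kind": "arxiv", "id": "2605.11946"} }'
assert canonical_hash(json.loads(a)) == canonical_hash(json.loads(b))
```

To check the full record, pass the canonical record JSON text and the published canonical hash to `matches`. A `False` result means either the record changed or the assumed canonicalization differs from the one the builder used.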