pith:6IV7I75E
Beyond the Black Box: Interpretability of Agentic AI Tool Use
A toolkit of sparse autoencoders and linear probes can identify the internal features that drive tool-use decisions inside AI agents before they act.
arxiv:2605.06890 v2 · 2026-05-07 · cs.AI · cs.MA
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{6IV7I75EBOVETGY7TKL7GUAVOV}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
By decomposing activations into sparse features, the toolkit identifies the internal layers and features most associated with tool decisions and tests their functional importance through feature ablation.
The analysis assumes that the sparse features extracted by SAEs and the predictions from linear probes correspond to causally relevant internal representations of tool-use decisions rather than spurious correlations.
A mechanistic interpretability toolkit with SAEs and probes enables pre-action inference of tool decisions in AI agents trained on function-calling trajectories.
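As an illustration of what "testing functional importance through feature ablation" can mean, here is a minimal sketch in plain Python. It is not the paper's implementation: the tiny weight matrices, the single ReLU encoder layer, the linear probe, and all function names (`sae_encode`, `sae_decode`, `probe_logit`, `ablation_effect`) are assumptions made for this example. The idea is to encode an activation into sparse features, zero out one feature, decode, and measure how much a probe's tool-decision logit changes.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sae_encode(x: Vector, w_enc: Matrix, b_enc: Vector) -> Vector:
    """Sparse feature activations: ReLU(W_enc x + b_enc)."""
    pre = matvec(w_enc, x)
    return [max(0.0, p + b) for p, b in zip(pre, b_enc)]


def sae_decode(f: Vector, w_dec: Matrix) -> Vector:
    """Reconstruct the activation from features: W_dec f."""
    return matvec(w_dec, f)


def probe_logit(x: Vector, w_probe: Vector, b_probe: float) -> float:
    """Linear probe score for a hypothetical 'call a tool' decision."""
    return sum(w * xi for w, xi in zip(w_probe, x)) + b_probe


def ablation_effect(x: Vector, w_enc: Matrix, b_enc: Vector, w_dec: Matrix,
                    w_probe: Vector, b_probe: float, k: int) -> float:
    """Drop in probe logit when feature k is zeroed before decoding."""
    f = sae_encode(x, w_enc, b_enc)
    ablated = [0.0 if i == k else v for i, v in enumerate(f)]
    base = probe_logit(sae_decode(f, w_dec), w_probe, b_probe)
    after = probe_logit(sae_decode(ablated, w_dec), w_probe, b_probe)
    return base - after


# Toy, hand-picked weights (hypothetical): 2-dim activation, 3 features.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B_ENC = [0.0, 0.0, -1.0]
W_DEC = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
W_PROBE = [1.0, -1.0]
B_PROBE = 0.0

if __name__ == "__main__":
    x = [2.0, 1.0]
    for k in range(3):
        effect = ablation_effect(x, W_ENC, B_ENC, W_DEC, W_PROBE, B_PROBE, k)
        print(f"feature {k}: ablation effect on probe logit = {effect}")
```

In this toy setup a large effect for a feature suggests the probe's decision depends on it, while an effect near zero suggests the feature is irrelevant to that decision. Whether such an effect reflects a causal role in a real model is exactly the assumption stated in the claims above.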
Receipt and verification
| First computed | 2026-05-22T01:03:19.955494Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
f22bf47fa40baa499b1f9a97f350157571fb79107dc989c4b39993cc3cac4990
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/6IV7I75EBOVETGY7TKL7GUAVOV \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: f22bf47fa40baa499b1f9a97f350157571fb79107dc989c4b39993cc3cac4990
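The one-liner above can also be written as a small standalone script. The sketch below mirrors the same canonicalization the command uses (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding, SHA-256); the function names `canonical_sha256` and `matches` are introduced here for illustration and are not part of any Pith tooling. It assumes you have already fetched the `canonical_record` object and parsed it into a Python dict.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the shell one-liner does."""
    payload = json.dumps(
        record,
        sort_keys=True,           # key order must not affect the hash
        separators=(",", ":"),    # no whitespace between tokens
        ensure_ascii=False,       # keep non-ASCII characters as UTF-8
    ).encode()                    # default encoding is UTF-8
    return hashlib.sha256(payload).hexdigest()


def matches(record: dict, expected_hex: str) -> bool:
    """Compare the computed digest with a published canonical hash."""
    return canonical_sha256(record) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Two dicts with the same content but different key order hash identically.
    a = {"schema_version": "1.0", "source": {"kind": "arxiv", "version": 2}}
    b = {"source": {"version": 2, "kind": "arxiv"}, "schema_version": "1.0"}
    print(canonical_sha256(a) == canonical_sha256(b))
```

Because the keys are sorted and whitespace is removed before hashing, the digest depends only on the record's content, not on how the JSON happened to be formatted when it was served.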
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "7bc38ef75e145719bbda9943dbdd8387887e39843ae163fd3d16e1e9ed558bb4",
"cross_cats_sorted": [
"cs.MA"
],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.AI",
"submitted_at": "2026-05-07T19:47:30Z",
"title_canon_sha256": "9395ffb1953fc09f81fb0ddd480e4d129d1955cd4e21c6ca6ccc7d41dd643e8d"
},
"schema_version": "1.0",
"source": {
"id": "2605.06890",
"kind": "arxiv",
"version": 2
}
}