Pith Number

pith:MX7CKMOA

pith:2026:MX7CKMOA7XVUF477NIM2IBHTCI
not attested · not anchored · not stored · refs resolved

How to Interpret Agent Behavior

Daniel Khashabi, Heyuan Huang, Jen-tse Huang, Jie Gao, Kaiser Sun, Katherine Van Koevering, Mark Dredze, Sijie Ji, Weiyan Shi, Zhuoran Lu, Ziang Xiao

ACTONOMY taxonomy structures agent behavior into 10 actions and 120 categories for consistent interpretation.

arxiv:2605.13625 v1 · 2026-05-13 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{MX7CKMOA7XVUF477NIM2IBHTCI}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Our experiments show that ACTONOMY can compare behavioral profiles across agents and characterize a single agent's behavior across diverse trajectories, surfacing patterns indicative of failure modes.

C2 weakest assumption

That a taxonomy developed via Grounded Theory on a limited set of agent traces will remain comprehensive and unbiased when applied to new agents, tasks, and longer trajectories without substantial revision.

C3 one line summary

ACTONOMY is a Grounded-Theory-derived hierarchical taxonomy and open repository that enables systematic comparison and characterization of autonomous agent behavior across trajectories.

References

64 extracted · 64 resolved · 13 Pith anchors

[1] J. R. Anderson and C. Lebiere. The Newell test for a theory of cognition. Behavioral and Brain Sciences, 26(5):587–601, 2003
[2] Computational Linguistics , volume = 2022 · doi:10.1162/coli
[3] V. P. Bhardwaj. Agentassay: Token-efficient regression testing for non-deterministic AI agent workflows, 2026. URL https://zenodo.org/doi/10.5281/zenodo.18842011 · doi:10.5281/zenodo.18842011
[4] Measuring Progress on Scalable Oversight for Large Language Models 2022 · arXiv:2211.03540
[5] V. Braun and V. Clarke. Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2):77–101, 2006
Receipt and verification
First computed: 2026-05-18T02:44:17.827927Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

65fe2531c0fdeb42f3ff6a19a404f3121e84d0c863f372fb86336848a651dc2f

Aliases

arxiv: 2605.13625 · arxiv_version: 2605.13625v1 · doi: 10.48550/arxiv.2605.13625 · pith_short_12: MX7CKMOA7XVU · pith_short_16: MX7CKMOA7XVUF477 · pith_short_8: MX7CKMOA
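
For the values shown on this page, the full Pith Number matches the unpadded Base32 (RFC 4648) encoding of the first 16 bytes of the canonical hash, and each `pith_short_N` alias is the first N characters of that number. The sketch below reproduces this relationship. Treating it as the general derivation rule is an assumption drawn from this one record, not a documented specification, and the function names are hypothetical.

```python
import base64

# Canonical hash exactly as listed on this page.
CANONICAL_HASH = "65fe2531c0fdeb42f3ff6a19a404f3121e84d0c863f372fb86336848a651dc2f"


def pith_number_from_hash(hex_digest: str) -> str:
    """Base32-encode the first 16 bytes of a hex digest, dropping '=' padding.

    Assumption: this mirrors how the identifier on this page relates to its
    canonical hash; it is not taken from an official description.
    """
    prefix = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(prefix).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Build the short aliases as plain prefixes of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


number = pith_number_from_hash(CANONICAL_HASH)
print(number)  # MX7CKMOA7XVUF477NIM2IBHTCI
print(short_aliases(number))
```

Because the short forms are prefixes, a shorter alias identifies fewer bits of the hash than the full 26-character number.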
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/MX7CKMOA7XVUF477NIM2IBHTCI \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 65fe2531c0fdeb42f3ff6a19a404f3121e84d0c863f372fb86336848a651dc2f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "6c023ff3ab7fcba1a82b2fb3cd5c1b7b20d8f34a480d8e4a8a1e563d64253f0d",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-13T14:52:40Z",
    "title_canon_sha256": "24f750ded8e0ac1b824ce62503a6fc18b9627aa001265b439c94738eae709819"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13625",
    "kind": "arxiv",
    "version": 1
  }
}
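
The verification command above serializes the canonical record with sorted keys and compact separators before hashing it with SHA-256. A minimal offline sketch of the same step in Python follows, using the record printed on this page. The function name `canonical_sha256` is hypothetical, and the sketch assumes the JSON shown here is exactly what the endpoint returns as `canonical_record`.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the one-liner in the verify command does:
    sorted keys, compact separators, non-ASCII kept as-is, UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "6c023ff3ab7fcba1a82b2fb3cd5c1b7b20d8f34a480d8e4a8a1e563d64253f0d",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-13T14:52:40Z",
        "title_canon_sha256": "24f750ded8e0ac1b824ce62503a6fc18b9627aa001265b439c94738eae709819",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13625", "kind": "arxiv", "version": 1},
}

print(canonical_sha256(record))
```

Compare the printed digest with the canonical hash listed on this page. Sorting the keys makes the result independent of the order in which fields appear in the JSON, which is what lets a mirror recompute the same value.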