pith:L7D72N3N
CausalReasoningBenchmark: A Real-World Benchmark for Disentangled Evaluation of Causal Identification and Estimation
A benchmark of 173 real-world queries scores causal identification and numerical estimation separately to diagnose AI failures in causal analysis.
arxiv:2602.20571 v2 · 2026-02-24 · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{L7D72N3NLDDETL3LVORIJACM4X}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
By scoring these two components separately, our benchmark enables granular diagnosis: it distinguishes failures in causal reasoning from errors in numerical execution.
The ground-truth identification specifications and estimates extracted from the 79 source papers and three textbooks are accurate and complete enough to serve as reliable labels for the 173 queries.
CausalReasoningBenchmark supplies 173 real-world queries that separately grade causal identification specifications and point estimates to expose distinct failure modes in automated causal systems.
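The separate-scoring idea in the claims above can be illustrated with a small sketch. This is not the benchmark's actual scoring code: the `Result` fields, the relative tolerance, and the label strings are all hypothetical, chosen only to show how grading the two components independently lets a failure be attributed to identification or to estimation.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical fields; the real benchmark's schema is not given on this page.
    identification_correct: bool  # did the identification spec match the label?
    estimate: float               # point estimate produced by the system
    reference_estimate: float     # ground-truth point estimate


def diagnose(r: Result, rel_tol: float = 0.1) -> str:
    """Attribute a failure to identification or estimation (assumed rule)."""
    # Assumed tolerance rule: estimate within rel_tol of the reference value.
    gap = abs(r.estimate - r.reference_estimate)
    estimate_ok = gap <= rel_tol * abs(r.reference_estimate)
    if not r.identification_correct:
        # Wrong strategy: a close estimate would be coincidental.
        return "identification_error"
    if not estimate_ok:
        # Right strategy, wrong number.
        return "estimation_error"
    return "correct"


print(diagnose(Result(True, 2.0, 1.0)))
print(diagnose(Result(False, 1.0, 1.0)))
```

A single combined score would mark both of these cases as failures without saying which stage went wrong; checking the identification flag first is one possible design choice, since an estimate built on a wrong identification strategy is not meaningful even when it lands near the reference.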
Receipt and verification
| First computed | 2026-05-17T23:39:04.417543Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
5fc7fd376d58c649af6baba284804ce5c590df7186c0712f5c400bd04a096d4d
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/L7D72N3NLDDETL3LVORIJACM4X \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5fc7fd376d58c649af6baba284804ce5c590df7186c0712f5c400bd04a096d4d
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "e2d6378955fdea69a379f372eb8c84bb114ae5363eba723996a3bf3aa595d9e5",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-02-24T05:44:25Z",
    "title_canon_sha256": "72c206eaa9bfed5145e51a2dc55dcdae3634fd7613d3f706ccd14a368f041f98"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2602.20571",
    "kind": "arxiv",
    "version": 2
  }
}
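The shell check above can also be run offline against the record as printed here. The following is a minimal sketch that mirrors the same canonicalization as the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes, SHA-256). The names `RECORD`, `EXPECTED`, and `canonical_hash` are illustrative, and it assumes the JSON shown on this page is byte-for-byte the same as what the API returns in `canonical_record`.

```python
import hashlib
import json

# Canonical record copied from this page (assumed identical to the API response).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "e2d6378955fdea69a379f372eb8c84bb114ae5363eba723996a3bf3aa595d9e5",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-02-24T05:44:25Z",
        "title_canon_sha256": "72c206eaa9bfed5145e51a2dc55dcdae3634fd7613d3f706ccd14a368f041f98",
    },
    "schema_version": "1.0",
    "source": {"id": "2602.20571", "kind": "arxiv", "version": 2},
}

# Canonical hash as listed on this page.
EXPECTED = "5fc7fd376d58c649af6baba284804ce5c590df7186c0712f5c400bd04a096d4d"


def canonical_hash(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_hash(RECORD)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the order in which fields appear in the source JSON does not affect the digest; only the keys and values themselves do.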