pith:QEGTRZSK
From Program Slices to Causal Clarity: Evaluating Faithful, Actionable LLM-Generated Failure Explanations via Context Partitioning and LLM-as-a-Judge
Varying the composition of debugging context causally changes the quality of LLM-generated failure explanations, with targeted artifacts yielding better causal and actionable insights than large undifferentiated contexts.
arxiv:2604.18309 v2 · 2026-04-20 · cs.SE
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{QEGTRZSKM7TPLSA53H77I25ZDT}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our results indicate that explanation quality is causally affected by context composition. Evidence-rich, failure-specific artifacts improve causal and action-oriented quality, whereas overly large contexts tend to yield vague explanations. Higher explanation-score quartiles are associated with higher downstream repair pass rates and, for some models, with fixes that are closer to the reference minimal fixes.
These claims rest on two assumptions: that the six evaluation criteria and LLM-as-a-judge scores faithfully reflect true causal and actionable quality, and that the 93 context configurations plus the chosen real bugs are representative enough to support general claims about context effects.
Focused, failure-specific contexts such as program slices produce more causal and actionable LLM bug explanations than large undifferentiated contexts, and higher-quality explanations correlate with better downstream repair success rates.
Receipt and verification
| First computed | 2026-05-21T01:05:19.330590Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
810d38e64a67e6f5c81dd9fff46bb91cf94f839b823c465c08b967516ccff1b5
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/QEGTRZSKM7TPLSA53H77I25ZDT \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 810d38e64a67e6f5c81dd9fff46bb91cf94f839b823c465c08b967516ccff1b5
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "896a0b51accb7ad6369fa5b7d6290a7ebfc7890c62a0e1bb4461c5c0e9330fb3",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.SE",
    "submitted_at": "2026-04-20T14:16:39Z",
    "title_canon_sha256": "cd7a2731dfd4788c8ba1214696a1ffbeff5d0e1a99fda2a2c02ccd7b0e4b98d8"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.18309",
    "kind": "arxiv",
    "version": 2
  }
}
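The same check can be run offline against a saved copy of the record. The sketch below mirrors the serialization used in the verification command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and hashes the result with SHA-256. The function name `canonical_sha256` and the inline `record` literal are illustrative assumptions, not part of any Pith tooling; the resulting digest should be compared by hand against the canonical hash listed on this page.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes.

    Hypothetical helper: it reproduces the json.dumps settings from the
    verification one-liner and is not an official Pith API.
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "896a0b51accb7ad6369fa5b7d6290a7ebfc7890c62a0e1bb4461c5c0e9330fb3",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.SE",
        "submitted_at": "2026-04-20T14:16:39Z",
        "title_canon_sha256": "cd7a2731dfd4788c8ba1214696a1ffbeff5d0e1a99fda2a2c02ccd7b0e4b98d8",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.18309", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash shown above
```

Because keys are sorted before hashing, the order in which fields appear in the saved file does not affect the digest; any change to a value does.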