arXiv preprint arXiv:1802.07810, 2018
4 Pith papers cite this work.
verdicts
UNVERDICTED 4
representative citing papers
citing papers explorer
- Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation
  CodeQ aggregates token rationales into code categories to enable global interpretability of LLMs, claiming over 50% entropy reduction and revealing model preference for syntactic cues plus human misalignment in a 37-person study.
- A Human-Grounded Evaluation of SHAP for Alert Processing
  Human-grounded evaluation finds no significant performance improvement from adding SHAP explanations to model confidence scores in alert processing.
- Interpretable Question Answering on Knowledge Bases and Text
  Compares LIME, input perturbation and attention for explaining QA on KB+text; proposes automatic evaluation paradigm and finds input perturbation superior in both automatic and human studies.
- Unexplainability and Incomprehensibility of Artificial Intelligence
  Advanced AI systems are unexplainable in full and produce explanations that humans cannot comprehend.