Agentic-imodels evolves scikit-learn regressors via an autoresearch loop to jointly boost predictive performance and LLM-simulatability, improving downstream agentic data science tasks by up to 73% on the BLADE benchmark.
Towards A Rigorous Science of Interpretable Machine Learning
Canonical reference. 71% of citing Pith papers cite this work as background.
abstract
As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.
representative citing papers
ISAAC auditing applied to three DTI models on the Davis benchmark finds 25% relative differences in causal reasoning scores despite nearly identical AUROC values.
In-context symbolic regression methods make symbolic formula recovery from Kolmogorov-Arnold Networks (KANs) more robust, cutting median OFAT test MSE by up to 99.8% across hyperparameter sweeps.
Qualitative study of 19 practitioners reveals ten LLM product evaluation practices and introduces the results-actionability gap as a key barrier to turning findings into improvements.
A training-free method optimizes Fourier-parameterized star-convex contours via gradients to generate compact, faithful visual attributions for image classifiers on benchmarks such as ImageNet.
A method automatically constructs a causal model from behavior tree structure and domain knowledge to generate real-time causal counterfactual explanations for robot decisions.
SAE-NOs extend sparse autoencoders to function spaces via Fourier neural operators with concept and domain sparsity, learning localized patterns more efficiently and generalizing across discretizations on vision data.
MIMIC is a new inversion framework that recovers visual concepts from VLM internal states using joint inversion, feature alignment, and three regularizers.
Chain-of-thought explanations in LLMs are frequently unfaithful: models systematically omit mention of biasing prompt features that change their answers and instead produce rationalizations for those biased outputs.
AI models misalign with humans on concept boundaries when probed with implausible category members, such as classifying words as vehicles or vegetables as fruit.
p-ResNet-50 adds a prototype layer with anchor- and medoid-based regularizations to ResNet-50, achieving ROC-AUC 0.994 and accuracy 0.957 on ~12k XCT patches while supplying case-based explanations aligned to expert categories.
The authors introduce a taxonomy with target, functional role, and mode of justification axes plus a framework that decomposes abstract XAI desiderata into concrete benchmarkable tasks via identified dependency structures.
An entropy criterion on mean representations characterises the polarised regime in VAEs and related models, with theoretical links to KL minimisation and empirical tests across several architectures.
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
Interpretability research should be judged by actionability—the degree to which its insights support concrete decisions and interventions—rather than explanatory power alone.
AI deployment in high-stakes areas requires domain-scoped calibrated verification with monitoring and revocation, using a proposed six-component Verification Coverage standard instead of mechanistic interpretability.
ShifaMind achieves competitive performance with the LAAT baseline on MIMIC-IV top-50 ICD-10 coding while outperforming vanilla concept bottleneck models and providing concept-mediated explanations.
The authors introduce the XAI Evaluation Card template to standardize how XAI evaluation metrics are defined, validated, and reported.
NEURON raises AUC from 0.74-0.77 to 0.84-0.88 on MIMIC-IV heart-failure mortality prediction while lifting human-aligned explanation scores from 0.50 to 0.85 by grounding SHAP values in SNOMED CT and patient notes via RAG-LLM.
In high-stakes settings, Shapley explanations increase analyst confidence but do not improve decision accuracy, and standard metrics fail to predict human utility.
A four-year mixed-methods study of game-based systems for Indian CHWs yields eight design guidelines for sustained engagement, learning transfer, and contextual appropriateness in low-resource health training.
X-SYS is a reference architecture for interactive explanation systems organized around STAR quality attributes and five service components, demonstrated via SemanticLens for vision-language models.
FaVeX accelerates verified explanations for neural networks via dynamic batch-sequential processing and query reuse while introducing verifier-optimal robust explanations that incorporate verifier incompleteness.
A new framework combines AI-derived concept embeddings with high-dimensional selective inference to enable statistically principled, interpretable discovery from unstructured data in empirical economics.
citing papers explorer
- Design Guidelines for Game-Based Refresher Training of Community Health Workers in Low-Resource Contexts
  A four-year mixed-methods study of game-based systems for Indian CHWs yields eight design guidelines for sustained engagement, learning transfer, and contextual appropriateness in low-resource health training.
- Evaluating the False Trust Engendered by LLM Explanations
  LLM reasoning traces and post-hoc explanations increase false trust in incorrect predictions, whereas contrastive dual explanations enhance users' ability to distinguish correct from incorrect AI outputs.
- From Awareness to Intent: Mitigating Silent Driving System Failures through Prospective Situation Awareness Enhancing Interfaces
  Prospective situation awareness enhancing interfaces delivered via AR HUD improve takeover performance after silent automation failures, with perceptual cues most effective at raising situational awareness and system-intent messages best at building trust.
- Why Johnny Can't Use Agents: Industry Aspirations vs. User Realities with AI Agents
  Industry markets AI agents for orchestration, creation, and insight, but a usability study with 31 participants reveals users face challenges from capability misalignment and lack of meta-cognition in tools like Operator and Manus.
- From Trust to Appropriate Reliance: Measurement Constructs in Human-AI Decision-Making
  A literature review shows that constructs for appropriate reliance on AI are fragmented, presents three views on the topic, and calls for consensus on objective metrics to enable better comparisons across studies.
- Evaluating Physician-AI Interaction for Cancer Management: Paving the Path towards Precision Oncology
  A within-subjects study with 32 physicians using a web-based CDSS for 12 synthetic multiple myeloma scenarios found over-reliance on ML outputs when discordant with RCT evidence and poor retention of model validation details.