Title resolution pending
8 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1)
Citation polarities: background (1)
citing papers explorer
- Can we trust LLM Self-Explanations for Entity Resolution?
  LLM self-explanations for entity resolution are unstable and weakly faithful to causal evidence, but a hybrid framework using them as priors matches post-hoc quality at up to 10x lower cost.
- What LLMs explain is not what they believe: Evaluating explanation sufficiency under models' own input beliefs
  Proposes SCSuff metric for evaluating LLM explanation sufficiency via model-generated alternative inputs, showing explanations are typically insufficient and predictable from hidden states.
- Quantifying the Agreement Between Data-Influence and Data-Similarity to Understand LLM Behavior
  Data-similarity and data-influence produce significantly overlapping rankings of training documents for LLM outputs, with asymmetry allowing a favorable cost-accuracy trade-off.
- Fundamental Limitation in Explaining AI
  A claimed mathematical proof establishes a quadrilemma showing that complex environments, high AI performance, interpretable explanations, and complete faithfulness cannot coexist.
- BoolXLLM: LLM-Assisted Explainability for Boolean Models
  BoolXLLM augments an existing Boolean rule learner with LLMs for feature selection, discretization thresholds, and natural-language rule translation to improve interpretability while preserving accuracy.
- Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
  MLLMs achieve competitive but subhuman performance on the new VSI-Bench for visual-spatial intelligence from videos, with spatial reasoning as the main bottleneck and explicit cognitive map generation improving distance estimation.
- A Systematic Comparison between Extractive Self-Explanations and Human Rationales in Text Classification
  Self-explanations from LLMs produce faithful token subsets for correct predictions but align with human rationales only conditionally on text length and task complexity, unlike post-hoc attribution methods that highlight structural tokens.
- From Actions to Understanding: Conformal Interpretability of Temporal Concepts in LLM Agents