A training-free method improves epistemic faithfulness of LLM textual explanations by guiding generation with attribution-based attention interventions.
Attnlrp: attention-aware layer-wise relevance propagation for transformers
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
baseline 1polarities
baseline 1representative citing papers
Saliency-R1 uses a novel saliency map technique and GRPO with human bounding-box overlap as reward to improve VLM reasoning faithfulness and interpretability.
CA-LIG is a unified hierarchical attribution method that computes layer-wise Integrated Gradients fused with class-specific attention gradients to generate signed, context-sensitive explanations for transformer models.
An attribution-based continual learning framework for LLMs modulates per-parameter gradients using task-specific importance scores to reduce forgetting of prior tasks.
citing papers explorer
-
Faithfulness Serum: Mitigating the Faithfulness Gap in Textual Explanations of LLM Decisions via Attribution Guidance
A training-free method improves epistemic faithfulness of LLM textual explanations by guiding generation with attribution-based attention interventions.
-
Saliency-R1: Enforcing Interpretable and Faithful Vision-language Reasoning via Saliency-map Alignment Reward
Saliency-R1 uses a novel saliency map technique and GRPO with human bounding-box overlap as reward to improve VLM reasoning faithfulness and interpretability.
-
Explainable AI: Context-Aware Layer-Wise Integrated Gradients for Explaining Transformer Models
CA-LIG is a unified hierarchical attribution method that computes layer-wise Integrated Gradients fused with class-specific attention gradients to generate signed, context-sensitive explanations for transformer models.
-
Attribution-Guided Continual Learning for Large Language Models
An attribution-based continual learning framework for LLMs modulates per-parameter gradients using task-specific importance scores to reduce forgetting of prior tasks.