Gradient based feature attribution in explainable AI: A technical review

2024 · arXiv 2403.10415

5 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting

stat.ML · 2026-04-24 · unverdicted · novelty 6.0

WassersteinGrad aggregates perturbed gradient attribution maps via their entropic Wasserstein barycenter to avoid blurring from geometric shifts in explanations of autoregressive weather forecasts.

AttnTrace: Contextual Attribution of Prompt Injection and Knowledge Corruption

cs.CL · 2025-08-05 · unverdicted · novelty 6.0

AttnTrace is an attention-weight-based context traceback method for LLMs that claims higher accuracy and efficiency than prior art like TracLLM while aiding prompt injection detection.

Explainable AI Isn't Enough! Rethinking Algorithmic Contestability

stat.ML · 2026-05-15 · unverdicted · novelty 5.0

The paper defines algorithmic contestability as identifying evidence to overturn potentially incorrect decisions and identifies three types of such evidence that make decisions normatively indefensible under the decision maker's standards.

Evaluating the Temporal Detection Capability of Integrated Gradients Applied on Sound Classifier

eess.AS · 2026-05-22 · unverdicted · novelty 4.0

Integrated gradients on a 10-class domestic sound classifier yields 0.39 mean IoU, 0.52 frame F1 and 82.6% Pointing Game accuracy for temporal event detection, approaching weakly and strongly supervised framewise CNN baselines.

Safety Must Precede the Deployment of Open-Ended AI

cs.AI · 2025-02-06

citing papers explorer


Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting · stat.ML · 2026-04-24 · unverdicted · none · ref 50
AttnTrace: Contextual Attribution of Prompt Injection and Knowledge Corruption · cs.CL · 2025-08-05 · unverdicted · none · ref 59
Explainable AI Isn't Enough! Rethinking Algorithmic Contestability · stat.ML · 2026-05-15 · unverdicted · none · ref 66
Evaluating the Temporal Detection Capability of Integrated Gradients Applied on Sound Classifier · eess.AS · 2026-05-22 · unverdicted · none · ref 5
Safety Must Precede the Deployment of Open-Ended AI · cs.AI · 2025-02-06 · unreviewed · ref 90
