IMPACTX: improving model performance by appropriately constraining the training with teacher explanations

Andrea Apicella; Andrea Pollastro; Francesco Isgrò; Roberto Prevete; Salvatore Giugliano

arxiv: 2502.12222 · v2 · submitted 2025-02-17 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-23 02:58 UTC · model grok-4.3

keywords explainable AI · attention mechanism · deep learning · model performance · feature attribution · image classification · training constraint

The pith

Integrating XAI outputs as an attention mechanism during training improves deep learning model performance and supplies built-in feature attribution maps at inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents IMPACTX as a method that turns XAI explanations into an automated attention signal used to constrain training of deep learning models. This process requires no external knowledge or human feedback and is tested on EfficientNet-B2, MobileNet, and LeNet-5 across CIFAR-10, CIFAR-100, and STL-10. Results indicate higher accuracy than standard training while the models themselves generate appropriate feature attribution maps without any post-training XAI step. The central idea is that the XAI signal acts as a training regularizer that both lifts performance and embeds explanation capability directly into the learned weights.

Core claim

IMPACTX uses XAI method outputs as a fully automated attention mechanism integrated into the training loop, yielding improved performance over standalone models on standard image classification benchmarks and directly producing proper feature attribution maps for decisions at inference time without external XAI methods.

What carries the argument

The IMPACTX attention mechanism, which injects XAI-generated feature attribution maps as a training constraint to guide optimization.

If this is right

All three tested models show higher accuracy on all three datasets when trained with the XAI attention signal.
The resulting models generate feature attribution maps during inference without calling any external XAI procedure.
The approach requires no human feedback or domain-specific external knowledge to operate.
The same training modification applies uniformly to EfficientNet-B2, MobileNet, and LeNet-5.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Deployed systems could satisfy explanation requirements without maintaining separate XAI modules after training.
The method suggests treating explanation quality as an explicit training objective rather than a post-hoc check.
If XAI methods improve for other data types, the same attention integration could be tested on non-image tasks.
Models trained this way might exhibit different robustness properties because the optimization is explicitly tied to attribution consistency.

Load-bearing premise

The XAI method used to create the attention signal produces maps accurate and stable enough to help training rather than add noise or bias.
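
One way to probe this premise is to measure how much two attribution maps for the same input agree, for example maps computed under different seeds or small input perturbations. The sketch below is illustrative only: the function names `topk_iou` and `spearman`, the 20% top-pixel fraction, and the synthetic maps are assumptions, not anything taken from the paper.

```python
import numpy as np

def topk_iou(map_a, map_b, frac=0.2):
    """IoU between the top-`frac` highest-magnitude pixels of two attribution maps."""
    a = np.abs(map_a).ravel()
    b = np.abs(map_b).ravel()
    k = max(1, int(round(frac * a.size)))
    top_a = set(np.argsort(a)[-k:].tolist())
    top_b = set(np.argsort(b)[-k:].tolist())
    return len(top_a & top_b) / len(top_a | top_b)

def spearman(map_a, map_b):
    """Spearman rank correlation between two maps (no tie correction)."""
    ra = np.argsort(np.argsort(map_a.ravel())).astype(float)
    rb = np.argsort(np.argsort(map_b.ravel())).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra * rb).sum() / np.sqrt((ra ** 2).sum() * (rb ** 2).sum()))

# Synthetic stand-ins for two attribution maps of the same image (hypothetical data).
rng = np.random.default_rng(0)
base = rng.normal(size=(32, 32))
noisy = base + 0.1 * rng.normal(size=(32, 32))

print("IoU:", topk_iou(base, noisy))
print("Spearman:", spearman(base, noisy))
```

Values near 1 on both measures would suggest the maps are stable enough to act as a consistent training signal; low values would suggest the signal is closer to noise. Neither measure says anything about faithfulness, which needs a separate check.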

What would settle it

Training a model with IMPACTX on CIFAR-10 or STL-10 and measuring either lower accuracy than the baseline model or attribution maps judged inappropriate by standard evaluation would falsify the performance and explanation claims.
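
To make such a comparison concrete, accuracies from matched runs (same seed, same data split) can be compared with a paired test, as the simulated rebuttal further down proposes. The following is a minimal standard-library sketch; the accuracy values are hypothetical placeholders and are not results from the paper, and `paired_t` is an illustrative helper name.

```python
import math
import statistics

def paired_t(baseline, treated):
    """Return (mean difference, t statistic, degrees of freedom) for paired runs."""
    if len(baseline) != len(treated) or len(baseline) < 2:
        raise ValueError("need two equal-length samples with at least 2 runs")
    diffs = [t - b for b, t in zip(baseline, treated)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences identical; t statistic undefined")
    t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    return mean_diff, t_stat, len(diffs) - 1

# Hypothetical per-seed test accuracies (placeholders, NOT taken from the paper).
baseline_acc = [0.70, 0.72, 0.69, 0.71, 0.70]
constrained_acc = [0.72, 0.73, 0.72, 0.72, 0.73]

mean_diff, t_stat, dof = paired_t(baseline_acc, constrained_acc)
print(f"mean diff={mean_diff:.4f}, t={t_stat:.2f}, dof={dof}")
```

The t statistic is then compared against a t distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.ttest_rel` or `scipy.stats.wilcoxon` return the p-value directly.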

Figures

Figures reproduced from arXiv: 2502.12222 by Andrea Apicella, Andrea Pollastro, Francesco Isgrò, Roberto Prevete, Salvatore Giugliano.

**Figure 1.** An overview of the IMPACTX framework. In the training phase of IMPACTX, both M and LEP receive x, generating m and z respectively. These are combined for classification by C. In particular, LEP and D exploit the R …

**Figure 2.** The Decoder architecture designed for the CIFAR-10 and CIFAR-100 datasets. The architecture is composed of convolutional (Conv2D), fully connected (FC), and UpSampling layers. The kernel size is 3 × 3 for all the convolutional layers, while the number of filters is given by the third dimension of the output shape.

**Figure 3.** Images from the CIFAR-10 test set. The images have been filtered for better visualisation.

**Figure 4.** Images from the CIFAR-100 test set. The images have been filtered for better visualisation.

**Figure 5.** Images from the STL-10 test set. The images have been filtered for better visualisation.

Excerpt from the source (Section 5.2, Evaluating attribution maps): In this section we want to evaluate if the attribution maps directly obtained by IMPACTX can be considered as explanations of the IMPACTX classification responses. To this aim, we compare them with the explanations given by R (that is, in this experimental setup, SHAP) outputs when applied on IMPACTX itself. In figures 3, 4 and 5 (left side), examples from the CIFAR-10, CIFAR-100, and STL-10 test sets are presented (column 1). The examples are reported considering the experiments made on LeNet-5. For each example, the …
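
The data flow named in the Figure 1 caption (x goes to both M and LEP, their outputs m and z are combined, and C classifies) can be sketched with toy linear layers. This is a shape-level illustration under stated assumptions, not the authors' architecture: the caption is truncated, so reading "combined" as concatenation and reading D as a decoder from z to an attribution-sized map are both assumptions, as are all layer sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random linear layer standing in for a real network (hypothetical)."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda v: v @ w

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Assumed sizes: 4 flattened 32x32 RGB images, 10 classes.
n, d_in, d_m, d_z, n_classes = 4, 3 * 32 * 32, 64, 16, 10

M = linear(d_in, d_m)              # x -> m
LEP = linear(d_in, d_z)            # x -> z
C = linear(d_m + d_z, n_classes)   # classifies the combined [m, z] (assumed concatenation)
D = linear(d_z, d_in)              # z -> attribution-sized map (assumed role of D)

x = rng.normal(size=(n, d_in))
m, z = M(x), LEP(x)
probs = softmax(C(np.concatenate([m, z], axis=1)))
r_hat = D(z)  # map the model itself can emit at inference, without calling R

print(probs.shape, r_hat.shape)
```

Under this reading, the external method R (SHAP in the paper's experimental setup) would only be needed to produce training targets for `r_hat`, which is consistent with the claim that no external XAI step runs at inference.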

Original abstract

The eXplainable Artificial Intelligence (XAI) research predominantly concentrates to provide explainations about AI model decisions, especially Deep Learning (DL) models. However, there is a growing interest in using XAI techniques to automatically improve the performance of the AI systems themselves. This paper proposes IMPACTX, a novel approach that leverages XAI as a fully automated attention mechanism, without requiring external knowledge or human feedback. Experimental results show that IMPACTX has improved performance respect to the standalone ML model by integrating an attention mechanism based an XAI method outputs during the model training. Furthermore, IMPACTX directly provides proper feature attribution maps for the model's decisions, without relying on external XAI methods during the inference process. Our proposal is evaluated using three widely recognized DL models (EfficientNet-B2, MobileNet, and LeNet-5) along with three standard image datasets: CIFAR-10, CIFAR-100, and STL-10. The results show that IMPACTX consistently improves the performance of all the inspected DL models across all evaluated datasets, and it directly provides appropriate explanations for its responses.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

IMPACTX turns XAI outputs into a training-time attention signal for CNNs on image data and claims consistent accuracy gains plus built-in explanations, but the abstract supplies almost no implementation or validation details.

The one thing to know is that this paper takes post-hoc XAI maps and feeds them back as an attention constraint while training standard CNNs, reporting better accuracy than the plain models plus free feature attributions at test time. They run the idea on EfficientNet-B2, MobileNet, and LeNet-5 using CIFAR-10, CIFAR-100, and STL-10, and state that the gains appear across all three architectures and all three datasets. That multi-model, multi-dataset check is the main empirical content and is a fair baseline for this kind of work.

The framing as a fully automated attention mechanism that needs no human input or extra modules at inference is a straightforward extension of existing explanation-guided training papers, so the headline claim counts as new relative to the cited literature even if it is not a first-principles advance. The approach is simple enough that a practitioner in computer vision who already uses XAI tools might try it if the code were released.

The soft spots are exactly where the stress-test note flags them. The abstract gives no equation for how the XAI output becomes an attention weight or loss term, no name for the XAI method, no ablation that removes the XAI signal, and no check that the maps are faithful or stable rather than noisy. Without those controls it is impossible to tell whether any reported lift comes from the claimed mechanism or from generic regularization effects. Statistical significance, error bars, and computational overhead are also missing. Because the full manuscript is referenced but the provided text stays at the abstract level, the central empirical claim remains unverified.

This paper is aimed at people already working on XAI-augmented training or attention mechanisms in image classification. A reader who wants a quick empirical test of the idea on standard benchmarks could extract some value once the method section is expanded.

I would send it to peer review because the evaluation setup is conventional and the idea is concrete enough that referees could usefully ask for the missing ablations and loss details rather than reject outright.

Referee Report

3 major / 3 minor

Summary. The manuscript proposes IMPACTX, a method that integrates outputs from an XAI technique as an automated attention mechanism to constrain training of deep learning models (EfficientNet-B2, MobileNet, LeNet-5), claiming consistent performance gains over baseline models on CIFAR-10, CIFAR-100, and STL-10 while also producing built-in feature attribution maps at inference without external XAI methods. The approach requires no human feedback or external knowledge.

Significance. If the claimed improvements are robust and attributable to the XAI attention rather than generic regularization, the work could contribute to the growing area of using explanations to enhance model training itself, providing both performance benefits and inherent interpretability. The absence of map-fidelity validation and statistical details in the reported experiments substantially weakens the current evidence for this contribution.

major comments (3)

[Experimental Evaluation] Experimental Evaluation (results paragraphs): The central claim of consistent performance improvement across all models and datasets is asserted without reported statistical significance tests, error bars, number of random seeds/runs, or ablation studies that isolate the XAI attention component from other training modifications.
[Methods] Methods / Training Procedure: No validation is provided that the chosen XAI method produces faithful, stable attention maps (e.g., no comparison against ground-truth attributions, no stability analysis across seeds or perturbations), which is required to attribute gains to the claimed mechanism rather than map noise or bias.
[Methods] Methods: The exact loss formulation, how the XAI-derived attention is mathematically incorporated into the training objective, and the specific XAI technique employed are not specified, preventing assessment of whether the constraint is parameter-free or introduces new hyperparameters.

minor comments (3)

[Abstract] Abstract: Typos: 'respect to' should read 'with respect to'; 'based an XAI method outputs' should read 'based on XAI method outputs'.
[Abstract] Abstract: 'explanations' is misspelled as 'explainations' in the first sentence.
The manuscript would benefit from a clearer statement of the precise integration point of the attention signal (e.g., which layer or loss term) to aid reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript accordingly to strengthen the experimental rigor and methodological clarity.

Referee: [Experimental Evaluation] Experimental Evaluation (results paragraphs): The central claim of consistent performance improvement across all models and datasets is asserted without reported statistical significance tests, error bars, number of random seeds/runs, or ablation studies that isolate the XAI attention component from other training modifications.

Authors: We agree that the current presentation lacks the necessary statistical details to fully support the claims. In the revised version we will rerun the experiments with at least five random seeds, report mean accuracy and standard deviation (with error bars in figures), perform paired t-tests or Wilcoxon tests for significance against baselines, and add ablation studies that remove the XAI attention term while keeping all other training elements fixed.
revision: yes

Referee: [Methods] Methods / Training Procedure: No validation is provided that the chosen XAI method produces faithful, stable attention maps (e.g., no comparison against ground-truth attributions, no stability analysis across seeds or perturbations), which is required to attribute gains to the claimed mechanism rather than map noise or bias.

Authors: We accept that the manuscript does not currently demonstrate faithfulness or stability of the attention maps. We will add a dedicated subsection that (i) specifies the XAI method, (ii) reports stability metrics (e.g., IoU or Spearman rank correlation across seeds and small input perturbations), and (iii) includes qualitative examples comparing the maps to known salient regions on the datasets. If quantitative ground-truth attributions are unavailable, we will cite prior validation studies of the chosen XAI technique.
revision: yes

Referee: [Methods] Methods: The exact loss formulation, how the XAI-derived attention is mathematically incorporated into the training objective, and the specific XAI technique employed are not specified, preventing assessment of whether the constraint is parameter-free or introduces new hyperparameters.

Authors: The referee is correct that these details are missing from the current text. We will expand the Methods section to (a) name the exact XAI technique, (b) write the full training objective (including the mathematical form of the attention constraint and any weighting hyperparameter λ), and (c) state whether the constraint introduces additional tunable parameters or is parameter-free.
revision: yes
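
For orientation, one plausible shape for such an objective, written with the symbols from the Figure 1 caption (M, LEP, C, D, R) and the weight λ mentioned above, is given below. It is a guess at the general form under stated assumptions, not the paper's formula: the squared-error penalty, the concatenation [m, z], and the reading of D as a decoder of z are all assumptions.

```latex
\mathcal{L}(\theta)
  = \mathcal{L}_{\mathrm{CE}}\bigl(C([m, z]),\, y\bigr)
  + \lambda \,\bigl\lVert D(z) - r \bigr\rVert_2^2,
\qquad m = M(x), \quad z = \mathrm{LEP}(x),
```

Here y is the class label and r is the attribution map that R produces for input x. Under this form, λ would be one additional tunable hyperparameter, and setting λ = 0 would correspond to the ablation that removes the explanation term while keeping the architecture fixed.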

Circularity Check

0 steps flagged

No circularity: empirical training loop is independent of its inputs

The paper presents IMPACTX as an empirical procedure that feeds external XAI attribution maps into an attention mechanism during training and reports accuracy gains on standard image-classification benchmarks. No equations, fitted parameters, or derivations are shown that would make the reported improvement equivalent to the XAI maps by construction. No self-citation is invoked as a load-bearing uniqueness theorem, and the evaluation uses held-out test sets on CIFAR-10/100 and STL-10 with three unrelated architectures. The central claim therefore remains falsifiable against external benchmarks rather than reducing to a tautology or self-referential fit.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that XAI maps generated during training are reliable enough to serve as attention signals; no free parameters or invented entities are named in the abstract.

axioms (1)

domain assumption: XAI methods produce stable and task-relevant feature attribution maps that can be used as training constraints without external knowledge or human feedback.
Invoked when the paper states that IMPACTX 'leverages XAI as a fully automated attention mechanism, without requiring external knowledge or human feedback'.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages

[1]

Finding and removing clever hans: using explanation methods to debug and improve deep models

Christopher J Anders, Leander Weber, David Neumann, Wojciech Samek, Klaus-Robert M¨ uller, and Sebastian Lapuschkin. Finding and removing clever hans: using explanation methods to debug and improve deep models. Information Fusion, 77:261–295, 2022

work page 2022
[2]

Impact of nutritional factors in blood glucose predic- tion in type 1 diabetes through machine learning

Giovanni Annuzzi, Andrea Apicella, Pasquale Arpaia, Lutgarda Bozzetto, Sabatina Criscuolo, Egidio De Benedetto, Marisa Pesola, Roberto Prevete, and Ersilia Vallefuoco. Impact of nutritional factors in blood glucose predic- tion in type 1 diabetes through machine learning. IEEE Access, 11:17104– 17115, 2023

work page 2023
[3]

Apicella, F

A. Apicella, F. Isgr` o, R. Prevete, A. Sorrentino, and G. Tamburrini. Ex- plaining classification systems using sparse dictionaries. ESANN 2019 - Proceedings, 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning , page 495 – 500, 2019

work page 2019
[4]

Integration of context information through probabilistic ontological knowl- edge into image classification

Andrea Apicella, Anna Corazza, Francesco Isgr` o, and Giuseppe Vettigli. Integration of context information through probabilistic ontological knowl- edge into image classification. Information, 9(10):252, 2018

work page 2018
[5]

Strategies to exploit xai to improve classification systems

Andrea Apicella, Luca Di Lorenzo, Francesco Isgr` o, Andrea Pollastro, and Roberto Prevete. Strategies to exploit xai to improve classification systems. 15 Communications in Computer and Information Science , 1901 CCIS:147 – 159, 2023

work page 1901
[6]

Exploiting auto-encoders and segmentation methods for middle-level explanations of image classification systems

Andrea Apicella, Salvatore Giugliano, Francesco Isgr` o, and Roberto Pre- vete. Exploiting auto-encoders and segmentation methods for middle-level explanations of image classification systems. Knowledge-Based Systems , 255:109725, 2022

work page 2022
[7]

Shap-based explanations to improve classification systems

Andrea Apicella, Salvatore Giugliano, Francesco Isgr` o, and Roberto Pre- vete. Shap-based explanations to improve classification systems. In Pro- ceedings of the 4th Italian Workshop on Explainable Artificial Intelligence co-located with 22nd International Conference of the Italian Association for Artificial Intelligence(AIxIA 2023), Roma, Italy, Novembe...

work page 2023
[8]

Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges to- ward responsible ai

Alejandro Barredo Arrieta, Natalia D´ ıaz-Rodr´ ıguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador Garc´ ıa, Sergio Gil- L´ opez, Daniel Molina, Richard Benjamins, et al. Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges to- ward responsible ai. Information Fusion, 58:82–115, 2020

work page 2020
[9]

On pixel-wise ex- planations for non-linear classifier decisions by layer-wise relevance propa- gation

Sebastian Bach, Alexander Binder, Gr´ egoire Montavon, Frederick Klauschen, Klaus-Robert M¨ uller, and Wojciech Samek. On pixel-wise ex- planations for non-linear classifier decisions by layer-wise relevance propa- gation. PloS one, 10(7):e0130140, 2015

work page 2015
[10]

Guided zoom: Zooming into network evidence to refine fine-grained model decisions.IEEE Transactions on Pattern Analysis and Machine Intelligence , 43(11):4196–4202, 2021

Sarah Adel Bargal, Andrea Zunino, Vitali Petsiuk, Jianming Zhang, Kate Saenko, Vittorio Murino, and Stan Sclaroff. Guided zoom: Zooming into network evidence to refine fine-grained model decisions.IEEE Transactions on Pattern Analysis and Machine Intelligence , 43(11):4196–4202, 2021

work page 2021
[11]

Layer-wise relevance propagation for neural networks with local renormalization layers

Alexander Binder, Gr´ egoire Montavon, Sebastian Lapuschkin, Klaus- Robert M¨ uller, and Wojciech Samek. Layer-wise relevance propagation for neural networks with local renormalization layers. In International Confer- ence on Artificial Neural Networks , pages 63–71, Barcelona, Spain, 2016. Springer

work page 2016
[12]

An analysis of single-layer networks in unsupervised feature learning

Adam Coates, Andrew Ng, and Honglak Lee. An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics , pages 215–

work page
[13]

JMLR Workshop and Conference Proceedings, 2011

work page 2011
[14]

Ima- genet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Ima- genet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition , pages 248–255. Ieee, 2009

work page 2009
[15]

Dosovitskiy and T

A. Dosovitskiy and T. Brox. Inverting visual representations with convo- lutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4829–4837, Las Vegas, USA, 2016. 16

work page 2016
[16]

Erhan, Y

D. Erhan, Y. Bengio, . Courville, and P. Vincent. Visualizing higher-layer features of a deep network. University of Montreal, 1341(3):1, 2009

work page 2009
[17]

Attention branch network: Learning of attention mechanism for visual explanation

Hiroshi Fukui, Tsubasa Hirakawa, Takayoshi Yamashita, and Hironobu Fu- jiyoshi. Attention branch network: Learning of attention mechanism for visual explanation. In Proceedings of the IEEE/CVF conference on com- puter vision and pattern recognition , pages 10705–10714, 2019

work page 2019
[18]

Improvement in deep networks for optimization using explainable artificial intelligence

Jin ha Lee, Ik hee Shin, Sang gu Jeong, Seung-Ik Lee, Muhama- mad Zaigham Zaheer, and Beom-Su Seo. Improvement in deep networks for optimization using explainable artificial intelligence. In 2019 International Conference on Information and Communication Technology Convergence (ICTC), pages 525–530. IEEE, 2019

work page 2019
[19]

Impact of feedback type on explanatory interactive learning

Misgina Tsighe Hagos, Kathleen M Curran, and Brian Mac Namee. Impact of feedback type on explanatory interactive learning. In International Sym- posium on Methodologies for Intelligent Systems , pages 127–137. Springer, 2022

work page 2022
[20]

Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Wei- jun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam

Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Wei- jun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mo- bilenets: Efficient convolutional neural networks for mobile vision applica- tions, 2017

work page 2017
[21]

Harnessing deep neural networks with logic rules

Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, and Eric Xing. Harnessing deep neural networks with logic rules. In Katrin Erk and Noah A. Smith, editors, Proceedings of the 54th Annual Meeting of the As- sociation for Computational Linguistics (Volume 1: Long Papers) , pages 2410–2420, Berlin, Germany, August 2016. Association for Computational Linguistics

work page 2016
[22]

A systematic review of explainable artificial intelligence in terms of different application domains and tasks

Mir Riyanul Islam, Mobyen Uddin Ahmed, Shaibal Barua, and Shahina Begum. A systematic review of explainable artificial intelligence in terms of different application domains and tasks. Applied Sciences, 12(3):1353, 2022

work page 2022
[23]

Improving deep learning interpretability by saliency guided training

Aya Abdelsalam Ismail, Hector Corrada Bravo, and Soheil Feizi. Improving deep learning interpretability by saliency guided training. Advances in Neural Information Processing Systems, 34:26726–26739, 2021

work page 2021
[24]

Evaluating explain- able artificial intelligence methods for multi-label deep learning classifi- cation tasks in remote sensing

Ioannis Kakogeorgiou and Konstantinos Karantzalos. Evaluating explain- able artificial intelligence methods for multi-label deep learning classifi- cation tasks in remote sensing. International Journal of Applied Earth Observation and Geoinformation , 103:102520, 2021

work page 2021
[25]

Krizhevsky and G

A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Master’s thesis, Department of Computer Science, University of Toronto, 2009. 17

work page 2009
[26]

Gradient- based learning applied to document recognition

Yann LeCun, L´ eon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient- based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998

work page 1998
[27]

The mythos of model interpretability: In machine learn- ing, the concept of interpretability is both important and slippery

Zachary C Lipton. The mythos of model interpretability: In machine learn- ing, the concept of interpretability is both important and slippery. Queue, 16(3):31–57, 2018

work page 2018
[28]

Icel: Learning with inconsistent explanations

Biao Liu, Xiaoyu Wu, and Bo Yuan. Icel: Learning with inconsistent explanations. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2023

work page 2023
[29]

Incorporating priors with feature attribu- tion on text classification

Frederick Liu and Besim Avci. Incorporating priors with feature attribu- tion on text classification. In Anna Korhonen, David Traum, and Llu´ ıs M` arquez, editors,Proceedings of the 57th Annual Meeting of the Associa- tion for Computational Linguistics , pages 6274–6283, Florence, Italy, July

work page
[30]

Association for Computational Linguistics

work page
[31]

A unified approach to interpreting model predictions

Scott M Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 4765–4774. Curran Associates, Inc., 2017

work page 2017
[32]

Interpretability-driven sample selection using self supervised learn- ing for disease classification and segmentation

Dwarikanath Mahapatra, Alexander Poellinger, Ling Shao, and Mauricio Reyes. Interpretability-driven sample selection using self supervised learn- ing for disease classification and segmentation. IEEE transactions on med- ical imaging, 40(10):2548–2562, 2021

work page 2021
[33]

Explanation in artificial intelligence: Insights from the social sciences

Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial intelligence, 267:1–38, 2019

work page 2019
[34]

Em- bedding human knowledge into deep neural network via attention map

Masahiro Mitsuhara, Hiroshi Fukui, Yusuke Sakashita, Takanori Ogata, Tsubasa Hirakawa, Takayoshi Yamashita, and Hironobu Fujiyoshi. Em- bedding human knowledge into deep neural network via attention map. In Giovanni Maria Farinella, Petia Radeva, Jos´ e Braz, and Kadi Bouatouch, editors, Proceedings of the 16th International Joint Conference on Com- puter...

work page 2021
[35]

Layer-wise relevance propagation: an overview

Gr´ egoire Montavon, Alexander Binder, Sebastian Lapuschkin, Wojciech Samek, and Klaus-Robert M¨ uller. Layer-wise relevance propagation: an overview. Explainable AI: interpreting, explaining and visualizing deep learning, pages 193–209, 2019

work page 2019
[36]

Explaining nonlinear classification de- cisions with deep taylor decomposition

Gr´ egoire Montavon, Sebastian Lapuschkin, Alexander Binder, Wojciech Samek, and Klaus-Robert M¨ uller. Explaining nonlinear classification de- cisions with deep taylor decomposition. Pattern Recognition, 65:211–222, 2017. 18

work page 2017
[37]

”why should i trust you?” explaining the predictions of any classifier

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. ”why should i trust you?” explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining , pages 1135–1144, 2016

work page 2016
[38]

Hughes, and Finale Doshi-Velez

Andrew Slavin Ross, Michael C. Hughes, and Finale Doshi-Velez. Right for the right reasons: Training differentiable models by constraining their explanations. In Proceedings of the Twenty-Sixth International Joint Con- ference on Artificial Intelligence, IJCAI-17 , pages 2662–2670, 2017

work page 2017
[39]

Evaluating the visualization of what a deep neural network has learned

Wojciech Samek, Alexander Binder, Gr´ egoire Montavon, Sebastian La- puschkin, and Klaus-Robert M¨ uller. Evaluating the visualization of what a deep neural network has learned. IEEE transactions on neural networks and learning systems , 28(11):2660–2673, 2016

work page 2016
[40]

Relevance-based feature masking: Improving neural network based whale classification through explainable artificial intelligence

Dominik Schiller, Tobias Huber, Florian Lingenfelser, Michael Dietz, An- dreas Seiderer, and Elisabeth Andr´ e. Relevance-based feature masking: Improving neural network based whale classification through explainable artificial intelligence. 2019

work page 2019
[41]

Human-centered xai: Developing design patterns for ex- planations of clinical decision support systems

Tjeerd AJ Schoonderwoerd, Wiard Jorritsma, Mark A Neerincx, and Karel Van Den Bosch. Human-centered xai: Developing design patterns for ex- planations of clinical decision support systems. International Journal of Human-Computer Studies, 154:102684, 2021

work page 2021
[42]

Making deep neural networks right for the right scientific reasons by interacting with their explanations

Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brug- ger, Franziska Herbert, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, and Kristian Kersting. Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nature Machine Intelligence, 2(8):476–486, 2020

work page 2020
[43]

Grad-cam: Visual explanations from deep networks via gradient-based localization

Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision , pages 618–626, 2017

work page 2017
[44]

Taking a hint: Lever- aging explanations to make vision and language models more grounded

Ramprasaath R Selvaraju, Stefan Lee, Yilin Shen, Hongxia Jin, Shalini Ghosh, Larry Heck, Dhruv Batra, and Devi Parikh. Taking a hint: Lever- aging explanations to make vision and language models more grounded. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2591–2600, 2019

work page 2019
[45]

Utilizing explainable ai for improving the performance of neural networks

Huawei Sun, Lorenzo Servadei, Hao Feng, Michael Stephan, Avik Santra, and Robert Wille. Utilizing explainable ai for improving the performance of neural networks. In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA) , pages 1775–1782. IEEE, 2022. 19

work page 2022
[46] Jiamei Sun, Sebastian Lapuschkin, Wojciech Samek, Yunqing Zhao, Ngai-Man Cheung, and Alexander Binder. Explanation-guided training for cross-domain few-shot classification. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 7609–7616. IEEE, 2021.
[47] Mingxing Tan and Quoc Le. Efficientnet: Rethinking model scaling for convolutional neural networks. In International conference on machine learning, pages 6105–6114. PMLR, 2019.
[48] Stefano Teso and Kristian Kersting. Explanatory interactive machine learning. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pages 239–245, 2019.
[49] Erico Tjoa and Cuntai Guan. Quantifying explainability of saliency methods in deep neural networks with a synthetic dataset. IEEE Transactions on Artificial Intelligence, 4(4):858–870, 2022.
[50] A Vaswani. Attention is all you need. Advances in Neural Information Processing Systems, 2017.
[51] Leander Weber, Sebastian Lapuschkin, Alexander Binder, and Wojciech Samek. Beyond explaining: Opportunities and challenges of xai-based model improvement. Information Fusion, 2022.
[52] Qizhe Xie, Minh-Thang Luong, Eduard Hovy, and Quoc V Le. Self-training with noisy student improves imagenet classification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10687–10698, 2020.
[53] Seul-Ki Yeom, Philipp Seegerer, Sebastian Lapuschkin, Alexander Binder, Simon Wiedemann, Klaus-Robert Müller, and Wojciech Samek. Pruning by explaining: A novel criterion for deep neural network pruning. Pattern Recognition, 115:107899, 2021.
[54] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In European Conference on Computer Vision, pages 818–833, Zurich, Switzerland, 2014. Springer.
[55] M. D. Zeiler, G. W. Taylor, and R. Fergus. Adaptive deconvolutional networks for mid and high level feature learning. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 2018–2025, Barcelona, Spain, 2011. IEEE.
[56] Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2921–2929, 2016.
[57] Andrea Zunino, Sarah Adel Bargal, Pietro Morerio, Jianming Zhang, Stan Sclaroff, and Vittorio Murino. Excitation dropout: Encouraging plasticity in deep neural networks. International Journal of Computer Vision, 129(4):1139–1152, 2021.
[58] Andrea Zunino, Sarah Adel Bargal, Riccardo Volpi, Mehrnoosh Sameki, Jianming Zhang, Stan Sclaroff, Vittorio Murino, and Kate Saenko. Explainable deep classification models for domain generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3233–3242, 2021.
