pith. machine review for the scientific record.

arxiv: 2605.05283 · v1 · submitted 2026-05-06 · 💻 cs.CV

Seeing What Shouldn't Be There: Counterfactual GANs for Medical Image Attribution

Pith reviewed 2026-05-08 16:42 UTC · model grok-4.3

classification 💻 cs.CV
keywords counterfactual explanations · GANs · medical image attribution · feature visualization · cycle-consistent loss · BraTS · tuberculosis

The pith

A cycle-consistent GAN generates counterfactual medical images to provide complete class-oriented feature attributions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a counterfactual explanation method that generates altered versions of an input medical image to reveal why a classifier assigned it to a particular category. Unlike standard visualization techniques that spotlight only the minimal discriminative regions, this approach seeks to account for all noticeable objects by showing what the image would look like without the key causal features. It relies on generative adversarial networks trained with a cycle-consistent loss to produce plausible counterfactual instances. The method is demonstrated on synthetic data, tuberculosis chest X-rays, and brain tumor scans from the BraTS dataset, along with a new technique for assessing the quality of the generated counterfactuals.

Core claim

A counterfactual-explanation-based, class-oriented feature attribution method is built on generative adversarial networks with a cycle-consistent loss function to generate plausible counterfactual instances whose differences from the original image highlight causally relevant features for medical image classification. This overcomes the incompleteness of discriminative visualization methods that rely on minimal feature sets, and addresses the implausibility of instances produced by prior counterfactual techniques. Experiments across the synthetic, tuberculosis, and BraTS datasets confirm the method's efficacy; the paper also establishes baseline results on BraTS and introduces a novel technique for evaluating the quality of the generated counterfactual instances.

What carries the argument

Cycle-consistent generative adversarial networks that produce counterfactual instances to enable self-explanatory analogy-based explanations by altering images in ways that flip the classifier output.
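
For reference, the cycle-consistency machinery this rests on, in the standard CycleGAN form of [59]. The paper's exact loss weighting and any extra terms are not specified in the material above, so treat this as the generic template rather than the authors' exact objective:

$$\mathcal{L}_{\mathrm{cyc}}(G, F) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\lVert F(G(x)) - x \rVert_1\right] + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\!\left[\lVert G(F(y)) - y \rVert_1\right]$$

$$\mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{\mathrm{GAN}}(G, D_Y) + \mathcal{L}_{\mathrm{GAN}}(F, D_X) + \lambda\, \mathcal{L}_{\mathrm{cyc}}(G, F)$$

Here $G$ maps images toward the counterfactual class, $F$ maps back, and the adversarial terms $\mathcal{L}_{\mathrm{GAN}}$ follow the minimax objective shown in Figure 9. The cycle term penalizes any change the round trip cannot undo, which is what pushes the generators toward minimal, plausible edits whose residue $|x - G(x)|$ can be read as an attribution.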

If this is right

  • The method visualizes deformities in medical images more comprehensively than minimal-feature approaches.
  • It supplies self-explanatory analogy-based explanations for radiologists.
  • Existing counterfactual techniques are shown to produce implausible instances, limiting their utility.
  • Baseline performance is established on the BraTS dataset for future comparisons.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Clinicians could compare original and altered images directly to verify AI-driven diagnoses.
  • The technique might apply to other image classification tasks where showing missing features aids interpretability.
  • Minimizing generator artifacts could further increase reliability in high-stakes medical settings.

Load-bearing premise

The cycle-consistent GAN produces plausible counterfactual instances whose differences from the original image correspond to causally relevant features rather than artifacts of the generator.

What would settle it

A test showing that the generated counterfactual images either fail to change the classifier prediction as expected or contain visible artifacts unrelated to known medical pathologies in the tuberculosis or BraTS data would falsify the claim of effective feature attribution.
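
A minimal version of the prediction-flip half of that test, sketched in Python. Purely illustrative: `classifier`, `generator`, and the data loader are hypothetical stand-ins, not objects from the paper.

```python
import torch

@torch.no_grad()
def counterfactual_flip_rate(classifier, generator, loader, device="cpu"):
    """Fraction of generated counterfactuals that flip the classifier's
    predicted label relative to the original image."""
    flipped, total = 0, 0
    for x, _ in loader:                      # ground-truth labels unused
        x = x.to(device)
        x_cf = generator(x)                  # counterfactual instance (CI)
        pred_orig = classifier(x).argmax(dim=1)
        pred_cf = classifier(x_cf).argmax(dim=1)
        flipped += (pred_orig != pred_cf).sum().item()
        total += x.size(0)
    return flipped / max(total, 1)

# A flip rate near zero would falsify the claim that the CIs are valid
# counterfactuals; the artifact half of the test still needs expert reading.
```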

Figures

Figures reproduced from arXiv: 2605.05283 by Shakeeb Murtaza.

Figure 1. Single and multi-layer perceptron.
Figure 2. At each layer, a set of filters is convolved.
Figure 3. Accuracy vs. interpretability.
Figure 4. Counterfactual explanation.
Figure 5. Plausible vs. implausible counterfactual explanation.
Figure 6. Illustration of the CAM method.
Figure 7. Illustration of Grad-CAM.
Figure 8. Illustration of Guided Grad-CAM.
Figure 9. Generative adversarial networks. The adjoining objective, reconstructed: $\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$, where $x$ is a training instance and $z$ is a random noise vector drawn from a known distribution.
Figure 10. Cycle-GAN: the model consists of two generators.
Figure 11. Flow diagram of CX-GAN, the integrated model for jointly learning to produce CXs and CIs from unpaired sets of input and counterfactual images.
Figure 12. Illustration of CIs. The adjoining loss, reconstructed: $\mathcal{L}_M(G_M, D_x, Y, X) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D_x(x)] + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}[\log(1 - D_x(G_M(y) + y))]$.
Figure 13. Input image ($x_i$), counterfactual ($y_i$), change map.
Figure 15. Examples of synthetic data. Left of the dotted line: samples of Class 1 (the disease class); right: samples of Class 0 (the normal class). Upper row: input; bottom row: ground truth.
Figure 16. Examples of visualization maps of compared methods.
Figure 17. Examples of visualization maps of the compared methods on BraTS data.
Figure 18. Examples of visualization maps of compared methods.
Figure 19. Illustration of the non-resemblance score.
Original abstract

Ascription of an image gives insights into the objects that influence the classification of the whole image or its pixels towards a specific category. These insights help radiologists to visualize deformities in medical imaging. Most of the existing visualization techniques are based on discriminative models and highlight regions of the input image participating in the decision-making of a classifier. However, these approaches do not take all noticeable objects into account as their objective is to classify the input by using a minimal set of discriminative features. To overcome the issue, a counterfactual explanation (CX) based class-oriented feature attribution method is proposed. A counterfactual explanation (CX) explicates a causal reasoning process of the form: "if X had not happened, then Y would not have happened". The method is built on generative adversarial networks (GANs) with a cyclical-consistent loss function. We evaluate our method on three datasets: synthetic, tuberculosis and BraTS. All experiments confirm the efficacy of the proposed method. This study also highlighted the limitations of existing counterfactual explanation techniques in producing plausible counterfactual instances (CIs). Accompanying CXs with believable CIs thus provides self-explanatory analogy-based explanations. To this end, a CI generation method is proposed. Also, a novel technique is used to evaluate the quality of CI. The baseline results are produced on the BraTS dataset.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a counterfactual explanation (CX) based class-oriented feature attribution method for medical images, constructed using generative adversarial networks (GANs) with a cycle-consistent loss. The method generates plausible counterfactual instances (CIs) of the opposite class and derives attribution maps by subtracting these from the input image, aiming to highlight causally relevant features rather than minimal discriminative ones used by prior visualization techniques. It evaluates the approach on synthetic data, a tuberculosis dataset, and the BraTS dataset, claims to confirm efficacy across all experiments, highlights limitations of existing CX techniques, introduces a novel CI quality evaluation technique, and provides baseline results on BraTS.

Significance. If the empirical claims hold under rigorous validation, the work could meaningfully advance interpretable machine learning for medical imaging by shifting from purely discriminative attributions to counterfactual ones that offer self-explanatory, analogy-based insights. This addresses a documented shortcoming in prior CX methods regarding plausibility of generated instances and could improve clinical utility for radiologists by better aligning attributions with disease-relevant anatomy.

major comments (3)
  1. [Abstract] The statement 'All experiments confirm the efficacy of the proposed method' is unsupported by any reported quantitative metrics, ablation studies, error analysis, or statistical comparisons, so the central efficacy claim rests on unshown details.
  2. [Experiments] Evaluation on three datasets: no quantitative checks (e.g., classifier re-evaluation on edited images, expert segmentation overlap with known causal features, or comparison against ground-truth interventions) are described to verify that pixel differences isolate disease features rather than GAN artifacts or unrelated anatomy changes, leaving the attribution validity untested.
  3. [Experiments] BraTS baseline results: while baselines are mentioned, the absence of specific performance numbers, tables, or direct comparisons to prior CX techniques prevents assessment of whether the cycle-consistent GAN approach improves upon existing methods.
minor comments (2)
  1. [Abstract] The phrasing 'cyclical-consistent loss function' should be corrected to the standard term 'cycle-consistent loss' for consistency with the literature.
  2. [Method] Notation for the attribution map computation (input minus counterfactual) could be formalized with an equation to improve clarity; one possible form is sketched below.
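
One way such an equation might read, assuming (consistent with Figure 13 and the referee's parenthetical, but not a formula quoted from the paper) that the change map is the pixelwise difference between the input and its generated counterfactual:

$$\mathrm{CX}(x) = \lvert x - G(x) \rvert$$

where $x$ is the input image, $G(x)$ its generated counterfactual instance (CI), and the absolute value is taken per pixel; thresholding this map would yield a binary attribution mask.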

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We address each major comment point by point below, agreeing where revisions are needed to improve clarity and rigor while defending the core contributions based on the presented evaluations.

Point-by-point responses
  1. Referee: [Abstract] The statement 'All experiments confirm the efficacy of the proposed method' is unsupported by any reported quantitative metrics, ablation studies, error analysis, or statistical comparisons, so the central efficacy claim rests on unshown details.

    Authors: We agree that the abstract phrasing overstates the support, as the evaluations are primarily qualitative (visual assessment of plausible CIs and attribution maps) along with the novel CI quality technique. We will revise the abstract to state that the experiments illustrate the method's potential via visual results on the three datasets, without claiming comprehensive confirmation of efficacy. revision: yes

  2. Referee: [Experiments] Evaluation on three datasets: no quantitative checks (e.g., classifier re-evaluation on edited images, expert segmentation overlap with known causal features, or comparison against ground-truth interventions) are described to verify that pixel differences isolate disease features rather than GAN artifacts or unrelated anatomy changes, leaving the attribution validity untested.

    Authors: The current manuscript does not describe such quantitative checks, focusing instead on visual demonstrations and the proposed CI quality evaluation to highlight advantages over prior CX methods. We acknowledge this leaves room for questions about artifacts. We will add a limitations discussion and incorporate at least one quantitative validation, such as classifier output changes after region editing (a deletion-style probe of this kind is sketched after these responses), in the revision. revision: partial

  3. Referee: [Experiments] BraTS baseline results: while baselines are mentioned, the absence of specific performance numbers, tables, or direct comparisons to prior CX techniques prevents assessment of whether the cycle-consistent GAN approach improves upon existing methods.

    Authors: The manuscript references baseline results on BraTS but omits specific numbers and tables, which was an incomplete presentation. We will expand this section with quantitative metrics, tables, and direct comparisons to prior CX techniques to allow assessment of improvements. revision: yes
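
For concreteness, a minimal deletion-style probe of the kind promised above, sketched in Python. Illustrative only: `classifier` and `change_map` are hypothetical stand-ins, and the quantile threshold and zero-fill masking are assumptions rather than the paper's protocol.

```python
import torch

@torch.no_grad()
def deletion_score(classifier, x, change_map, quantile=0.95, fill=0.0):
    """Confidence drop after masking the most strongly attributed pixels.

    x:          (1, C, H, W) input image
    change_map: (1, 1, H, W) attribution magnitudes, e.g. |x - G(x)|
    """
    probs = classifier(x).softmax(dim=1)
    label = probs.argmax(dim=1)
    conf_before = probs[0, label].item()

    # Replace pixels above the chosen quantile of the change map with a
    # neutral fill value.
    thresh = torch.quantile(change_map, quantile)
    mask = (change_map >= thresh).float()
    x_masked = x * (1 - mask) + fill * mask

    conf_after = classifier(x_masked).softmax(dim=1)[0, label].item()
    return conf_before - conf_after  # large drop => attributed regions matter
```

A large confidence drop when only the change-map regions are removed, compared with removing random regions of equal area, would support the claim that the attributions track disease features rather than generator artifacts.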

Circularity Check

0 steps flagged

No circularity: novel construction from standard components with empirical evaluation

Full rationale

The paper introduces a counterfactual explanation method built on CycleGAN-style generators with cycle-consistent loss for producing class-oriented attributions via image subtraction. No load-bearing equations, fitted parameters renamed as predictions, or self-citations appear in the provided abstract or description that would reduce the central claim to its own inputs by construction. The derivation is presented as an original assembly of existing GAN techniques, with efficacy shown through experiments on synthetic, TB, and BraTS datasets rather than tautological re-derivation. This qualifies as a self-contained proposal without detectable circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The method assumes a trained classifier exists and that cycle-consistency in the GAN enforces meaningful causal changes; no explicit free parameters, axioms, or invented entities are stated in the abstract.

pith-pipeline@v0.9.0 · 5531 in / 1020 out tokens · 43529 ms · 2026-05-08T16:42:04.892113+00:00 · methodology


Reference graph

Works this paper leans on

58 extracted references · 10 canonical work pages · 1 internal anchor

  [1] European Commission. General Data Protection Regulation, 2016.
  [2] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. S. Kirby, J. B. Freymann, K. Farahani, and C. Davatzikos. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 4:170117, 2017.
  [3] C. F. Baumgartner, K. Kamnitsas, J. Matthew, T. P. Fletcher, S. Smith, L. M. Koch, B. Kainz, and D. Rueckert. SonoNet: real-time detection and localisation of fetal standard scan planes in freehand ultrasound. IEEE Transactions on Medical Imaging, 36(11):2204–2215, 2017.
  [4] C. F. Baumgartner, K. Kamnitsas, J. Matthew, S. Smith, B. Kainz, and D. Rueckert. Real-time standard scan plane detection and localisation in fetal ultrasound using fully convolutional neural networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 203–211. Springer, 2016.
  [5] C. F. Baumgartner, L. M. Koch, K. Can Tezcan, J. Xi Ang, and E. Konukoglu. Visual feature attribution using Wasserstein GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8309–8319, 2018.
  [6] J. Bernal, K. Kushibar, D. S. Asfaw, S. Valverde, A. Oliver, R. Martí, and X. Lladó. Deep convolutional neural networks for brain image analysis on magnetic resonance imaging: a review. Artificial Intelligence in Medicine, 95:64–81, 2019.
  [7] S. Carter, Z. Armstrong, L. Schubert, I. Johnson, and C. Olah. Activation atlas. Distill, 4(3):e15, 2019.
  [8] D. V. Carvalho, E. M. Pereira, and J. S. Cardoso. Machine learning interpretability: A survey on methods and metrics. Electronics, 8(8):832, 2019.
  [9] C.-H. Chang, E. Creager, A. Goldenberg, and D. Duvenaud. Explaining image classifiers by counterfactual generation. 2018.
  [10] G. Ciaparrone, F. L. Sánchez, S. Tabik, L. Troiano, R. Tagliaferri, and F. Herrera. Deep learning in video multi-object tracking: A survey. Neurocomputing, 2019.
  [11] P. Dabkowski and Y. Gal. Real time image saliency for black box classifiers. In Advances in Neural Information Processing Systems, pages 6967–6976, 2017.
  [12] A. Dhurandhar, P.-Y. Chen, R. Luss, C.-C. Tu, P. Ting, K. Shanmugam, and P. Das. Explanations based on the missing: Towards contrastive explanations with pertinent negatives. In Advances in Neural Information Processing Systems, pages 592–603, 2018.
  [13] X. Feng, J. Yang, A. F. Laine, and E. D. Angelini. Discriminative localization in CNNs for weakly-supervised segmentation of pulmonary nodules. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 568–576. Springer, 2017.
  [14] R. C. Fong and A. Vedaldi. Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the IEEE International Conference on Computer Vision, pages 3429–3437, 2017.
  [15] A. A. Freitas. A critical review of multi-objective optimization in data mining: a position paper. ACM SIGKDD Explorations Newsletter, 6(2):77–86, 2004.
  [16] A. A. Freitas. Comprehensible classification models: a position paper. ACM SIGKDD Explorations Newsletter, 15(1):1–10, 2014.
  [17] Y. Gao and J. A. Noble. Detection and characterization of the fetal heartbeat in free-hand ultrasound sweeps with weakly-supervised two-streams convolutional networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 305–313. Springer, 2017.
  [18] Z. Ge, S. Demyanov, R. Chakravorty, A. Bowling, and R. Garnavi. Skin disease recognition using deep saliency features and multimodal learning of dermoscopy and clinical images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 250–.
  [19] L. H. Gilpin, D. Bau, B. Z. Yuan, A. Bajwa, M. Specter, and L. Kagal. Explaining explanations: An overview of interpretability of machine learning. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), pages 80–.
  [20] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 580–587, 2014.
  [21] W. M. Gondal, J. M. Köhler, R. Grzeszick, G. A. Fink, and M. Hirsch. Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images. In 2017 IEEE International Conference on Image Processing (ICIP), pages 2069–2073. IEEE, 2017.
  [22] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
  [23] A. Holzinger, C. Biemann, C. S. Pattichis, and D. B. Kell. What do we need to build explainable AI systems for the medical domain? arXiv preprint arXiv:1712.09923, 2017.
  [24] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134, 2017.
  [25] S. Jaeger, S. Candemir, S. Antani, Y.-X. J. Wáng, P.-X. Lu, and G. Thoma. Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quantitative Imaging in Medicine and Surgery, 4(6):475, 2014.
  [26] A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3128–3137, 2015.
  [27] H.-E. Kim and S. Hwang. Deconvolutional feature stacking for weakly-supervised semantic segmentation. arXiv preprint arXiv:1602.04984, 2016.
  [28] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  [29] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
  [31] Z. C. Lipton. The mythos of model interpretability. arXiv preprint arXiv:1606.03490, 2016.
  [32] L. v. d. Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.
  [33] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, Y. Burren, N. Porz, J. Slotboom, R. Wiest, et al. The multimodal brain tumor image segmentation benchmark (BraTS). IEEE Transactions on Medical Imaging, 34(10):1993–2024, 2014.
  [34] T. Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38, 2019.
  [35] M. Mirza and S. Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
  [36] C. Molnar. Interpretable Machine Learning. Lulu.com, 2019.
  [37] S. Murtaza, S. Belharbi, M. A. Guichemerre, M. Pedersoli, and E. Granger. TeD-Loc: Text distillation for weakly supervised object localization. CoRR, 2025.
  [38] S. Murtaza, S. Belharbi, M. Pedersoli, and E. Granger. A realistic protocol for evaluation of weakly supervised object localization. In WACV, 2025.
  [39] S. Murtaza, S. Belharbi, M. Pedersoli, A. Sarraf, and E. Granger. DIPS: Discriminative pseudo-label sampling with self-supervised transformers for weakly supervised object localization. IVC Journal, 2023.
  [40] S. Murtaza, S. Belharbi, M. Pedersoli, A. Sarraf, and E. Granger. Discriminative sampling of proposals in self-supervised transformers for weakly supervised object localization. In WACV Workshop, 2023.
  [41] S. Murtaza, M. Pedersoli, A. Sarraf, and E. Granger. Leveraging transformers for weakly supervised object localization in unconstrained videos. In IAPRw, 2024.
  [42] Y. Goyal, Z. Wu, J. Ernst, D. Batra, D. Parikh, and S. Lee. Counterfactual visual explanations. In ICML, 2019.
  [43] G. Plumb, D. Molitor, and A. S. Talwalkar. Model agnostic supervised local explanations. In Advances in Neural Information Processing Systems, pages 2515–2524, 2018.
  [44] M. T. Ribeiro, S. Singh, and C. Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144. ACM, 2016.
  [45] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
  [46] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
  [47] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626, 2017.
  [49] K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
  [50] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2014.
  [51] M. Sundararajan, A. Taly, and Q. Yan. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 3319–3328. JMLR.org, 2017.
  [52] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3156–3164, 2015.
  [53] A. Weller. Challenges for transparency. arXiv preprint arXiv:1708.01870, 2017.
  [54] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson. Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579, 2015.
  [55] C. Zednik. Solving the black box problem: A general-purpose recipe for explainable artificial intelligence. arXiv preprint arXiv:1903.04361, 2019.
  [56] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In European Conference on Computer Vision, pages 818–833. Springer, 2014.
  [57] J. Zhang, S. A. Bargal, Z. Lin, J. Brandt, X. Shen, and S. Sclaroff. Top-down neural attention by excitation backprop. International Journal of Computer Vision, 126(10):1084–1102, 2018.
  [58] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2921–2929, 2016.
  [59] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.
  [60] L. M. Zintgraf, T. S. Cohen, T. Adel, and M. Welling. Visualizing deep neural network decisions: Prediction difference analysis. arXiv preprint arXiv:1702.04595, 2017.