On the Properties of Feature Attribution for Supervised Contrastive Learning

Aurélie Gallet; Ivan Gentile; Julia Eva Belloni; Leonardo Arrighi; Marco Zullich; Matteo Lippi

arXiv: 2604.22540 · v1 · submitted 2026-04-24 · cs.LG · cs.AI

Pith reviewed 2026-05-08 12:10 UTC · model grok-4.3

keywords: supervised contrastive learning, feature attribution, explainable AI, neural network interpretability, image classification, faithfulness, contrastive learning

The pith

Neural networks trained with supervised contrastive learning yield feature attributions that are more faithful, less complex, and more continuous than those from standard contrastive learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper empirically evaluates feature attribution explanations produced by neural networks for image classification that were trained using supervised contrastive learning versus standard contrastive learning. It reports that the supervised variant produces attributions scoring higher on faithfulness, lower on complexity, and higher on continuity according to quantitative metrics. A sympathetic reader would care because these properties support more reliable model interpretations in applications where transparency matters alongside accuracy. The work builds on established benefits of supervised contrastive learning in robustness and out-of-distribution detection by extending the comparison to explanation quality.

Core claim

Neural networks for image classification trained with supervised contrastive learning produce feature attribution explanations that outperform those from models trained with contrastive learning on the metrics of faithfulness, complexity, and continuity, as shown through direct empirical comparison on image datasets.

What carries the argument

Quantitative metrics of faithfulness, complexity, and continuity applied to feature attributions extracted from networks trained under supervised versus unsupervised contrastive objectives.

Load-bearing premise

The chosen metrics of faithfulness, complexity, and continuity together with the attribution methods used are adequate to judge overall explanation quality without being swayed by differences in model capacity or training details.

What would settle it

Repeating the experiments on the same architectures and datasets but with matched hyperparameters and random seeds across training objectives, then finding that the ranking on faithfulness, complexity, or continuity reverses or disappears.

Figures

Figures reproduced from arXiv: 2604.22540 by Aurélie Gallet, Ivan Gentile, Julia Eva Belloni, Leonardo Arrighi, Marco Zullich, Matteo Lippi.

**Figure 1.** Schematization of the two ResNet model variants used. (a) represents the classical ResNet, which we train using the CE loss (L_CE). (b) is the variant we use in CL: the classification head f_head is replaced by a projection head f_projection, which projects the data in R^128, where the contrastive losses (L_SCL, L_TL) are applied. For the sole purpose of the classification, we train a linear classification head…

**Figure 2.** Sample explanations generated on CIFAR10 and Imagenet-S50. A quick qualitative analysis shows how Grad-CAM for CE, on CIFAR10, fails to focus on the object in the center of the image, rather producing wider FAs. This behavior seems to be present with less intensity on Imagenet-S50. On both datasets, SCL seems to produce more compact explanations, which is confirmed by the lower complexity scores (see…

Original abstract

Most Neural Networks (NNs) for classification are trained using Cross-Entropy as a loss function. This approach requires the model to have an explicit classification layer. However, there exist alternative approaches, such as Contrastive Learning (CL). Instead of explicitly operating a classification, CL has the NN produce an embedding space where projections of similar data are pulled together, while projections of dissimilar data are pushed apart. In the case of Supervised CL (SCL), labels are adopted as similarity criteria, thus creating an embedding space where the projected data points are well-clustered. SCL provides crucial advantages over CE with regard to adversarial robustness and out-of-distribution detection, thus making it a more natural choice in safety-critical scenarios. In the present paper, we empirically show that NNs for image classification trained with SCL present higher-quality feature attribution explanations than CL with regard to faithfulness, complexity, and continuity. These results reinforce previous findings about CL-based approaches when targeting more trustworthy and transparent NNs and can guide practitioners in the selection of training objectives targeting not only accuracy, but also transparency of the models.
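The pull-together, push-apart behaviour described in the abstract can be made concrete with a small sketch of a supervised contrastive loss. It follows the general form introduced by Khosla et al. (2020); the function names, the temperature of 0.1, and the 2-D toy points are illustrative assumptions, not the paper's implementation (which, per Figure 1, applies its contrastive losses in a 128-dimensional projection space).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, n_anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        # Temperature-scaled similarity of anchor i to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n) if a != i
        }
        # Log-sum-exp over all non-anchor samples, shifted for stability.
        shift = max(sims.values())
        log_denom = shift + math.log(sum(math.exp(s - shift) for s in sims.values()))
        # Mean negative log-probability of the positives.
        total -= sum(sims[p] - log_denom for p in positives) / len(positives)
        n_anchors += 1
    return total / n_anchors if n_anchors else 0.0


# Hypothetical 2-D projections forming two tight groups (toy data, not from the paper).
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(points, labels=[0, 0, 1, 1])  # labels match the groups
crossed = supcon_loss(points, labels=[0, 1, 0, 1])  # labels cut across the groups
print(aligned, crossed)
```

The loss is small when same-label points already sit together and large when labels cut across the geometric groups, which is the pressure that produces the well-clustered embedding space the abstract mentions.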

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SCL-trained image classifiers show better feature attributions than CL ones on faithfulness, complexity, and continuity, but the comparison leaves open whether the loss itself drives the gap or whether differences in capacity and accuracy do.

The central observation is that models trained with supervised contrastive loss produce feature attributions that score higher on faithfulness, lower on complexity, and better on continuity than those from standard contrastive learning. The work applies off-the-shelf attribution methods to this comparison and links it to the known robustness and OOD benefits of SCL, suggesting practitioners might prefer it when both accuracy and post-hoc transparency matter.

Referee Report

2 major / 2 minor

Summary. The manuscript empirically compares feature attribution quality for image classification neural networks trained via Supervised Contrastive Learning (SCL) versus standard (unsupervised) Contrastive Learning (CL). It claims that SCL yields higher-quality attributions on three metrics—faithfulness, complexity, and continuity—positioning SCL as preferable for transparent models in addition to its known benefits in adversarial robustness and out-of-distribution detection.

Significance. If the central empirical comparison is placed on a sound footing, the result would usefully extend the literature on how contrastive objectives affect downstream explainability. It offers practitioners concrete guidance when accuracy alone is insufficient and could inform training choices in safety-critical domains. The work correctly situates the question within the broader advantages already established for SCL.

major comments (2)

[§4] §4 (Experimental Setup): the SCL versus CL comparison does not report or enforce matched final classification accuracy, embedding dimensionality, batch size, temperature schedule, or optimizer hyperparameters. Because attribution metrics can be sensitive to these factors, the observed gaps in faithfulness, complexity, and continuity cannot be attributed to the presence of label supervision in the contrastive loss. This is load-bearing for the central claim.
[§5] §5 (Results and Tables): no statistical significance tests, confidence intervals, or ablation on architecture capacity are provided for the metric differences. Without these, it is impossible to assess whether the reported superiority of SCL is robust or could be explained by uncontrolled variation in model properties.

minor comments (2)

[Abstract] Abstract: the datasets, architectures, and exact feature attribution methods (e.g., which saliency technique) should be named explicitly so readers can immediately gauge the scope of the empirical result.
[§3] §3 (Metrics): the precise definitions or implementations of the faithfulness, complexity, and continuity scores should be restated or referenced to a standard source to avoid ambiguity in replication.
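To illustrate the kind of definition the last minor comment asks for, here is a minimal sketch of one common complexity score: the Shannon entropy of each feature's fractional contribution, in the spirit of Bhatt et al. (2020). Whether the paper uses exactly this formulation is an assumption; the function name and the toy attribution vectors are hypothetical.

```python
import math


def complexity(attribution):
    """Shannon entropy (in nats) of the normalized absolute attributions.

    Lower values mean the attribution mass sits on fewer features.
    """
    magnitudes = [abs(a) for a in attribution]
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # an all-zero map carries no mass to spread
    fractions = [m / total for m in magnitudes]
    return -sum(p * math.log(p) for p in fractions if p > 0)


# Toy attribution vectors (illustrative only, not taken from the paper).
compact = [0.0, 0.0, 1.0, 0.0]      # all mass on a single feature
diffuse = [0.25, 0.25, 0.25, 0.25]  # mass spread evenly over four features
print(complexity(compact), complexity(diffuse))
```

Under this definition a map concentrated on one feature scores zero and a uniform map over n features scores log(n), so "lower complexity" corresponds to the more compact explanations described in the Figure 2 caption.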

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments help clarify the experimental controls needed to strengthen our central claim that supervised contrastive learning produces higher-quality feature attributions than unsupervised contrastive learning. We address each major comment below and will incorporate the suggested changes in the revised manuscript.

Referee: [§4] §4 (Experimental Setup): the SCL versus CL comparison does not report or enforce matched final classification accuracy, embedding dimensionality, batch size, temperature schedule, or optimizer hyperparameters. Because attribution metrics can be sensitive to these factors, the observed gaps in faithfulness, complexity, and continuity cannot be attributed to the presence of label supervision in the contrastive loss. This is load-bearing for the central claim.

Authors: We appreciate the referee's emphasis on isolating the effect of label supervision. In the original experiments we adopted the standard hyperparameter configurations reported in the SupCon and SimCLR papers for each respective method. To address the concern directly, the revised manuscript will include an explicit table of all training hyperparameters (embedding dimension, batch size, temperature, optimizer, learning-rate schedule, and number of epochs) for both SCL and CL. In addition, we will run controlled experiments in which training duration or learning rate is adjusted so that final test accuracy is matched between the two objectives, and we will report the three attribution metrics under these matched-accuracy conditions. This will allow readers to attribute any remaining differences more confidently to the presence of label supervision. revision: yes

Referee: [§5] §5 (Results and Tables): no statistical significance tests, confidence intervals, or ablation on architecture capacity are provided for the metric differences. Without these, it is impossible to assess whether the reported superiority of SCL is robust or could be explained by uncontrolled variation in model properties.

Authors: We agree that statistical rigor and capacity ablations are necessary. The revised version will report results averaged over five independent random seeds, together with standard deviations and 95% confidence intervals for every metric. We will also add paired t-tests (or Wilcoxon signed-rank tests where normality assumptions are violated) to establish statistical significance of the observed differences. Finally, we will include a new ablation subsection that repeats the full evaluation pipeline on ResNet-18, ResNet-34, and ResNet-50 backbones, confirming that the advantages of SCL persist across model capacities. revision: yes
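The seed-level analysis promised in this response could look roughly like the sketch below: a paired comparison over five seeds, giving the mean difference, its 95% confidence interval, and the paired t statistic. The scores are made-up placeholders, not results from the paper, and pairing by seed is itself an assumption about how the runs would be matched. The critical values are standard two-sided 95% Student-t quantiles.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_CRIT_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def paired_summary(scores_a, scores_b):
    """Mean paired difference, its 95% CI, and the paired t statistic."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    half_width = T_CRIT_95[n - 1] * std_err
    return {
        "mean_diff": mean_diff,
        "ci_low": mean_diff - half_width,
        "ci_high": mean_diff + half_width,
        "t_stat": mean_diff / std_err,
    }


# PLACEHOLDER faithfulness scores for five seeds (hypothetical, for illustration only).
scl_scores = [0.62, 0.60, 0.65, 0.61, 0.63]
cl_scores = [0.55, 0.56, 0.58, 0.54, 0.57]
summary = paired_summary(scl_scores, cl_scores)
print(summary)
```

If the interval excludes zero, the difference is significant at the 5% level under the normality assumption the t-test needs. With only five seeds that assumption is hard to check, which is presumably why the response also mentions the Wilcoxon signed-rank test as a fallback.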

Circularity Check

0 steps flagged

No circularity: purely empirical comparison of attribution metrics

The paper reports experimental results comparing feature attribution faithfulness, complexity, and continuity for image classifiers trained under Supervised Contrastive Learning versus standard Contrastive Learning. No mathematical derivation, first-principles prediction, parameter fitting, or uniqueness theorem is claimed. The central claim rests on direct metric evaluation across trained models rather than any self-referential reduction or load-bearing self-citation. This matches the default expectation for non-circular empirical work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a purely empirical study that relies on standard machine-learning practices and does not introduce new free parameters, axioms, or postulated entities.


[1] [1]

Advances in neural information processing systems31 (2018)

Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity checks for saliency maps. Advances in neural information processing systems31 (2018)

work page 2018

[2] [2]

In: International conference on machine learning

van Amersfoort, J., Smith, L., Teh, Y.W., Gal, Y.: Uncertainty estimation using a single deep deterministic neural network. In: International conference on machine learning. pp. 9690–9700. PMLR (2020)

work page 2020

[3] [3]

Information Fusion77, 261–295 (2022)

Anders, C.J., Weber, L., Neumann, D., Samek, W., Müller, K.R., Lapuschkin, S.: Finding and removing Clever Hans: Using explanation methods to debug and improve deep models. Information Fusion77, 261–295 (2022)

work page 2022

[4] [4]

Arrighi,L.,BarbonJunior,S.,Pellegrino,F.A.,Simonato,M.,Zullich,M.:Explain- able Automated Anomaly Recognition in Failure Analysis: is Deep Learning Doing it Correctly? In: Explainable Artificial Intelligence. pp. 420–432. Communications in Computer and Information Science (2023)

work page 2023

[5] [5]

In: Image Analysis and Processing - ICIAP 2025 Workshops

Arrighi, L., de Moraes, I.A., Simonato, M., Barbon Junior, S.: Discriminating Short-Term Moisture Changes in Stuffed Pasta Using Deep Computer Vision. In: Image Analysis and Processing - ICIAP 2025 Workshops. pp. 489–496. Springer Nature Switzerland (2026)

work page 2025

[6] [6]

PloS one10(7), e0130140 (2015)

Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one10(7), e0130140 (2015)

work page 2015

[7] [7]

Evaluating and aggre- gating feature-based model explanations.arXiv preprint arXiv:2005.00631, 2020

Bhatt, U., Weller, A., Moura, J.M.: Evaluating and aggregating feature-based model explanations. arXiv preprint arXiv:2005.00631 (2020)

work page arXiv 2005

[8] [8]

In: International Conference on Ma- chine Learning

Chalasani, P., Chen, J., Chowdhury, A.R., Wu, X., Jha, S.: Concise explanations of neural networks using adversarial training. In: International Conference on Ma- chine Learning. pp. 1383–1391. PMLR (2020)

work page 2020

[9]

Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N.: Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE winter conference on applications of computer vision (WACV). pp. 839–847. IEEE (2018)

[10]

Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning (ICML). pp. 1597–1607. PMLR (2020)

[11]

Chen, T., Luo, C., Li, L.: Intriguing properties of contrastive losses. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems. vol. 34, pp. 11834–11845. Curran Associates, Inc. (2021)

[12]

Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05). vol. 1, pp. 539–546. IEEE (2005)

[13]

Deck, L., Schoeffer, J., De-Arteaga, M., Kuehl, N.: A critical survey on fairness benefits of XAI. XAI in Action: Past, Present, and Future Applications (2023)

[14]

Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. pp. 248–255. IEEE (2009)

[15]

Fan, L., Liu, S., Chen, P.Y., Zhang, G., Gan, C.: When does contrastive learning preserve adversarial robustness from pretraining to finetuning? Advances in neural information processing systems 34, 21480–21492 (2021)

[16]

Gao, S., Li, Z.Y., Yang, M.H., Cheng, M.M., Han, J., Torr, P.: Large-scale Unsupervised Semantic Segmentation (2022)

[17]

Ghorbani, A., Abid, A., Zou, J.: Interpretation of neural networks is fragile. In: Proceedings of the AAAI conference on artificial intelligence. vol. 33, pp. 3681–3688 (2019)

[18]

Gildenblat, J., contributors: Pytorch library for cam methods. https://github.com/jacobgil/pytorch-grad-cam (2021)

[19]

Graf, F., Hofer, C., Niethammer, M., Kwitt, R.: Dissecting supervised contrastive learning. In: International Conference on Machine Learning. pp. 3821–3830. PMLR (2021)

[20]

Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. Journal of machine learning research 3(Mar), 1157–1182 (2003)

[21]

He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 9729–9738 (2020)

[22]

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770–778 (2016)

[23]

He, M., Li, B., Sun, S.: A Survey of Class Activation Mapping for the Interpretability of Convolution Neural Networks. In: Signal and Information Processing, Networking and Computers. pp. 399–407. Springer Nature (2023)

[24]

Hedström, A., Weber, L., Krakowczyk, D., Bareeva, D., Motzkus, F., Samek, W., Lapuschkin, S., Höhne, M.M.C.: Quantus: An explainable ai toolkit for responsible evaluation of neural network explanations and beyond. Journal of Machine Learning Research 24(34), 1–11 (2023)

[25]

Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737 (2017)

[26]

Hooker, S., Erhan, D., Kindermans, P.J., Kim, B.: A benchmark for interpretability methods in deep neural networks. Advances in neural information processing systems 32 (2019)

[27]

Jacovi, A., Goldberg, Y.: Towards faithfully interpretable nlp systems: How should we define and evaluate faithfulness? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. pp. 4198–4205 (2020)

[28]

Jeong, M., Hero, A.: Generalizing supervised contrastive learning: A projection perspective. arXiv preprint arXiv:2506.09810 (2025)

[29]

Kaur, H., Nori, H., Jenkins, S., Caruana, R., Wallach, H., Wortman Vaughan, J.: Interpreting interpretability: understanding data scientists’ use of interpretability tools for machine learning. In: Proceedings of the 2020 CHI conference on human factors in computing systems. pp. 1–14 (2020)

[30]

Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., Krishnan, D.: Supervised contrastive learning. Advances in neural information processing systems 33, 18661–18673 (2020)

[31]

Kohlbrenner, M., Bauer, A., Nakajima, S., Binder, A., Samek, W., Lapuschkin, S.: Towards best practice in explaining neural network decisions with lrp. In: 2020 international joint conference on neural networks (IJCNN). pp. 1–7. IEEE (2020)

[32]

Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)

[33]

Longo, L., Brcic, M., Cabitza, F., Choi, J., Confalonieri, R., Del Ser, J., Guidotti, R., Hayashi, Y., Herrera, F., Holzinger, A., et al.: Explainable Artificial Intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions. Information Fusion 106, 102301 (2024)

[34]

Lopes, J.F., da Costa, V.G.T., Barbin, D.F., Cruz-Tirado, L.J.P., Baeten, V., Barbon Junior, S.: Deep computer vision system for cocoa classification. Multimedia Tools and Applications 81(28), 41059–41077 (2022)

[35]

Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. Advances in neural information processing systems 30 (2017)

[36]

Muhammad, M.B., Yeasin, M.: Eigen-cam: Class activation map using principal components. In: 2020 international joint conference on neural networks (IJCNN). pp. 1–7. IEEE (2020)

[37]

Nauta, M., Trienes, J., Pathak, S., Nguyen, E., Peters, M., Schmitt, Y., Schlötterer, J., van Keulen, M., Seifert, C.: From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable ai. ACM Computing Surveys 55(13s), 1–42 (2023)

[38]

Nguyen, A., Yosinski, J., Clune, J.: Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 427–436 (2015)

[39]

Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley, D., Nowozin, S., Dillon, J., Lakshminarayanan, B., Snoek, J.: Can you trust your model’s uncertainty? evaluating predictive uncertainty under dataset shift. Advances in neural information processing systems 32 (2019)

[40]

Paes, L.M., Cruz, R., Calmon, F.P., Diaz, M.: On the inevitability of the rashomon effect. In: 2023 IEEE International Symposium on Information Theory (ISIT). pp. 549–554. IEEE (2023)

[41]

Papyan, V., Han, X., Donoho, D.L.: Prevalence of neural collapse during the terminal phase of deep learning training. Proceedings of the National Academy of Sciences 117(40), 24652–24663 (2020)

[42]

Qian, Z., Huang, K., Wang, Q.F., Zhang, X.Y.: A survey of robust adversarial training in pattern recognition: Fundamental, theory, and methodologies. Pattern Recognition 131, 108889 (2022)

[43]

Rebuffi, S.A., Fong, R., Ji, X., Vedaldi, A.: There and back again: Revisiting backpropagation saliency methods. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8839–8848 (2020)

[44]

Ribeiro, M.T., Singh, S., Guestrin, C.: Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386 (2016)

[45]

Saarela, M., Podgorelec, V.: Recent applications of Explainable AI (XAI): A systematic literature review. Applied Sciences 14(19), 8884 (2024)

[46]

Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 815–823 (2015)

[47]

Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. International Journal of Computer Vision 128(2), 336–359 (2020)

[48]

Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: International conference on machine learning. pp. 3145–3153. PMLR (2017)

[49]

Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: International conference on machine learning. pp. 3319–3328. PMLR (2017)

[50]

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013)

[51]

Tang, D., Chen, J., Ren, L., Wang, X., Li, D., Zhang, H.: Reviewing CAM-Based Deep Explainable Methods in Healthcare. Applied Sciences 14(10) (2024)

[52]

Turpin, M., Michael, J., Perez, E., Bowman, S.: Language models don’t always say what they think: Unfaithful explanations in chain-of-thought prompting. Advances in Neural Information Processing Systems 36, 74952–74965 (2023)

[53]

van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager, N., Gouillart, E., Yu, T.: scikit-image: image processing in python. PeerJ 2, e453 (2014)

[54]

Wang, H., Wang, Z., Du, M., Yang, F., Zhang, Z., Ding, S., Mardziel, P., Hu, X.: Score-cam: Score-weighted visual explanations for convolutional neural networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops. pp. 24–25 (2020)

[55]

Wang, Z., Fan, J., Nguyen, T., Ji, H., Liu, G.: Variational supervised contrastive learning. arXiv preprint arXiv:2506.07413 (2025)

[56]

Wilfling, J., Valdenegro-Toro, M., Zullich, M.: Evaluating the Quality of Saliency Maps for Distilled Convolutional Neural Networks. In: 32nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2024)

[57]

Zech, J.R., Badgeley, M.A., Liu, M., Costa, A.B., Titano, J.J., Oermann, E.K.: Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS medicine 15(11), e1002683 (2018)

[58]

Zhang, J., Bargal, S.A., Lin, Z., Brandt, J., Shen, X., Sclaroff, S.: Top-down neural attention by excitation backprop. International Journal of Computer Vision 126(10), 1084–1102 (2018)

[59]

Zhang, Z., Jang, J., Trabelsi, C., Li, R., Sanner, S., Jeong, Y., Shim, D.: Excon: Explanation-driven supervised contrastive learning for image classification. arXiv preprint arXiv:2111.14271 (2021)

[60]

Zheng, X., Shirani, F., Chen, Z., Lin, C., Cheng, W., Guo, W., Luo, D.: F-fidelity: A robust framework for faithfulness evaluation of explainable AI. In: The Thirteenth International Conference on Learning Representations (2025)

[61]

Zhou, W., Liu, F., Chen, M.: Contrastive Out-of-Distribution Detection for Pretrained Transformers. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. pp. 1100–1111 (2021)