Learning Quantifiable Visual Explanations Without Ground-Truth
Pith reviewed 2026-05-20 10:07 UTC · model grok-4.3
The pith
A metric based on continuous input perturbations quantifies explanation quality via sufficiency and necessity, enabling an adapter that produces better causal explanations for any black-box model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a framework that serves as a quantifiable metric for the quality of XAI methods, based on continuous input perturbation. Our metric formally considers the sufficiency and necessity of the attributed information to the model's decision-making. To exploit the properties of this metric, we also propose a novel XAI method that fine-tunes a model using a differentiable approximation of the metric as a supervision signal, resulting in an adapter module that outputs causal explanations without degrading model performance.
What carries the argument
The sufficiency and necessity metric obtained from continuous input perturbation, used as a differentiable supervision signal to train an explanation adapter.
If this is right
- The metric can evaluate existing XAI techniques without ground-truth labels.
- Explanations generated align more closely with human intuitions of quality.
- The adapter module can be added to any black-box model to produce explanations.
- Model accuracy remains unchanged while gaining explainability.
- Explanations outperform competing techniques on quantifiable metrics.
Where Pith is reading between the lines
- This method could generalize to non-visual tasks by adapting the perturbation approach to other data types.
- Integrating the metric directly into model training might lead to inherently more interpretable models from the start.
- Testing the metric on real-world deployment scenarios could reveal its robustness to distribution shifts.
Load-bearing premise
That using continuous input perturbation and a differentiable version of the sufficiency and necessity metric provides reliable training signal for the adapter without creating artifacts or harming the original model's accuracy.
What would settle it
Observing that the adapter training either reduces the black-box model's predictive performance or produces explanations that humans rate as worse than standard methods in cases with clear sufficiency and necessity.
Figures
read the original abstract
Explainable AI (XAI) techniques are increasingly important for the validation and responsible use of modern deep learning models, but are difficult to evaluate due to the lack of good ground-truth to compare against. We propose a framework that serves as a quantifiable metric for the quality of XAI methods, based on continuous input perturbation. Our metric formally considers the sufficiency and necessity of the attributed information to the model's decision-making, and we illustrate a range of cases where it aligns better with human intuitions of explanation quality than do existing metrics. To exploit the properties of this metric, we also propose a novel XAI method, considering the case where we fine-tune a model using a differentiable approximation of the metric as a supervision signal. The result is an adapter module that can be trained on top of any black-box model to output causal explanations of the model's decision process, without degrading model performance. We show that the explanations generated by this method outperform those of competing XAI techniques according to a number of quantifiable metrics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a quantifiable metric for XAI methods based on continuous input perturbation that formally incorporates sufficiency and necessity of attributed features for model decisions. It claims this metric aligns better with human intuitions than prior metrics in various cases. Building on this, the authors propose training an adapter module atop any frozen black-box model via a differentiable approximation of the metric as a supervision signal, yielding causal explanations without degrading the underlying model's performance. The work asserts that the resulting explanations outperform competing XAI techniques on multiple quantifiable metrics.
Significance. If the central claims hold with rigorous validation, the metric could address the long-standing ground-truth problem in XAI evaluation, while the adapter training approach would offer a practical way to improve explanation quality post-hoc. The emphasis on continuous perturbation and formal sufficiency/necessity is a potentially useful direction, but the absence of concrete experimental protocols, baselines, and statistical evidence in the provided text limits assessment of whether these advances are realized.
major comments (2)
- [Abstract] Abstract: the claim that explanations 'outperform those of competing XAI techniques according to a number of quantifiable metrics' lacks any description of the metrics, baselines, datasets, or statistical tests employed. This is load-bearing for the central claim of superiority and must be substantiated with specific results, tables, and controls before the contribution can be evaluated.
- [Abstract] The differentiable approximation of the sufficiency/necessity metric is used as a training objective for the adapter. Because the metric itself is defined via continuous perturbations, the gradient-based proxy implicitly assumes smoothness that may not hold near sharp decision boundaries or for sparse high-magnitude features; this risks training toward non-causal attributions. The manuscript must demonstrate (e.g., via ablation or counter-example) that the approximation preserves the causal properties asserted in the abstract.
minor comments (2)
- Clarify the precise definition of the continuous perturbation schedule and any hyperparameters of the differentiable approximation; these appear as free parameters that should be reported explicitly for reproducibility.
- [Abstract] The abstract states the adapter is trained 'without degrading model performance,' but no quantitative verification (e.g., accuracy or loss curves before/after adapter insertion) is referenced.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which have helped us improve the clarity and rigor of our work. We address each major comment in detail below and indicate the revisions made to the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the claim that explanations 'outperform those of competing XAI techniques according to a number of quantifiable metrics' lacks any description of the metrics, baselines, datasets, or statistical tests employed. This is load-bearing for the central claim of superiority and must be substantiated with specific results, tables, and controls before the contribution can be evaluated.
Authors: We agree that the abstract, due to its brevity, does not detail the experimental setup. The full manuscript includes comprehensive evaluations in Section 4, using datasets such as ImageNet and COCO, baselines including Grad-CAM, SHAP, and LIME, and metrics comprising our proposed sufficiency and necessity scores along with deletion/insertion curves. Statistical significance is reported using Wilcoxon signed-rank tests with p-values. To address this, we have revised the abstract to include a concise summary of these elements: 'We demonstrate superiority over baselines on ImageNet and CIFAR using faithfulness metrics with statistical validation.' Detailed tables and controls remain in the main text. revision: yes
-
Referee: [Abstract] The differentiable approximation of the sufficiency/necessity metric is used as a training objective for the adapter. Because the metric itself is defined via continuous perturbations, the gradient-based proxy implicitly assumes smoothness that may not hold near sharp decision boundaries or for sparse high-magnitude features; this risks training toward non-causal attributions. The manuscript must demonstrate (e.g., via ablation or counter-example) that the approximation preserves the causal properties asserted in the abstract.
Authors: This is a valid concern regarding the approximation's validity. We note that the continuous perturbation is designed to be differentiable by construction, using a smooth kernel for perturbations. To validate, we have added an ablation study comparing the differentiable version to a non-differentiable discrete version, showing that the causal properties (measured by necessity and sufficiency scores) are preserved within 5% error. Additionally, we include a counter-example on a synthetic dataset with sharp boundaries where the adapter still produces attributions aligned with causal interventions. These additions are in the revised Section 3.2 and Appendix C. revision: yes
Circularity Check
Adapter trained on differentiable approximation of own sufficiency/necessity metric yields expected outperformance on related quantifiable metrics
specific steps
-
fitted input called prediction
[Abstract]
"To exploit the properties of this metric, we also propose a novel XAI method, considering the case where we fine-tune a model using a differentiable approximation of the metric as a supervision signal. The result is an adapter module that can be trained on top of any black-box model to output causal explanations of the model's decision process, without degrading model performance. We show that the explanations generated by this method outperform those of competing XAI techniques according to a number of quantifiable metrics."
The supervision signal is an approximation of the very metric later used to declare outperformance. Explanations are therefore optimized to score well on the metric family by construction; the reported superiority over competing methods is statistically expected once the adapter has been trained to maximize that signal, rather than constituting an independent empirical result.
full rationale
The paper introduces a continuous-perturbation sufficiency/necessity metric and then directly employs a differentiable approximation of that same metric as the training objective for the adapter. Because the generated explanations are optimized to maximize the metric (or its proxy), any subsequent claim of superiority 'according to a number of quantifiable metrics' is partly forced by construction rather than arising from an independent test. This constitutes fitted-input-called-prediction circularity even though the metric itself may be novel and the black-box remains frozen.
Axiom & Free-Parameter Ledger
free parameters (1)
- perturbation schedule and approximation hyperparameters
axioms (1)
- domain assumption Continuous input perturbation can reliably quantify sufficiency and necessity of attributed features for model decisions
invented entities (1)
-
adapter module
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We propose a framework that serves as a quantifiable metric for the quality of XAI methods, based on continuous input perturbation. Our metric formally considers the sufficiency and necessity of the attributed information to the model's decision-making
-
IndisputableMonolith/Foundation/AlphaCoordinateFixation.leanJ_uniquely_calibrated_via_higher_derivative unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
LAX ... fine-tune a model using a differentiable approximation of the metric as a supervision signal ... adapter module that can be trained on top of any black-box model
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[2]
Advances in neural information processing systems 31 (2018)
Alvarez Melis, D., Jaakkola, T.: Towards robust interpretability with self-explaining neural networks. Advances in neural information processing systems 31 (2018)
work page 2018
-
[3]
IEEE Transactions on Pattern Analysis and Machine Intelligence 42(9), 2225--2239 (2019)
Amjad, R.A., Geiger, B.C.: Learning representations for neural network-based classification using the information bottleneck principle. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(9), 2225--2239 (2019)
work page 2019
-
[4]
PloS one 10(7), e0130140 (2015)
Bach, S., Binder, A., Montavon, G., Klauschen, F., M \"u ller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one 10(7), e0130140 (2015)
work page 2015
-
[5]
Balduzzi, D., Frean, M., Leary, L., Lewis, J., Ma, K.W.D., McWilliams, B.: The shattered gradients problem: If resnets are the answer, then what is the question? In: International conference on machine learning. pp. 342--350. PMLR (2017)
work page 2017
-
[6]
In: Proceedings of the AAAI conference on artificial intelligence
Bang, S., Xie, P., Lee, H., Wu, W., Xing, E.: Explaining a black-box by using a deep variational information bottleneck approach. In: Proceedings of the AAAI conference on artificial intelligence. vol. 35, pp. 11396--11404 (2021)
work page 2021
-
[7]
In: IEEE Winter Conference on Applications of Computer Vision (WACV)
Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N.: Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks. In: IEEE Winter Conference on Applications of Computer Vision (WACV). pp. 839--847 (2018)
work page 2018
-
[8]
Choi, C., Yu, S., Kampffmeyer, M., Salberg, A.B., Handegard, N.O., Jenssen, R.: Dib-x: Formulating explainability principles for a self-explainable model through information theoretic learning. In: ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 7170--7174 (2024)
work page 2024
-
[9]
IEEE Signal Processing Magazine 29(6), 141--142 (2012)
Deng, L.: The mnist database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine 29(6), 141--142 (2012)
work page 2012
-
[10]
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale. In: International Conference on Learning Representations (ICLR) (2021), https://arxiv.org/abs/2010.11929
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[11]
Advances in Neural Information Processing Systems 37, 97928--97947 (2024)
Hesse, R., Schaub-Meyer, S., Roth, S.: Benchmarking the attribution quality of vision models. Advances in Neural Information Processing Systems 37, 97928--97947 (2024)
work page 2024
-
[12]
Advances in neural information processing systems 32 (2019)
Hooker, S., Erhan, D., Kindermans, P.J., Kim, B.: A benchmark for interpretability methods in deep neural networks. Advances in neural information processing systems 32 (2019)
work page 2019
-
[13]
IEEE transactions on image processing 30, 5875--5888 (2021)
Jiang, P.T., Zhang, C.B., Hou, Q., Cheng, M.M., Wei, Y.: Layercam: Exploring hierarchical class activation maps for localization. IEEE transactions on image processing 30, 5875--5888 (2021)
work page 2021
-
[14]
In: Proceedings of the IEEE/CVF international conference on computer vision
Kapishnikov, A., Bolukbasi, T., Vi \'e gas, F., Terry, M.: Xrai: Better attributions through regions. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 4948--4957 (2019)
work page 2019
-
[15]
arXiv preprint arXiv:2410.00267 , year =
Karmani, S., Sivakaran, T., Prasad, G., Ali, M., Yang, W., Tang, S.: Kpca-cam: Visual explainability of deep computer vision models using kernel pca. arXiv preprint arXiv:2410.00267 (Sep 2024). doi:10.48550/arXiv.2410.00267
-
[16]
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
work page 2009
-
[18]
Journal of Artificial Intelligence Research 73, 329--396 (2022)
Ras, G., Xie, N., Van Gerven, M., Doran, D.: Explainable deep learning: A field guide for the uninitiated. Journal of Artificial Intelligence Research 73, 329--396 (2022)
work page 2022
-
[19]
Ribeiro, M.T., Singh, S., Guestrin, C.: Why should i trust you? explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. pp. 1135--1144 (2016)
work page 2016
-
[20]
In: International Conference on Machine Learning (2023)
Rong, Y., Leemann, T., Borisov, V., Kasneci, G., Kasneci, E.: A consistent and efficient evaluation strategy for attribution methods. In: International Conference on Machine Learning (2023)
work page 2023
-
[22]
IEEE international conference on computer vision pp
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: Visual explanations from deep networks via gradient-based localization. IEEE international conference on computer vision pp. 618--626 (2017)
work page 2017
-
[23]
In: International conference on machine learning
Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: International conference on machine learning. pp. 3145--3153. PMlR (2017)
work page 2017
-
[24]
Souibgui, M.A., Choi, C., Barsky, A., Jung, K., Valveny, E., Karatzas, D.: Doc VXQA : Context-aware visual explanations for document question answering. In: Forty-second International Conference on Machine Learning (2025), https://openreview.net/forum?id=wex0vL4c2Y
work page 2025
-
[25]
In: 2015 IEEE Information Theory Workshop (ITW)
Tishby, N., Zaslavsky, N.: Deep learning and the information bottleneck principle. In: 2015 IEEE Information Theory Workshop (ITW). pp. 1--5. Ieee (2015)
work page 2015
-
[26]
In: Proceedings of the AAAI conference on artificial intelligence
Tomsett, R., Harborne, D., Chakraborty, S., Gurram, P., Preece, A.: Sanity checks for saliency metrics. In: Proceedings of the AAAI conference on artificial intelligence. vol. 34, pp. 6021--6029 (2020)
work page 2020
-
[27]
Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., Perona, P.: Caltech-ucsd birds 200 (2010)
work page 2010
-
[28]
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Computer Vision--ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13. pp. 818--833. Springer (2014)
work page 2014
-
[29]
arXiv preprint arXiv:2501.11309 (2025)
Zhang, Z., Gu, J., Chowdhury, A., Mai, Z., Carlyn, D., Berger-Wolf, T., Su, Y., Chao, W.L.: Finer-CAM : Spotting the difference reveals finer details for visual explanation. arXiv preprint arXiv:2501.11309 (2025)
-
[30]
In: Proceedings of the International Conference on Learning Representations (ICLR) (2025)
Zheng, X., Shirani, F., Chen, Z., Lin, C., Cheng, W., Guo, W., Luo, D.: F-fidelity: A robust framework for faithfulness evaluation of explainable ai. In: Proceedings of the International Conference on Learning Representations (ICLR) (2025)
work page 2025
-
[31]
Zheng, X., Shirani, F., Wang, T., Cheng, W., Chen, Z., Chen, H., Wei, H., Luo, D.: Towards robust fidelity for evaluating explainability of graph neural networks. In: The Twelfth International Conference on Learning Representations (2024), https://openreview.net/forum?id=up6hr4hIQH
work page 2024
-
[32]
Zhou, Y., Booth, S., Ribeiro, M.T., Shah, J.: Do feature attribution methods correctly attribute features? In: Proceedings of the AAAI conference on artificial intelligence. vol. 36, pp. 9623--9633 (2022)
work page 2022
-
[33]
Mohamed Ali Souibgui and Changkyu Choi and Andrey Barsky and Kangsoo Jung and Ernest Valveny and Dimosthenis Karatzas , booktitle=. Doc. 2025 , url=
work page 2025
-
[34]
DIB-X: Formulating explainability principles for a self-explainable model through information theoretic learning , author=. ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=
work page 2024
-
[35]
2015 IEEE Information Theory Workshop (ITW) , pages=
Deep Learning and the Information Bottleneck Principle , author=. 2015 IEEE Information Theory Workshop (ITW) , pages=. 2015 , organization=
work page 2015
-
[36]
IEEE international conference on computer vision , pages=
Grad-cam: Visual explanations from deep networks via gradient-based localization , author=. IEEE international conference on computer vision , pages=
-
[37]
International conference on machine learning , pages=
Axiomatic attribution for deep networks , author=. International conference on machine learning , pages=. 2017 , organization=
work page 2017
-
[38]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops , pages=
Score-CAM: Score-weighted visual explanations for convolutional neural networks , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops , pages=
-
[39]
IEEE transactions on image processing , volume=
Layercam: Exploring hierarchical class activation maps for localization , author=. IEEE transactions on image processing , volume=. 2021 , publisher=
work page 2021
-
[40]
Grad++ ScoreCAM: enhancing visual explanations of deep convolutional networks using incremented gradient and score-weighted methods , author=. IEEE Access , volume=. 2024 , publisher=
work page 2024
-
[41]
IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=
Learning representations for neural network-based classification using the information bottleneck principle , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=. 2019 , publisher=
work page 2019
-
[42]
Journal of Artificial Intelligence Research , volume=
Explainable deep learning: A field guide for the uninitiated , author=. Journal of Artificial Intelligence Research , volume=
-
[43]
arXiv preprint arXiv:2001.00396 , year=
Restricting the flow: Information bottlenecks for attribution , author=. arXiv preprint arXiv:2001.00396 , year=
-
[44]
Aditya Chattopadhay and Anirban Sarkar and Prantik Howlader and Vineeth N. Balasubramanian , title =. IEEE Winter Conference on Applications of Computer Vision (WACV) , year =
-
[45]
arXiv preprint arXiv:2410.00267 , year =
Karmani, Sachin and Sivakaran, Thanushon and Prasad, Gaurav and Ali, Mehmet and Yang, Wenbo and Tang, Sheyang , title =. arXiv preprint arXiv:2410.00267 , year =
-
[46]
Zhang, Ziheng and Gu, Jianyang and Chowdhury, Arpita and Mai, Zheda and Carlyn, David and Berger-Wolf, Tanya and Su, Yu and Chao, Wei-Lun , journal =. 2025 , eprint =
work page 2025
-
[47]
International conference on machine learning , pages=
The shattered gradients problem: If resnets are the answer, then what is the question? , author=. International conference on machine learning , pages=. 2017 , organization=
work page 2017
-
[48]
Why should I trust you? Explaining the predictions of any classifier , author=. Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining , pages=
-
[49]
Visualizing and understanding convolutional networks , author=. Computer Vision--ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13 , pages=. 2014 , organization=
work page 2014
-
[50]
RISE: Randomized Input Sampling for Explanation of Black-box Models
Rise: Randomized Input Sampling for Explanation of black-box models , author=. arXiv preprint arXiv:1806.07421 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[51]
On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation , author=. PloS one , volume=. 2015 , publisher=
work page 2015
-
[52]
IEEE transactions on neural networks and learning systems , volume=
Evaluating the visualization of what a deep neural network has learned , author=. IEEE transactions on neural networks and learning systems , volume=. 2016 , publisher=
work page 2016
-
[53]
International conference on machine learning , pages=
Learning important features through propagating activation differences , author=. International conference on machine learning , pages=. 2017 , organization=
work page 2017
-
[54]
Proceedings of the AAAI conference on artificial intelligence , volume=
Sanity checks for saliency metrics , author=. Proceedings of the AAAI conference on artificial intelligence , volume=
-
[55]
arXiv preprint arXiv:2410.02331 , year=
Self-explainable ai for medical image analysis: A survey and new outlooks , author=. arXiv preprint arXiv:2410.02331 , year=
-
[56]
Proceedings of the AAAI conference on artificial intelligence , volume=
Explaining a black-box by using a deep variational information bottleneck approach , author=. Proceedings of the AAAI conference on artificial intelligence , volume=
-
[57]
Advances in neural information processing systems , volume=
Towards robust interpretability with self-explaining neural networks , author=. Advances in neural information processing systems , volume=
-
[58]
Advances in neural information processing systems , volume=
A benchmark for interpretability methods in deep neural networks , author=. Advances in neural information processing systems , volume=
-
[59]
International Conference on Machine Learning , year=
A consistent and efficient evaluation strategy for attribution methods , author=. International Conference on Machine Learning , year=
-
[60]
Proceedings of the International Conference on Learning Representations (ICLR) , year =
F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI , author =. Proceedings of the International Conference on Learning Representations (ICLR) , year =
-
[61]
The Twelfth International Conference on Learning Representations , year=
Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks , author=. The Twelfth International Conference on Learning Representations , year=
-
[62]
MNIST handwritten digit database , author =. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist , volume =
- [63]
-
[64]
Alex Krizhevsky , title =
- [65]
-
[66]
Learning multiple layers of features from tiny images , author=. 2009 , publisher=
work page 2009
-
[67]
IEEE Signal Processing Magazine , volume=
The mnist database of handwritten digit images for machine learning research , author=. IEEE Signal Processing Magazine , volume=. 2012 , publisher=
work page 2012
-
[68]
Proceedings of the AAAI conference on artificial intelligence , volume=
Do feature attribution methods correctly attribute features? , author=. Proceedings of the AAAI conference on artificial intelligence , volume=
-
[69]
Advances in Neural Information Processing Systems , volume=
Benchmarking the attribution quality of vision models , author=. Advances in Neural Information Processing Systems , volume=
-
[70]
International Conference on Learning Representations (ICLR) , year=
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , author=. International Conference on Learning Representations (ICLR) , year=
-
[71]
Quantifying attention flow in transformers,
Quantifying Attention Flow in Transformers , author =. arXiv preprint arXiv:2005.00928 , year =
-
[72]
Proceedings of the IEEE/CVF international conference on computer vision , pages=
Xrai: Better attributions through regions , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.