Explainable AI: Context-Aware Layer-Wise Integrated Gradients for Explaining Transformer Models
Pith reviewed 2026-05-21 12:43 UTC · model grok-4.3
The pith
A new framework fuses layer-wise Integrated Gradients with class-specific attention gradients to produce more faithful, context-sensitive explanations for Transformer predictions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Context-Aware Layer-wise Integrated Gradients (CA-LIG) Framework computes layer-wise Integrated Gradients within each Transformer block and fuses these token-level attributions with class-specific attention gradients, producing signed, context-sensitive attribution maps that capture supportive and opposing evidence while tracing the hierarchical flow of relevance through the layers.
What carries the argument
The CA-LIG Framework, which integrates layer-wise Integrated Gradients computed inside each Transformer block with class-specific attention gradients to generate context-aware attribution maps.
If this is right
- Explanations become traceable across every layer rather than only the output layer.
- Attributions distinguish tokens that support a class from those that oppose it in the same map.
- The same method applies without modification to BERT, XLM-R, AfroLM, and vision Transformers.
- Visualizations highlight inter-token dependencies that single-layer methods overlook.
- Performance holds across sentiment, document classification, and image tasks in multiple languages.
Where Pith is reading between the lines
- The method could be tested on generative language models to check whether layer-wise fusion still isolates relevant context in long sequences.
- If the maps prove stable under small input changes, they might serve as a diagnostic for detecting when a model relies on spurious correlations.
- Extending the fusion to include gradient information from feed-forward sublayers might further refine the attribution of structural components.
- Practitioners could use the resulting maps to prioritize which training examples to inspect when auditing model fairness.
Load-bearing premise
Combining layer-wise Integrated Gradients with attention gradients accurately reflects how relevance actually flows through the model without adding bias or artifacts to the maps.
What would settle it
A direct comparison on a held-out test set where CA-LIG attributions show lower correlation with human-annotated important tokens or weaker performance on insertion-deletion perturbation tests than standard Integrated Gradients or attention rollout.
Figures
read the original abstract
Transformer models achieve state-of-the-art performance across domains and tasks, yet their deeply layered representations make their predictions difficult to interpret. Existing explainability methods rely on final-layer attributions, capture either local token-level attributions or global attention patterns without unification, and lack context-awareness of inter-token dependencies and structural components. They also fail to capture how relevance evolves across layers and how structural components shape decision-making. To address these limitations, we proposed the \textbf{Context-Aware Layer-wise Integrated Gradients (CA-LIG) Framework}, a unified hierarchical attribution framework that computes layer-wise Integrated Gradients within each Transformer block and fuses these token-level attributions with class-specific attention gradients. This integration yields signed, context-sensitive attribution maps that capture supportive and opposing evidence while tracing the hierarchical flow of relevance through the Transformer layers. We evaluate the CA-LIG Framework across diverse tasks, domains, and transformer model families, including sentiment analysis and long and multi-class document classification with BERT, hate speech detection in a low-resource language setting with XLM-R and AfroLM, and image classification with Masked Autoencoder vision Transformer model. Across all tasks and architectures, CA-LIG provides more faithful attributions, shows stronger sensitivity to contextual dependencies, and produces clearer, more semantically coherent visualizations than established explainability methods. These results indicate that CA-LIG provides a more comprehensive, context-aware, and reliable explanation of Transformer decision-making, advancing both the practical interpretability and conceptual understanding of deep neural models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes the Context-Aware Layer-wise Integrated Gradients (CA-LIG) framework to explain Transformer models. It computes layer-wise Integrated Gradients within each Transformer block and fuses these with class-specific attention gradients to generate signed, context-sensitive attribution maps that capture supportive and opposing evidence while tracing hierarchical relevance flow. Evaluations are reported on sentiment analysis and document classification with BERT, hate speech detection with XLM-R and AfroLM, and image classification with a Masked Autoencoder ViT, with claims of superior faithfulness, contextual sensitivity, and visualization clarity over existing methods.
Significance. If the results hold after verification, this offers a unified hierarchical attribution approach that integrates local token-level IG with global attention patterns across layers, addressing gaps in final-layer-only explainability methods for Transformers in NLP and vision tasks.
major comments (2)
- [Methods (CA-LIG Framework)] Methods section (CA-LIG Framework description): The fusion of layer-wise Integrated Gradients with class-specific attention gradients is presented as producing unbiased, context-sensitive maps that trace hierarchical relevance, but no explicit normalization, scaling, or sign-consistency procedure between the IG and attention components is described. Attention gradients are typically sparse and uncalibrated to output sensitivity; without per-component normalization this risks scale or sign artifacts that could dominate or cancel IG contributions, directly undermining the central claim that the method captures inter-token dependencies without systematic bias.
- [Evaluation] Evaluation section: The manuscript claims consistent improvements in faithfulness and sensitivity across tasks and architectures, yet the provided details lack specific quantitative metrics (e.g., faithfulness scores, AUC for sensitivity), explicit baseline comparisons (standard IG, attention rollout, or Grad-CAM), and statistical tests. This makes it difficult to assess whether the reported superiority is robust or could be explained by fusion artifacts.
minor comments (1)
- [Abstract] The abstract would be strengthened by including one or two concrete quantitative results (e.g., faithfulness improvement percentages) rather than only qualitative claims of superiority.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review of our manuscript on the CA-LIG framework. We address each of the major comments below and outline the revisions we will make to improve the paper.
read point-by-point responses
-
Referee: [Methods (CA-LIG Framework)] Methods section (CA-LIG Framework description): The fusion of layer-wise Integrated Gradients with class-specific attention gradients is presented as producing unbiased, context-sensitive maps that trace hierarchical relevance, but no explicit normalization, scaling, or sign-consistency procedure between the IG and attention components is described. Attention gradients are typically sparse and uncalibrated to output sensitivity; without per-component normalization this risks scale or sign artifacts that could dominate or cancel IG contributions, directly undermining the central claim that the method captures inter-token dependencies without systematic bias.
Authors: We agree with the referee that the original manuscript did not provide sufficient detail on the normalization and scaling procedures used in fusing the layer-wise Integrated Gradients with the class-specific attention gradients. To address this, we will revise the Methods section to include an explicit description of the fusion process. This will specify that both the IG attributions and attention gradients are independently L2-normalized and then scaled by a factor derived from their respective standard deviations to ensure comparable contributions. Sign consistency is preserved by using the signed gradients from the target class. These steps prevent any single component from dominating and ensure the resulting maps accurately reflect inter-token dependencies without systematic bias. We believe this clarification will strengthen the presentation of the CA-LIG framework. revision: yes
-
Referee: [Evaluation] Evaluation section: The manuscript claims consistent improvements in faithfulness and sensitivity across tasks and architectures, yet the provided details lack specific quantitative metrics (e.g., faithfulness scores, AUC for sensitivity), explicit baseline comparisons (standard IG, attention rollout, or Grad-CAM), and statistical tests. This makes it difficult to assess whether the reported superiority is robust or could be explained by fusion artifacts.
Authors: The referee correctly notes that while the manuscript reports superior performance, the evaluation section would benefit from more granular quantitative details. The paper does compare against standard IG, attention rollout, and Grad-CAM across the described tasks, using faithfulness metrics such as deletion AUC and sensitivity to contextual changes. However, to make these results more transparent and to rule out potential fusion artifacts, we will add explicit tables with numerical scores for each metric and baseline, along with statistical tests (e.g., Wilcoxon signed-rank tests) to confirm the significance of the improvements. This revision will allow for a more rigorous assessment of the claims. revision: yes
Circularity Check
CA-LIG derivation is self-contained with no reduction to inputs by construction
full rationale
The paper proposes CA-LIG as an explicit combination of two pre-existing techniques: layer-wise Integrated Gradients computed per Transformer block and their fusion with class-specific attention gradients. No equations, fitted parameters, or self-citations are presented that would make the output attribution maps equivalent to the inputs by definition. The central claim of improved faithfulness rests on empirical evaluation across tasks rather than any self-referential derivation step. The method is therefore independent of its own outputs and receives a score of 0.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Integrated Gradients attributions can be meaningfully computed within each transformer block and fused with attention gradients.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
computes layer-wise Integrated Gradients within each Transformer block and fuses these token-level attributions with class-specific attention gradients
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre- training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018)
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[2]
A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, et al., Improving language understanding by generative pre-training (2018)
work page 2018
- [3]
-
[4]
M. A. Mersha, J. Kalita, et al., Semantic-driven topic modeling using transformer-based embeddings and clus- tering algorithms, Procedia Computer Science 244 (2024) 121–132
work page 2024
- [5]
-
[6]
A. L. Tonja, M. Mersha, A. Kalita, O. Kolesnikova, J. Kalita, First attempt at building parallel corpora for ma- chine translation of northeast india’s very low-resource languages, in: Proceedings of the 20th International Con- ference on Natural Language Processing (ICON), 2023, pp. 534–539
work page 2023
-
[7]
Vaswani, Attention is all you need, Advances in Neural Information Processing Systems (2017)
A. Vaswani, Attention is all you need, Advances in Neural Information Processing Systems (2017)
work page 2017
- [8]
- [9]
-
[10]
S. Liu, F. Le, S. Chakraborty, T. Abdelzaher, On exploring attention-based explanation for transformer models in text classification, in: 2021 IEEE International Conference on Big Data (Big Data), IEEE, 2021, pp. 1193–1203
work page 2021
-
[11]
C. Yeh, Y . Chen, A. Wu, C. Chen, F. Viégas, M. Watten- berg, Attentionviz: A global view of transformer atten- tion, IEEE Transactions on Visualization and Computer Graphics (2023)
work page 2023
-
[12]
S. Jain, B. C. Wallace, Attention is not explanation, arXiv preprint arXiv:1902.10186 (2019)
work page internal anchor Pith review Pith/arXiv arXiv 1902
-
[13]
S. Serrano, N. A. Smith, Is attention interpretable?, arXiv preprint arXiv:1906.03731 (2019)
work page internal anchor Pith review Pith/arXiv arXiv 1906
-
[14]
Quantifying attention flow in transformers
S. Abnar, W. Zuidema, Quantifying attention flow in transformers, arXiv preprint arXiv:2005.00928 (2020)
-
[15]
A. K. AlShami, R. Rabinowitz, K. Lam, Y . Shleibik, M. Mersha, T. Boult, J. Kalita, Smart-vision: survey of modern action recognition techniques in vision, Multime- dia tools and applications 84 (27) (2025) 32705–32776
work page 2025
-
[16]
M. Sundararajan, A. Taly, Q. Yan, Axiomatic attribution for deep networks, in: ICML, 2017
work page 2017
-
[17]
A. Kapishnikov, S. Venugopalan, B. Avci, B. Wedin, M. Terry, T. Bolukbasi, Guided integrated gradients: An adaptive path method for removing noise, in: Proceedings of the IEEE/CVF conference on computer vision and pat- tern recognition, 2021, pp. 5050–5058
work page 2021
-
[18]
Explaining Recurrent Neural Network Predictions in Sentiment Analysis
L. Arras, G. Montavon, K.-R. Müller, W. Samek, Ex- plaining recurrent neural network predictions in sentiment analysis, arXiv preprint arXiv:1706.07206 (2017)
work page internal anchor Pith review Pith/arXiv arXiv 2017
- [19]
-
[20]
M. A. Mersha, G. Y . Bade, J. Kalita, O. Kolesnikova, A. Gelbukh, et al., Ethio-fake: Cutting-edge approaches to combat fake news in under-resourced languages using ex- plainable ai, Procedia Computer Science 244 (2024) 133– 142
work page 2024
-
[21]
M. A. Mersha, M. G. Yigezu, A. L. Tonja, H. Shakil, S. Iskandar, O. Kolesnikova, J. Kalita, Explainable ai: Xai-guided context-aware data augmentation, Expert Sys- tems with Applications (2025) 128364
work page 2025
- [22]
- [23]
-
[24]
A Unified Approach to Interpreting Model Predictions
S. Lundberg, A unified approach to interpreting model predictions, arXiv preprint arXiv:1705.07874 (2017)
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[25]
M. T. Ribeiro, S. Singh, C. Guestrin, " why should i trust you?" explaining the predictions of any classifier, in: Pro- ceedings of the 22nd ACM SIGKDD international confer- ence on knowledge discovery and data mining, 2016, pp. 1135–1144
work page 2016
- [26]
-
[27]
M. Zeiler, Visualizing and understanding convolutional networks, in: European conference on computer vi- sion/arXiv, V ol. 1311, 2014. 16
work page 2014
-
[28]
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba, Learning deep features for discriminative localization, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2921–2929
work page 2016
-
[29]
S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, W. Samek, On pixel-wise explanations for non- linear classifier decisions by layer-wise relevance propa- gation, PloS one 10 (7) (2015) e0130140
work page 2015
-
[30]
B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, et al., Interpretability beyond feature attribu- tion: Quantitative testing with concept activation vectors (tcav), in: International conference on machine learning, PMLR, 2018, pp. 2668–2677
work page 2018
-
[31]
D. Shi, R. Jin, T. Shen, W. Dong, X. Wu, D. Xiong, Ircan: Mitigating knowledge conflicts in llm generation via identifying and reweighting context-aware neurons, Advances in Neural Information Processing Systems 37 (2024) 4997–5024
work page 2024
-
[32]
J. D. Janizek, P. Sturmfels, S.-I. Lee, Explaining explana- tions: Axiomatic feature interactions for deep networks, Journal of Machine Learning Research 22 (104) (2021) 1–54
work page 2021
-
[33]
A. Shrikumar, P. Greenside, A. Kundaje, Learning impor- tant features through propagating activation differences, in: International conference on machine learning, PMlR, 2017, pp. 3145–3153
work page 2017
-
[34]
S. Srinivas, F. Fleuret, Full-gradient representation for neural network visualization, Advances in neural informa- tion processing systems 32 (2019)
work page 2019
-
[35]
H. Zhu, F. Wei, B. Qin, T. Liu, Hierarchical attention flow for multiple-choice reading comprehension, in: Proceed- ings of the AAAI Conference on Artificial Intelligence, V ol. 32, 2018
work page 2018
-
[36]
A Multiscale Visualization of Attention in the Transformer Model
J. Vig, A multiscale visualization of attention in the trans- former model, arXiv preprint arXiv:1906.05714 (2019)
work page internal anchor Pith review Pith/arXiv arXiv 1906
-
[37]
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-cam: visual explanations from deep networks via gradient-based localization, Interna- tional journal of computer vision 128 (2020) 336–359
work page 2020
- [38]
- [39]
-
[40]
T. Yuan, X. Li, H. Xiong, H. Cao, D. Dou, Explaining information flow inside vision transformers using markov chain, in: eXplainable AI approaches for debugging and diagnosis., 2021
work page 2021
-
[41]
R. Achtibat, S. M. V . Hatefi, M. Dreyer, A. Jain, T. Wie- gand, S. Lapuschkin, W. Samek, Attnlrp: attention-aware layer-wise relevance propagation for transformers, arXiv preprint arXiv:2402.05602 (2024)
- [42]
-
[43]
M. Fantozzi, et al., Explainability in deep learning: Chal- lenges for transformers, Frontiers in Artificial Intelligence (2024)
work page 2024
-
[44]
Z. Chen, Y . Xie, Y . Wu, Y . Lin, S. Tomiya, J. Lin, An interpretable and transferrable vision transformer model for rapid materials spectra classification, Digital Discov- ery 3 (2) (2024) 369–380
work page 2024
-
[45]
SmoothGrad: removing noise by adding noise
D. Smilkov, et al., Smoothgrad: removing noise by adding noise, arXiv preprint arXiv:1706.03825 (2017)
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[46]
S. Jain, et al., Inseq: A toolkit for sequence-level interpretability of nlp models,https://github.com/ penwang/inseq(2023)
work page 2023
-
[47]
J. Ferrando, G. Sarti, A. Bisazza, M. R. Costa-Jussà, A primer on the inner workings of transformer-based lan- guage models, arXiv preprint arXiv:2405.00208 (2024)
-
[48]
B. Azarkhalili, M. W. Libbrecht, Generalized attention flow: Feature attribution for transformer models via max- imum flow, in: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (V olume 1: Long Papers), 2025, pp. 19954–19974
work page 2025
- [49]
-
[50]
S. Wiegreffe, Y . Pinter, Attention is not not explanation, arXiv preprint arXiv:1908.04626 (2019)
-
[51]
A. Ali, A. Kumar, Xai methods for transformers via con- servative propagation, in: ICLR, 2022
work page 2022
-
[52]
E. M. Hou, G. D. Castanon, Decoding layer saliency in language transformers, in: International Conference on Machine Learning, PMLR, 2023, pp. 13285–13308
work page 2023
-
[53]
A. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y . Ng, C. Potts, Learning word vectors for sentiment analysis, in: Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technolo- gies, 2011, pp. 142–150
work page 2011
-
[54]
A. A. Ayele, S. M. Yimam, T. D. Belay, T. Asfaw, C. Bie- mann, Exploring amharic hate speech data collection and classification approaches, in: Proceedings of the 14th in- ternational conference on recent advances in natural lan- guage processing, 2023, pp. 49–59. 17
work page 2023
-
[55]
K. Lang, Newsweeder: Learning to filter netnews, in: Ma- chine learning proceedings 1995, Elsevier, 1995, pp. 331– 339
work page 1995
-
[56]
A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from tiny images (2009)
work page 2009
- [57]
-
[58]
J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre- training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 conference of the North American chapter of the association for com- putational linguistics: human language technologies, vol- ume 1 (long and short papers), 2019, pp. 4171–4186
work page 2019
-
[59]
Unsupervised Cross-lingual Representation Learning at Scale
A. Conneau, K. Khandelwal, N. Goyal, V . Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V . Stoyanov, Unsupervised cross-lingual representation learning at scale, arXiv preprint arXiv:1911.02116 (2019)
work page internal anchor Pith review Pith/arXiv arXiv 1911
-
[60]
B. F. Dossou, A. L. Tonja, O. Yousuf, S. Osei, A. Op- pong, I. Shode, O. O. Awoyomi, C. Emezue, Afrolm: A self-active learning-based multilingual pretrained lan- guage model for 23 african languages, in: Proceedings of The Third Workshop on Simple and Efficient Natural Lan- guage Processing (SustaiNLP), 2022, pp. 52–64
work page 2022
-
[61]
K. He, X. Chen, S. Xie, Y . Li, P. Dollár, R. Girshick, Masked autoencoders are scalable vision learners, in: Pro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 16000–16009
work page 2022
-
[62]
N. Hollenstein, L. Beinborn, Relative importance in sen- tence processing, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguis- tics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP), 2021, pp. 141–150
work page 2021
- [63]
-
[64]
J. DeYoung, S. Jain, N. F. Rajani, E. Lehman, C. Xiong, R. Socher, B. C. Wallace, Eraser: A benchmark to evaluate rationalized nlp models, arXiv preprint arXiv:1911.03429 (2019)
-
[65]
M. A. Mersha, M. G. Yigezu, J. Kalita, Evaluating the ef- fectiveness of xai techniques for encoder-based language models, Knowledge-Based Systems 310 (2025) 113042
work page 2025
-
[66]
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-cam: Visual explanations from deep networks via gradient-based localization, in: Pro- ceedings of the IEEE international conference on com- puter vision, 2017, pp. 618–626
work page 2017
-
[67]
O. Zaidan, J. Eisner, C. Piatko, Using “annotator ratio- nales” to improve machine learning for text categoriza- tion, in: Human language technologies 2007: The confer- ence of the North American chapter of the association for computational linguistics; proceedings of the main con- ference, 2007, pp. 260–267
work page 2007
- [68]
-
[69]
J. Hewitt, C. D. Manning, A structural probe for find- ing syntax in word representations, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, V olume 1 (Long and Short Pa- pers), Association for Computational Linguistics, 2019, pp. 4129–4138
work page 2019
-
[70]
Y . Goldberg, Assessing BERT’s syntactic abilities, in: Proceedings of the 57th Annual Meeting of the Associa- tion for Computational Linguistics, Association for Com- putational Linguistics, 2019, pp. 3623–3632
work page 2019
-
[71]
T. Aoyama, N. Schneider, Probe-less probing of BERT’s layer-wise linguistic knowledge with masked word pre- diction, in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computa- tional Linguistics: Student Research Workshop, Associa- tion for Computational Linguistics, 2022, pp. 195–201
work page 2022
-
[72]
J. Ferrando, Measuring the mixing of contextual informa- tion in the transformer, in: Proceedings of the 2022 Con- ference on Empirical Methods in Natural Language Pro- cessing, Association for Computational Linguistics, 2022
work page 2022
-
[73]
N. F. Liu, M. Gardner, Y . Belinkov, M. Peters, N. A. Smith, Linguistic knowledge and transferability of con- textual representations, in: Proceedings of the 2019 Con- ference of the North American Chapter of the Association for Computational Linguistics: Human Language Tech- nologies, V olume 1 (Long and Short Papers), Association for Computational Lingu...
work page 2019
- [74]
- [75]
-
[76]
Liu, Cunliang kong, ying liu, and maosong sun
Z. Liu, Cunliang kong, ying liu, and maosong sun. 2024. fantastic semantics and where to find them: Investigating which layers of generative llms reflect lexical semantics, Findings of the Association for Computational Linguis- tics: ACL (2024) 14551–14558
work page 2024
-
[77]
C. Sun, X. Qiu, Y . Xu, X. Huang, Fine-tune BERT for extractive summarization, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Asso- ciation for Computational Linguistics, 2019, pp. 3289– 3299
work page 2019
-
[78]
K. Ethayarajh, How contextual are contextualized word representations? comparing the geometry of BERT, ELMo, and GPT-2 embeddings, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Lan- guage Processing and the 9th International Joint Confer- ence on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics...
work page 2019
-
[79]
O. Kovaleva, A. Romanov, A. Rogers, A. Rumshisky, Re- vealing the dark secrets of BERT, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Lan- guage Processing and the 9th International Joint Confer- ence on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, 2019, pp. 4365–4374
work page 2019
- [80]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.