pith. sign in

arxiv: 2604.21690 · v1 · submitted 2026-04-23 · 💻 cs.LG

Evaluating Post-hoc Explanations of the Transformer-based Genome Language Model DNABERT-2

Pith reviewed 2026-05-09 22:26 UTC · model grok-4.3

classification 💻 cs.LG
keywords post-hoc explanationsgenome language modelsDNABERT-2AttnLRPlayer-wise relevance propagationbiological patternstransformer modelsCNN comparison
0
0 comments X

The pith

AttnLRP applied to DNABERT-2 produces explanations that align with known biological patterns in DNA sequences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper adapts a relevance propagation method called AttnLRP to explain predictions from the transformer-based genome language model DNABERT-2. It tests these explanations on genomic datasets and compares them directly to those from a convolutional neural network baseline. The authors find that the explanations highlight nucleotide patterns that match established biological knowledge. If correct, this means advanced language models for genomes can be used to generate testable biological hypotheses, just as simpler models already do. The work focuses on making the internal decisions of these complex models interpretable for biologists.

Core claim

The central claim is that AttnLRP yields reliable explanations for DNABERT-2 that correspond to known biological patterns, allowing genome language models to help derive biological insights similar to how convolutional neural networks do.

What carries the argument

AttnLRP, an extension of layer-wise relevance propagation adapted to handle the attention mechanisms in transformer models like DNABERT-2, which attributes relevance scores back to individual nucleotides or tokens in the input genome sequence.

If this is right

  • Explanations can be transferred from token level to nucleotide level for direct biological interpretation.
  • Genome language models produce explanations comparable to those from convolutional networks.
  • Post-hoc methods like AttnLRP enable hypothesis generation from gLM predictions on genome sequences.
  • Relevance attributions become comparable across transformer and convolutional architectures.
  • gLMs can support biological insight extraction in the same way CNNs have been shown to do.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar adaptation of relevance methods could be tested on other genome language models to check consistency.
  • Biologists could use the highlighted patterns to select candidate regions for targeted experiments.
  • The approach might extend to non-genomic sequence tasks where transformers are applied.
  • Performance gains from more complex models may not come at the cost of lost interpretability.

Load-bearing premise

The chosen evaluation metrics and comparison to a CNN baseline are sufficient to show that the explanations capture genuine biological relevance rather than model-specific artifacts or biases.

What would settle it

If AttnLRP explanations on a held-out genomic dataset with verified known motifs fail to highlight those motifs while the model still predicts accurately, that would show the attributions do not reflect biological patterns.

Figures

Figures reproduced from arXiv: 2604.21690 by Bernhard Y. Renard, Isabel Kurth, Paulo Yanez Sarmiento.

Figure 1
Figure 1. Figure 1: Relevance attribution of DNABERT-2 by AttnLRP for a sample of the [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Distribution of the continuous Jaccard similarity between explanations of DNABERT-2 and [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Results for faithfulness: difference in prediction scores for the positive class after perturbing [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Database comparison: best match of pattern of positive class in the AttnLRP patterns with [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
read the original abstract

Explaining deep neural network predictions on genome sequences enables biological insight and hypothesis generation-often of greater interest than predictive performance alone. While explanations of convolutional neural networks (CNNs) have been shown to capture relevant patterns in genome sequences, it is unclear whether this transfers to more expressive Transformer-based genome language models (gLMs). To answer this question, we adapt AttnLRP, an extension of layer-wise relevance propagation to the attention mechanism, and apply it to the state-of-the-art gLM DNABERT-2. Thereby, we propose strategies to transfer explanations from token and nucleotide level. We evaluate the adaption of AttnLRP on genomic datasets using multiple metrics. Further, we provide an extensive comparison between the explanations of DNABERT-2 and a baseline CNN. Our results demonstrate that AttnLRP yields reliable explanations corresponding to known biological patterns. Hence, like CNNs, gLMs can also help derive biological insights. This work contributes to the explainability of gLMs and addresses the comparability of relevance attributions across different architectures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript adapts AttnLRP (an extension of layer-wise relevance propagation to attention) to the Transformer-based genome language model DNABERT-2. It introduces strategies for transferring token-level attributions to the nucleotide level and evaluates the resulting explanations on genomic datasets via multiple metrics (motif overlap, conservation, etc.), with an extensive comparison to a CNN baseline. The central claim is that AttnLRP produces reliable explanations that correspond to known biological patterns, implying that gLMs can derive biological insights in the same manner as CNNs.

Significance. If the evaluation holds after addressing controls, the work would meaningfully extend post-hoc explainability from CNNs to more expressive Transformer gLMs, supporting their use for hypothesis generation in genomics. The architecture-comparison angle and the explicit transfer strategies from token to nucleotide level are constructive contributions.

major comments (2)
  1. [Evaluation / Results] The evaluation lacks composition-matched negative controls (e.g., dinucleotide-frequency or GC-content shuffled sequences) that would distinguish attributions reflecting genuine functional elements from those driven by shared statistical biases between DNABERT-2 and the CNN baseline. This directly undermines the claim that the observed correspondence to biological patterns demonstrates reliable, biologically meaningful explanations rather than metric artifacts.
  2. [Evaluation / Results] No details are provided on the exact genomic datasets, precise definitions of the reported metrics (overlap, conservation scores, etc.), or statistical procedures (multiple-testing correction, permutation baselines). Without these, it is impossible to assess whether the positive results across metrics are robust or subject to post-hoc selection.
minor comments (2)
  1. [Abstract] The abstract states that explanations 'correspond to known biological patterns' but does not name the specific patterns or datasets; adding one concrete example (e.g., a particular motif or conservation track) would improve clarity.
  2. [Methods] Notation for the token-to-nucleotide transfer strategy should be formalized (e.g., an equation or pseudocode block) rather than described only in prose.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which highlight important aspects for strengthening the evaluation. We address each major comment below and indicate the corresponding revisions to the manuscript.

read point-by-point responses
  1. Referee: The evaluation lacks composition-matched negative controls (e.g., dinucleotide-frequency or GC-content shuffled sequences) that would distinguish attributions reflecting genuine functional elements from those driven by shared statistical biases between DNABERT-2 and the CNN baseline. This directly undermines the claim that the observed correspondence to biological patterns demonstrates reliable, biologically meaningful explanations rather than metric artifacts.

    Authors: We agree that composition-matched negative controls are a valuable addition to distinguish genuine biological signals from potential statistical artifacts shared across models. Although the CNN baseline comparison provides partial mitigation by holding the training data constant, it does not fully address compositional biases. In the revised manuscript we will incorporate dinucleotide-frequency and GC-content shuffled sequence controls, recompute the attribution metrics on these controls, and report the resulting reductions in motif overlap and conservation scores to demonstrate that the positive results are not artifacts. revision: yes

  2. Referee: No details are provided on the exact genomic datasets, precise definitions of the reported metrics (overlap, conservation scores, etc.), or statistical procedures (multiple-testing correction, permutation baselines). Without these, it is impossible to assess whether the positive results across metrics are robust or subject to post-hoc selection.

    Authors: We acknowledge that the original manuscript omitted sufficient detail on these elements. The revised version will add a dedicated subsection in the Methods section specifying the exact genomic datasets (including accession numbers, sequence lengths, and preprocessing), precise definitions of each metric (e.g., how motif overlap is quantified via position-specific scoring and how conservation scores are obtained from phastCons or equivalent tracks), and the complete statistical procedures (including multiple-testing corrections and permutation-based baselines for significance assessment). revision: yes

Circularity Check

0 steps flagged

No significant circularity in evaluation methodology

full rationale

The paper conducts an empirical evaluation of adapted AttnLRP explanations on DNABERT-2 by measuring overlap with known biological motifs, conservation scores, and comparison against a separate CNN baseline using multiple independent metrics. These steps rely on external genomic annotations and cross-architecture benchmarks rather than any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations that reduce the central claim to the paper's own inputs by construction. The derivation chain is self-contained against external data and does not exhibit any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical evaluation study. It relies on standard machine-learning assumptions about post-hoc explanation methods and the validity of biological pattern matching, with no new free parameters, axioms, or invented entities introduced.

pith-pipeline@v0.9.0 · 5495 in / 1149 out tokens · 26049 ms · 2026-05-09T22:26:56.474355+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. [1]

    In: International Conference on Machine Learning

    Achtibat, R., Hatefi, S.M.V ., Dreyer, M., Jain, A., Wiegand, T., Lapuschkin, S., Samek, W.: At- tnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers. In: International Conference on Machine Learning. pp. 135–168. PMLR (2024)

  2. [2]

    In: International Conference on Machine Learning

    Ali, A., Schnake, T., Eberle, O., Montavon, G., Müller, K.R., Wolf, L.: Xai for transformers: Better explanations through conservative propagation. In: International Conference on Machine Learning. pp. 435–451. PMLR (2022)

  3. [3]

    Nature genetics54(5), 613–624 (2022) 9

    de Almeida, B.P., Reiter, F., Pagani, M., Stark, A.: DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nature genetics54(5), 613–624 (2022) 9

  4. [4]

    A close look at decomposition-based xai-methods for transformer language models, 2025

    Arras, L., Puri, B., Kahardipraja, P., Lapuschkin, S., Samek, W.: A Close Look at Decomposition-based XAI-Methods for Transformer Language Models. arXiv preprint arXiv:2502.15886 (2025)

  5. [5]

    PLoS ONE10(2015)

    Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. PLoS ONE10(2015)

  6. [6]

    NAR genomics and bioinformatics3(1), lqab004 (2021)

    Bartoszewicz, J.M., Seidel, A., Renard, B.Y .: Interpretable detection of novel human viruses from genome sequencing data. NAR genomics and bioinformatics3(1), lqab004 (2021)

  7. [7]

    Evaluating and aggregating feature-based model explanations,

    Bhatt, U., Weller, A., Moura, J.M.: Evaluating and aggregating feature-based model explana- tions. arXiv preprint arXiv:2005.00631 (2020)

  8. [8]

    In: International Conference on Machine Learning

    Chalasani, P., Chen, J., Chowdhury, A.R., Wu, X., Jha, S.: Concise explanations of neural networks using adversarial training. In: International Conference on Machine Learning. pp. 1383–1391. PMLR (2020)

  9. [9]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Chefer, H., Gur, S., Wolf, L.: Transformer interpretability beyond attention visualization. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 782–791 (2021)

  10. [10]

    Briefings in bioinformatics22(5), bbab060 (2021)

    Clauwaert, J., Menschaert, G., Waegeman, W.: Explainability in transformer models for functional genomics. Briefings in bioinformatics22(5), bbab060 (2021)

  11. [11]

    Nature Machine Intelligence pp

    Consens, M.E., Dufault, C., Wainberg, M., Forster, D., Karimzadeh, M., Goodarzi, H., Theis, F.J., Moses, A., Wang, B.: Transformers and genome language models. Nature Machine Intelligence pp. 1–17 (2025)

  12. [12]

    In: 2023 ICML Workshop on Computational Biology, Honolulu, Hawaii, USA (2023)

    Consens, M.E., Papernot, N., Wang, B., Moses, A.: Transforming Genomic Interpretability: A DNABERT Case Study. In: 2023 ICML Workshop on Computational Biology, Honolulu, Hawaii, USA (2023)

  13. [13]

    Nature Methods22(2), 287–297 (2025)

    Dalla-Torre, H., Gonzalez, L., Mendoza-Revilla, J., Lopez Carranza, N., Grzywaczewski, A.H., Oteri, F., Dallago, C., Trop, E., de Almeida, B.P., Sirelkhatim, H., Richard, G., Skwark, M., Beguir, K., Lopez, M., Pierrot, T.: Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nature Methods22(2), 287–297 (2025)

  14. [14]

    In: Proceedings of NAACL-HLT

    Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT. Association for Computational Linguistics (2019)

  15. [15]

    In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

    DeYoung, J., Jain, S., Rajani, N.F., Lehman, E., Xiong, C., Socher, R., Wallace, B.C.: ERASER: A benchmark to evaluate rationalized NLP models. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. pp. 4443–4458 (2020)

  16. [16]

    Nature Reviews Genetics20(7), 389–403 (2019)

    Eraslan, G., Avsec, Ž., Gagneur, J., Theis, F.J.: Deep learning: new computational modelling techniques for genomics. Nature Reviews Genetics20(7), 389–403 (2019)

  17. [17]

    BMC Genomic Data24(1), 25 (2023)

    Grešová, K., Martinek, V ., ˇCechák, D., Šime ˇcek, P., Alexiou, P.: Genomic benchmarks: a collection of datasets for genomic sequence classification. BMC Genomic Data24(1), 25 (2023)

  18. [18]

    Genome biology8(2), R24 (2007)

    Gupta, S., Stamatoyannopoulos, J.A., Bailey, T.L., Noble, W.S.: Quantifying similarity between motifs. Genome biology8(2), R24 (2007)

  19. [19]

    Journal of Machine Learning Research24(34), 1–11 (2023)

    Hedström, A., Weber, L., Krakowczyk, D., Bareeva, D., Motzkus, F., Samek, W., Lapuschkin, S., Höhne, M.M.C.: Quantus: An explainable ai toolkit for responsible evaluation of neural network explanations and beyond. Journal of Machine Learning Research24(34), 1–11 (2023)

  20. [20]

    Bioinformatics37(15), 2112–2120 (2021)

    Ji, Y ., Zhou, Z., Liu, H., Davuluri, R.V .: DNABERT: pre-trained Bidirectional Encoder Rep- resentations from Transformers model for DNA-language in genome. Bioinformatics37(15), 2112–2120 (2021)

  21. [21]

    In: Advances in Neural Information Processing Systems

    Lundberg, S.M., Lee, S.I.: A Unified Approach to Interpreting Model Predictions. In: Advances in Neural Information Processing Systems. vol. 30. Curran Associates, Inc. (2017)

  22. [22]

    In: The Twelfth International Conference on Learning Representations (2024) 10

    Marin, F.I., Teufel, F., Horlacher, M., Madsen, D., Pultz, D., Winther, O., Boomsma, W.: BEND: Benchmarking DNA Language Models on Biologically Meaningful Tasks. In: The Twelfth International Conference on Learning Representations (2024) 10

  23. [23]

    Explainable AI: interpreting, explaining and visualizing deep learning pp

    Montavon, G., Binder, A., Lapuschkin, S., Samek, W., Müller, K.R.: Layer-Wise Relevance Propagation: An Overview. Explainable AI: interpreting, explaining and visualizing deep learning pp. 193–209 (2019)

  24. [24]

    Nature Reviews Genetics24(2), 125–137 (2023)

    Novakovsky, G., Dexter, N., Libbrecht, M.W., Wasserman, W.W., Mostafavi, S.: Obtaining genetics insights from deep learning via explainable artificial intelligence. Nature Reviews Genetics24(2), 125–137 (2023)

  25. [25]

    Proceedings of the IEEE 109(3), 247–278 (2021)

    Samek, W., Montavon, G., Lapuschkin, S., Anders, C.J., Müller, K.R.: Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications. Proceedings of the IEEE 109(3), 247–278 (2021)

  26. [26]

    Nucleic acids research 32(suppl_1), D91–D94 (2004)

    Sandelin, A., Alkema, W., Engström, P., Wasserman, W.W., Lenhard, B.: JASPAR: an open- access database for eukaryotic transcription factor binding profiles. Nucleic acids research 32(suppl_1), D91–D94 (2004)

  27. [27]

    In: Proceedings of the 34th International Conference on Machine Learning

    Shrikumar, A., Greenside, P., Kundaje, A.: Learning Important Features Through Propagating Activation Differences. In: Proceedings of the 34th International Conference on Machine Learning. pp. 3145–3153. PMLR (2017)

  28. [28]

    Shrikumar, A., Tian, K., Avsec, Ž., Shcherbina, A., Banerjee, A., Sharmin, M., Nair, S., Kundaje, A.: Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5. 6.5. arXiv preprint arXiv:1811.00416 (2018)

  29. [29]

    Genome Biology26(1), 203 (2025)

    Tang, Z., Somia, N., Yu, Y ., Koo, P.K.: Evaluating the representational power of pre-trained DNA language models for regulatory genomics. Genome Biology26(1), 203 (2025)

  30. [30]

    In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations

    Wolf, T., Debut, L., Sanh, V ., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y ., Plu, J., Xu, C., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q., Rush, A.: Transformers: State-of-the-art natural language processing. In: Proceedings of the 2020 conference o...

  31. [31]

    arXiv preprint arXiv:2306.15006 , year=

    Zhou, Z., Ji, Y ., Li, W., Dutta, P., Davuluri, R., Liu, H.: Dnabert-2: Efficient foundation model and benchmark for multi-species genome. arXiv preprint arXiv:2306.15006 (2023)

  32. [32]

    Nature genetics51(1), 12–18 (2019) 11

    Zou, J., Huss, M., Abid, A., Mohammadi, P., Torkamani, A., Telenti, A.: A primer on deep learning in genomics. Nature genetics51(1), 12–18 (2019) 11