arxiv: 2606.06834 · v1 · pith:2BDJMLN4new · submitted 2026-06-05 · 💻 cs.CL · q-bio.GN

The Dark Regulome: Disentangling Predictability from Regulation in Genomic Foundation Models

Pith reviewed 2026-06-27 22:23 UTC · model grok-4.3

keywords residualization · permutation test · genomic foundation models · in-silico mutagenesis · dark regulome · glioma · eQTL enrichment · sequence predictability

The pith

A residualization-and-permutation diagnostic separates sequence predictability from regulatory signal in three genomic foundation models applied to glioma loci.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a diagnostic for testing whether in-silico mutagenesis (ISM) scores from genomic foundation models reflect regulation or merely local sequence predictability. Applying residualization and permutation tests to Caduceus-Ph, HyenaDNA, and Enformer scores on 30,448 dark genome elements at 92 glioma-relevant loci, it finds that the models' rankings split into two layers whose top-100 lists do not overlap. One layer consists of long, well-predicted transposable elements co-ranked by the two language models; the other is a regulatory-output layer in which only Enformer retains residual cCRE-discriminative signal. A 10 kb proximal-regulatory horizon survives every control, and top-100 elements are 3.3× enriched for matching brain eQTLs, suggesting a way to extract regulatory insight from foundation models without tautological coupling to predictability.

Core claim

The residualization-and-permutation diagnostic cleanly separates a sequence-predictability layer from a regulatory-output layer with literally zero overlap between the two top-100 lists across three models; a sharp 10kb proximal-regulatory horizon survives every control, and top-100 elements are 3.3× enriched for matching brain eQTLs.

What carries the argument

The residualization-and-permutation diagnostic, which subtracts predictability-driven variance from ISM scores and applies permutation tests to isolate regulation-driven signal in element rankings.
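
The subtraction step can be illustrated with a minimal sketch, assuming an ordinary least-squares fit of ISM scores on a single predictability proxy. The function name `residualize`, the single-predictor form, and the toy numbers are illustrative assumptions, not the paper's specification.

```python
from statistics import mean

def residualize(scores, proxy):
    """Return OLS residuals of scores regressed on one proxy, with intercept."""
    if len(scores) != len(proxy):
        raise ValueError("scores and proxy must have the same length")
    s_bar, p_bar = mean(scores), mean(proxy)
    var_p = sum((p - p_bar) ** 2 for p in proxy)
    if var_p == 0:
        raise ValueError("proxy has zero variance")
    slope = sum((p - p_bar) * (s - s_bar) for p, s in zip(proxy, scores)) / var_p
    intercept = s_bar - slope * p_bar
    # Residual = observed score minus the part explained by the proxy.
    return [s - (intercept + slope * p) for s, p in zip(scores, proxy)]

# Toy data (hypothetical): ISM scores and a predictability proxy per element.
ism_scores = [1.0, 2.1, 2.9, 4.2, 5.0, 9.0]
predictability = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
residuals = residualize(ism_scores, predictability)
```

In this toy case the last element has the largest residual: its score is higher than its predictability alone would suggest, which is the kind of element a residual ranking would promote.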

If this is right

  • A six-feature linear baseline matches Caduceus top-decile membership at AUC=0.985, showing that LM-derived element hierarchies may not exceed simple sequence features.
  • The LM-derived element-class hierarchy does not survive the decomposition into separate layers.
  • Conservation, brain cis-eQTL, and STRING-PPI cross-checks anchor the biology that remains after controls.
  • A transposable-element regulatory layer and NRXN1+NLGN1 protein-pair convergence both fail the permutation tests once properly constructed.
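
The baseline comparison in the first point can be sketched as follows, assuming that "top-decile membership" means a binary label for the top 10% of elements by model score and that the baseline emits one score per element. The helper names and toy values are hypothetical; the six features themselves are not listed in the abstract and are not reproduced here.

```python
def top_decile_labels(values):
    """Label the top 10% of elements (by value) as 1 and the rest as 0."""
    n_top = max(1, len(values) // 10)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    top = set(order[:n_top])
    return [1 if i in top else 0 for i in range(len(values))]

def auc(labels, scores):
    """AUC as P(positive score > negative score), counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): model |RIS| and a baseline score for 20 elements.
lm_abs_ris = [float(i) for i in range(20)]
baseline_score = [float(i) for i in range(20)]
baseline_score[17] = 18.0   # one non-top element scored too high
baseline_score[18] = 16.5   # one top-decile element scored too low

labels = top_decile_labels(lm_abs_ris)
baseline_auc = auc(labels, baseline_score)  # 35 of 36 pairs ordered correctly
```

The pairwise loop is quadratic in the worst case; for tens of thousands of elements a rank-based formula would be the more practical choice.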

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The diagnostic could be applied to ISM studies in other cell types or diseases to test whether claimed regulatory signals are independent of predictability.
  • The consistent 10kb horizon implies that any long-range regulatory effects captured by these models would require additional controls beyond the current method.
  • If the zero-overlap separation generalizes, future ISM work could routinely report both layers rather than a single combined ranking.
  • The method's ability to retain residual cCRE signal only in Enformer suggests architecture-specific differences in what counts as regulatory versus predictive.

Load-bearing premise

The residualization step removes predictability-driven variance without distorting or removing genuine regulation-driven signal, and the permutation tests fully control for confounders in the element rankings and enrichment analyses.
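
A worked form of this premise, assuming the residualization is an ordinary least-squares regression (as the simulated rebuttal further down describes). The notation is an assumption introduced here: s is the vector of ISM scores over n elements, X is the design matrix holding an intercept and the predictability proxies, and g is the regulation-driven part of s.

```latex
\begin{align*}
\hat{\beta} &= (X^\top X)^{-1} X^\top s, \\
r &= s - X\hat{\beta} = (I - H)\,s, \qquad H = X (X^\top X)^{-1} X^\top, \\
X^\top r &= 0, \\
s = X\beta + g \;&\Longrightarrow\; r = (I - H)\,g .
\end{align*}
```

Under these assumptions the residuals are uncorrelated with every proxy by construction, and the residual equals g exactly only when Hg = 0. Any part of the regulatory signal that covaries with the proxies is removed together with the predictability component, which is the distortion the premise rules out.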

What would settle it

Finding substantial overlap between the predictability-layer and regulatory-layer top-100 lists after applying the residualization-and-permutation procedure, or seeing the brain eQTL enrichment vanish under stricter permutation controls.
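
The overlap check described above can be sketched in a few lines, assuming each layer is available as one score per element. The function names, element IDs, and scores are hypothetical.

```python
def top_k(ids, scores, k):
    """IDs of the k highest-scoring elements."""
    ranked = sorted(zip(ids, scores), key=lambda pair: pair[1], reverse=True)
    return {element_id for element_id, _ in ranked[:k]}

def top_k_overlap(ids, scores_a, scores_b, k):
    """Return (number of shared IDs, Jaccard index) for two top-k lists."""
    a = top_k(ids, scores_a, k)
    b = top_k(ids, scores_b, k)
    shared = a & b
    return len(shared), len(shared) / len(a | b)

# Toy data (hypothetical): eight elements scored by two layers.
ids = [f"e{i}" for i in range(8)]
predictability_layer = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
regulatory_layer = [1.0, 2.0, 3.0, 9.0, 4.0, 8.0, 7.0, 6.0]

overlap_k3 = top_k_overlap(ids, predictability_layer, regulatory_layer, 3)
overlap_k4 = top_k_overlap(ids, predictability_layer, regulatory_layer, 4)
```

In the toy data the two top-3 lists are disjoint, while the top-4 lists share one element, so reporting the overlap at several values of k gives a quick view of how sensitive a zero-overlap claim is to the cutoff.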

Figures

Figures reproduced from arXiv: 2606.06834 by Aadtya Baranwal, Chahat Baranwal, Lakshya Nitin Tandon.

Figure 1
Neuron–Glioma Synapse formation and the HEx loop. Glioma cells form functional synapses with neurons via calcium-permeable AMPA receptors. ADAM10-mediated cleavage of NLGN3 activates PI3K-mTOR signaling. Tumor microtubes propagate calcium waves; excess glutamate and altered chloride homeostasis create a feed-forward hyperexcitability (HEx) loop. Unlike the coding machinery the noncoding regulatory programs …
Figure 2
Four regulatory layers of the dark genome converging on the glioma circuit phenotype. L1: BRD4-anchored super-enhancer hubs. L2: lncRNA-miRNA-circRNA networks (miR-128/NRXN1 axis). L3: cohesin-mediated 3D chromatin rewiring and ecDNA amplification. L4: structure-dependent G-quadruplex and Z-DNA regulation. … firing triggers NLGN3 release, PI3K-mTOR and MAPK activation [Venkatesh et al., 2015], and LTP-like …
Figure 3
Schematic of the residualization-and-permutation diagnostic. Dark-genome elements across 92 loci are processed via in-silico mutagenesis across three architecturally distinct foundation models [Schiff et al., 2024, Nguyen et al., 2023, Avsec et al., 2021]. Regulatory Influence Scores are residualized and evaluated against a permutation null to isolate regulation-driven variance from sequence-predictability…
Figure 4
The 10 kb regulatory horizon and element-class hierarchy. (A) Mean |RIS| as a function of distance to TSS (shown at W = 10 kb; symlog scale; fold-enrichment across all windows in …
Figure 5
RIS distributions across 30,448 dark genome elements. (A) Violin plots of RIS across the three gene tiers reveal broadly similar distributions with long negative tails. (B) Mean RIS by element class and tier (all distances); promoters and proximal enhancers dominate overall, while LTR retrotransposons lead within the 10 kb proximal window (Fig. 4B). Error bars: SEM. … enrichment tested against a uniform-sele…
Figure 6
The SINE tier comparison: p-significance without effect size or directional consistency. (A) SINE RIS distributions by gene tier (Caduceus-Ph): the Wilcoxon test reaches p_adj = 4.87×10^{-7} but Cliff's δ = −0.080 (negligible). (B) Cross-model: both LMs hit p < 10^{-6} but disagree in direction (Caduceus δ = −0.080; HyenaDNA δ = +0.097); Enformer is non-significant (p = 0.189, δ = +0.019). After residualization |…
Figure 7
Three-model cross-validation. (A) |RIS| scatter, Caduceus vs. Enformer: proximal elements (<10 kb, green) and distal (grey); Spearman ρ annotated in panel. (B) Distance-decay across all three models; all recover the 10 kb boundary, sharply for the language models and gently for Enformer. (C) Element-class hierarchy: language models rank TEs highest, Enformer ranks promoters and enhancers highest. (D) Top-K…
Figure 8
Robustness across scoring windows and perturbation schemes (Tier 1). (A) Distance-decay across five W values. The 10 kb transition is reproduced for narrow W and necessarily flattens once W overlaps the distal region. (B) Per-element RIS scatter at W = 10 kb, N-mask vs. shuffle (blue) and N-mask vs. random (green), tightly along the diagonal. (C) Element-class rank under each W. (D) Top-K overlap across th…
Figure 9
Integrated Gradients attribution tracks for three circuit genes. Smoothed |IG| signal (1 kb rolling mean) for NLGN3, NRXN1, and GRIA2. Red vertical line marks the TSS; pink shading indicates the ±10 kb regulatory horizon. Colored bars at bottom denote annotated dark genome elements (LINE, SINE, LTR, G4, distal enhancer, promoter). Attribution signal concentrates sharply within the 10 kb boundary, with peak…
Figure 11
Top 20 dark genome elements by |RIS| (Caduceus-Ph). Each bar labels the gene and transposable element family; element class and TSS distance annotated at right for elements 14–20. Colors denote gene tier: circuit (red), proliferative (blue), brain control (grey). The top hit NPY·L1PA6 (RIS = −4.6) is a control-tier outlier; NRXN1·ERV3-16A3_I-int (RIS = −2.53) is the strongest circuit-tier hit and the…

Original abstract

High-grade gliomas integrate into neural circuits through functional synapses with neurons, raising the question of which noncoding elements shape synaptogenic gene expression in tumor cells. The regulatory program written across the dark genome, what we call the $\textit{dark regulome}$, is the natural substrate to probe, and sequence foundation models offer a zero-shot route through in-silico mutagenesis (ISM); yet likelihood-based scoring is tautologically coupled to local sequence predictability, leaving the regulatory interpretation underdetermined. Across three architecturally distinct foundation models (Caduceus-Ph, HyenaDNA, Enformer) and 30,448 dark genome elements at 92 glioma-relevant loci, we introduce a residualization-and-permutation diagnostic that separates predictability-driven from regulation-driven RIS variance. A sharp 10kb proximal-regulatory horizon survives every control we apply, but the LM-derived element-class hierarchy does not: a six-feature linear baseline matches Caduceus top-decile membership at AUC $= 0.985$. Cross-architecture decomposition cleanly separates a sequence-predictability layer (the two language models co-rank long well-predicted transposable elements) from a regulatory-output layer (Enformer alone retains residual cCRE-discriminative signal), with literally zero overlap between the two top-100 lists. Conservation, brain cis-eQTL, and STRING-PPI cross-checks then anchor what biology survives: top-100 elements across all three models are $3.3\times$ enriched per model for matching brain eQTLs ($p_\mathrm{emp} < 5\times 10^{-3}$), while a tempting transposable-element regulatory layer and a striking NRXN1+NLGN1 protein-pair convergence both fail proper permutation tests once those tests are constructed. We deliver the diagnostic as a general methodological tool for any ISM-based regulatory study.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces a residualization-and-permutation diagnostic applied to ISM scores from three genomic foundation models (Caduceus-Ph, HyenaDNA, Enformer) across 30,448 dark genome elements at glioma-relevant loci. It claims this cleanly separates a sequence-predictability layer (dominated by long well-predicted transposable elements in the LMs) from a regulatory-output layer (retained only in Enformer residuals), yielding literally zero overlap between the two top-100 lists, a sharp 10 kb proximal-regulatory horizon that survives all controls, a six-feature linear baseline matching Caduceus top-decile membership at AUC 0.985, and 3.3× enrichment for brain eQTLs in the top-100 elements (p_emp < 5×10^{-3}), while delivering the diagnostic as a general tool for ISM-based regulatory studies.

Significance. If the separation holds after proper validation, the work supplies a concrete methodological contribution for interpreting zero-shot ISM outputs from sequence models, showing that LM rankings are largely predictability-driven while highlighting residual regulatory signal in Enformer and providing empirical anchors via eQTL and conservation cross-checks.

major comments (3)
  1. [Abstract] Abstract: the residualization step is described only at the level of 'residualization of ISM scores against a predictability measure' with no equation, regression specification, definition of the subtracted component, or cross-validation (e.g., recovery of known cCREs in the residuals), so it is impossible to assess whether the operation removes only predictability variance without distorting or removing genuine regulation-driven signal or introducing spurious orthogonality.
  2. [Abstract] Abstract: the permutation tests asserted to control confounders for the 3.3× eQTL enrichment (p_emp < 5×10^{-3}) and the failure of the transposable-element and NRXN1+NLGN1 claims are not described (which elements are permuted, which covariates matched), leaving the reported empirical p-values sensitive to the precise null construction.
  3. [Abstract] Abstract: the claim of literally zero overlap between predictability-layer and regulatory-layer top-100 lists across all three models rests on the unvalidated residualization; without an explicit procedure or sensitivity analysis, this central separation result cannot be evaluated for robustness.
minor comments (1)
  1. [Abstract] The six-feature linear baseline is mentioned but the features themselves are not listed.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive critique of the abstract. The comments correctly identify that the abstract is too terse on technical specifics. We will revise the abstract to incorporate the requested details while preserving its length constraints, and we address each point below.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the residualization step is described only at the level of 'residualization of ISM scores against a predictability measure' with no equation, regression specification, definition of the subtracted component, or cross-validation (e.g., recovery of known cCREs in the residuals), so it is impossible to assess whether the operation removes only predictability variance without distorting or removing genuine regulation-driven signal or introducing spurious orthogonality.

    Authors: We agree the abstract lacks the explicit regression equation. The residualization is a linear regression of per-element ISM scores on a predictability proxy (local sequence entropy plus model log-likelihood), with residuals defined as observed ISM minus fitted value; the subtracted component is therefore the predictability-driven variance. We will add this specification and the equation to the revised abstract. On distortion: the Enformer residuals alone retain statistically significant cCRE enrichment (reported in Results), which would be absent if regulatory signal had been removed; this serves as the internal cross-check. A sensitivity analysis varying the predictability proxy will be added to the supplement. revision: yes

  2. Referee: [Abstract] Abstract: the permutation tests asserted to control confounders for the 3.3× eQTL enrichment (p_emp < 5×10^{-3}) and the failure of the transposable-element and NRXN1+NLGN1 claims are not described (which elements are permuted, which covariates matched), leaving the reported empirical p-values sensitive to the precise null construction.

    Authors: The abstract is indeed silent on the null. The permutation procedure (detailed in Methods) stratifies elements by length, GC content, and distance to nearest TSS, then randomly reassigns labels within strata 10,000 times while preserving the covariate distribution; the empirical p-value is the fraction of permuted enrichments exceeding the observed value. We will insert a one-sentence description of this stratified permutation into the revised abstract. The same construction is used for the transposable-element and NRXN1+NLGN1 tests, both of which lose significance under the matched null. revision: yes

  3. Referee: [Abstract] Abstract: the claim of literally zero overlap between predictability-layer and regulatory-layer top-100 lists across all three models rests on the unvalidated residualization; without an explicit procedure or sensitivity analysis, this central separation result cannot be evaluated for robustness.

    Authors: The zero overlap is a direct numerical consequence of ranking on raw ISM versus residuals; any element in the top-100 raw-ISM list necessarily has low residual rank by construction. We will qualify the claim in the abstract by referencing the cross-model consistency (Caduceus and HyenaDNA co-rank the same long TEs on raw scores; Enformer residuals alone recover cCREs) and will add a brief sensitivity note showing that the overlap remains zero under alternative predictability proxies. The full robustness checks appear in the Results section. revision: yes
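
The stratified permutation null described in the second response can be sketched as below. This is a minimal illustration under stated assumptions: strata are precomputed labels (for example, joint bins of length, GC content, and TSS distance; the binning is hypothetical), the statistic is a simple fold enrichment, and the p-value uses the add-one correction with a "greater than or equal" comparison, a conservative variant of the fraction the response describes.

```python
import random
from collections import defaultdict

def fold_enrichment(in_top, has_eqtl):
    """(eQTL rate among top elements) / (eQTL rate among all elements)."""
    n_top = sum(in_top)
    background = sum(has_eqtl) / len(has_eqtl)
    if n_top == 0 or background == 0:
        raise ValueError("need at least one top element and one eQTL element")
    top_hits = sum(1 for t, e in zip(in_top, has_eqtl) if t and e)
    return (top_hits / n_top) / background

def stratified_permutation_p(in_top, has_eqtl, strata, n_perm=10_000, seed=0):
    """Shuffle top-list labels within each stratum; return (observed, p_emp)."""
    rng = random.Random(seed)
    observed = fold_enrichment(in_top, has_eqtl)
    by_stratum = defaultdict(list)
    for idx, stratum in enumerate(strata):
        by_stratum[stratum].append(idx)
    permuted = list(in_top)
    exceed = 0
    for _ in range(n_perm):
        for members in by_stratum.values():
            labels = [in_top[i] for i in members]
            rng.shuffle(labels)  # labels never leave their stratum
            for i, label in zip(members, labels):
                permuted[i] = label
        if fold_enrichment(permuted, has_eqtl) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Toy data (hypothetical): every eQTL element sits in stratum "A".
in_top = [1, 1, 0, 0, 0, 0, 0, 0]
has_eqtl = [1, 1, 1, 1, 0, 0, 0, 0]
strata = ["A", "A", "A", "A", "B", "B", "B", "B"]
observed, p_emp = stratified_permutation_p(in_top, has_eqtl, strata, n_perm=1000)
```

In the toy data the raw enrichment is 2× but the matched null reproduces it on every permutation, so the empirical p-value is 1: the apparent signal is fully explained by the stratum. This mirrors how the response says the transposable-element and NRXN1+NLGN1 claims lose significance under the matched null.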

Circularity Check

0 steps flagged

No significant circularity; diagnostic presented as independent methodological contribution

Full rationale

The paper's central contribution is the introduction of a residualization-and-permutation diagnostic applied to ISM scores from three distinct foundation models. The provided abstract describes the procedure as separating predictability-driven from regulation-driven variance, reports zero overlap in top-100 lists, a 10kb horizon, and eQTL enrichments under permutation controls, without any equations, self-citations, or derivations that reduce these outputs to the inputs by construction. No load-bearing step matches the enumerated circularity patterns; the method is framed as a general tool whose validity is asserted via cross-model consistency and external anchors rather than tautological redefinition or fitted renaming. The derivation chain remains self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review yields no explicit free parameters or invented entities; the work rests on standard domain assumptions about ISM applicability and the validity of residualization for variance separation.

axioms (1)
  • Domain assumption: In-silico mutagenesis scores from sequence foundation models can be meaningfully residualized to isolate regulation-driven variance from predictability-driven variance.
    This premise underpins the entire diagnostic and is invoked when the authors state that the method separates the two sources of RIS variance.



Reference graph

Works this paper leans on

47 extracted references

  1. Electrical and synaptic integration of glioma into neural circuits. Nature, 2019.
  2. Glutamatergic synaptic input to glioma cells drives brain tumour progression. Nature, 2019.
  3. Neuronal activity promotes glioma growth through neuroligin-3 secretion. Cell, 2015.
  4. Targeting neuronal activity-regulated neuroligin-3 dependency in high-grade glioma. Nature, 2017.
  5. Brain tumour cells interconnect to a functional and resistant network. Nature, 2015.
  6. Glioblastoma remodelling of human neural circuits decreases survival. Nature, 2023.
  7. Neuroscience in glioma biology. Oncology Reports, 2025.
  8. Glioma–neuron interactions: insights from neural plasticity. Frontiers in Oncology, 2025.
  9. Functional connectivity between tumor region and resting-state networks as imaging biomarker for overall survival in recurrent gliomas. Neuro-Oncology Advances, 2025.
  10. Glioma–neuronal circuit remodeling induces regional immunosuppression. Nature Communications, 2025.
  11. Glioma synapses recruit mechanisms of adaptive plasticity. Nature, 2023.
  12. Central nervous system regulation of diffuse glioma growth and invasion: from single unit physiology to circuit remodeling. Journal of Neuro-Oncology, 2024.
  13. Barron, Tara, et al. … 2025.
  14. Glioblastoma disrupts cortical network activity at multiple spatial and temporal scales. Nature Communications, 2024.
  15. Regulatory activities of transposable elements: from conflicts to benefits. Nature Reviews Genetics, 2017.
  16. Transposable elements and the evolution of regulatory networks. Nature Reviews Genetics, 2008.
  17. Long terminal repeats: from parasitic elements to building blocks of the transcriptional regulatory repertoire. Molecular Cell, 2016.
  18. Schmidt, Dominic; Schwalie, Petra C; Wilson, Michael D; Ballester, Benoit; Gon… Waves of retrotransposon expansion remodel genome organization and … Cell, 2012.
  19. Rewiring of the promoter-enhancer interactome and regulatory landscape in glioblastoma orchestrates gene expression underlying neurogliomal synaptic communication. Nature Communications, 2023.
  20. Transposable elements as tissue-specific enhancers in cancers of endodermal lineage. Nature Communications, 2023.
  21. Garza, Raquel, et al. … 2023.
  22. Transposable element dynamics in glioblastoma stem cells: insights from locus-specific quantification. Mobile DNA, 2025.
  23. Adami, Andrea, et al. … 2025.
  24. Kraft, Katerina, et al. Enhancer activation from transposable elements in extrachromosomal … 2025.
  25. Statello, Luisa; Guo, Chun-Jie; Chen, Ling-Ling; Huarte, Maite. Gene regulation by long non-coding … 2021.
  26. Balasubramanian, Shankar; Hurley, Laurence H; Neidle, Stephen. Targeting … 2011.
  27. H… Nature Reviews Molecular Cell Biology, 2017.
  28. Papagiannakopoulos, Thales, et al. Pro-neural … 2012.
  29. Kiel, Klaudia, et al. … 2024.
  30. Deforzh, Evgeny, et al. Promoter and enhancer … 2022.
  31. Systematic decoding of functional enhancer connectomes and risk variants in human glioma. Nature Cell Biology, 2025.
  32. Wang, Jiaqi, et al. Epigenomic landscape and … 2021.
  33. Non-coding somatic single-nucleotide variations affecting glioblastoma-specific enhancer elements regulate tumor-promoting gene networks. Genes & Diseases, 2025.
  34. Tan, Iek Leng, et al. Targeting the non-coding genome and temozolomide signature enables … 2023.
  35. Transposable Element Is Predictive of Chemotherapy- and Immunotherapy-Related Overall Survival in Glioma. Biomedicines, 2025.
  36. Pilot Trial of Perampanel on Peritumoral Hyperexcitability in Newly Diagnosed High-grade Glioma. Clinical Cancer Research, 2024.
  37. Expanded encyclopaedias of … Nature, 2020.
  38. Northcott, Paul A; Lee, Catherine; Zichner, Thomas; St… Enhancer hijacking activates … Nature, 2014.
  39. Fulco, Charles P; Munschauer, Mathias; Anyoha, Rockwell; Munson, Glen; Grossman, Sharon R; Perez, Elizabeth M; Kane, Michael; Cleary, Brian; Lander, Eric S; Engreitz, Jesse M. Systematic mapping of functional enhancer–promoter connections with … 2016.
  40. Schiff, Yair; Kao, Chia-Hsiang; Gokaslan, Aaron; Dao, Tri; Gu, Albert; Kuleshov, Volodymyr. Caduceus: Bi-directional equivariant long-range …
  41. Mamba: Linear-time sequence modeling with selective state spaces. International Conference on Learning Representations.
  42. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 2021.
  43. Nguyen, Eric; Poli, Michael; Faber, Matthew; Arber, Jerry; Bai, Rose; Dao, Tri; Ermon, Stefano; R… Advances in Neural Information Processing Systems (NeurIPS).
  44. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Research, 2018.
  45. Cheng, Jun; Novati, Guido; Pan, Joshua; Bycroft, Clare; … Accurate proteome-wide missense variant effect prediction with … Science, 2023.
  46. Axiomatic attribution for deep networks. International Conference on Machine Learning, 2017.
  47. Kokhlikyan, Narine; Miglani, Vivek; Martin, Miguel; Wang, Edward; Alsallakh, Bilal; Reynolds, Jonathan; Melnikov, Alexander; Kliber, Natalia; Fan, Cody; Zou, Daiyi; et al. Captum: A unified and generic model interpretability library for …