arXiv: 2605.21454 · v1 · pith:EFXHNMVMnew · submitted 2026-05-20 · 💻 cs.CV · q-bio.QM · q-bio.TO

ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction

Pith reviewed 2026-05-21 04:54 UTC · model grok-4.3

classification 💻 cs.CV · q-bio.QM · q-bio.TO
keywords multimodal fusion · cancer survival prediction · whole slide imaging · transcriptomics · pathway embeddings · prototype learning · graph neural networks · interpretable AI

The pith

ProtoPathway fuses learnable morphological prototypes with pathway graph embeddings so that cross-modal attention both predicts cancer survival and directly attributes risk to specific gene programs and tissue patterns.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that structuring multimodal fusion around biologically meaningful units on each side—task-learned prototypes that compress slide patches and Reactome pathway embeddings from a bipartite graph—produces both accurate survival predictions and native, hierarchy-spanning explanations. A sympathetic reader would care because most current multimodal models hide how molecular changes relate to visible tumor structure, limiting trust and follow-up experiments. If the central mechanism works, the attention weights themselves become a readable map from genes through pathways and prototypes to spatial regions in the slide. The design also compresses the image input into a small fixed set of tokens, lowering compute while keeping or improving performance across standard cohorts.

Core claim

Two design choices carry the claim: morphological prototypes are trained end-to-end and replace raw patch sets, and gene expression is encoded through bidirectional message passing on a gene–pathway bipartite graph. Together they allow a single cross-modal attention step, in which prototypes query pathways, to serve simultaneously as the fusion operator and as an interpretable attribution output that spans the full biological hierarchy from genes to spatial tissue maps.

What carries the argument

Cross-modal attention over a compact prototype-by-pathway matrix in which the learned morphological prototypes act as queries to the pathway embeddings produced by the bipartite graph neural network.
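
To make the mechanism concrete, here is a minimal pure-Python sketch of single-head cross-attention in which prototype tokens are the queries and pathway embeddings are the keys and values. It is an illustration under stated assumptions, not the paper's implementation: the function name `cross_attention`, the toy vectors, the scaled dot-product scoring, and the omission of learned projection matrices are all choices made here.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(prototypes, pathways):
    """Prototypes (K x d) act as queries; pathways (P x d) act as keys and values.

    Returns (attn, fused): attn is K x P with each row summing to 1,
    fused is K x d (pathway context mixed into each prototype).
    Assumption: learned Q/K/V projections are replaced by the identity.
    """
    d = len(prototypes[0])
    scale = 1.0 / math.sqrt(d)
    attn = []
    for q in prototypes:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in pathways]
        attn.append(softmax(scores))
    fused = [
        [sum(a * v[j] for a, v in zip(row, pathways)) for j in range(d)]
        for row in attn
    ]
    return attn, fused


# Toy data (hypothetical): K = 2 prototype tokens, P = 3 pathway embeddings.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
pathways = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
attn, fused = cross_attention(prototypes, pathways)
```

Each row of `attn` is a distribution over pathways for one prototype, which is the shape of the prototype-by-pathway matrix that the summary treats as the interpretability output.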

If this is right

  • The model achieves competitive or superior survival prediction on five TCGA cancer cohorts while using fewer compute resources than prior multimodal baselines.
  • The attention matrix supplies direct, inference-time attribution from individual genes through pathways and prototypes to specific regions on the slide.
  • Because both the prototypes and pathway embeddings carry stable task-learned identities, the same attention weights remain meaningful across different patients and folds.
  • Population-level rank-based analysis across cross-validation folds confirms that the attributions align with broader biological structure rather than dataset artifacts.
  • The fixed-size prototype representation eliminates the need to process every image patch at inference time without sacrificing predictive power.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The fixed Reactome pathway graph could be replaced with other hierarchical knowledge bases to adapt the same fusion pattern to different diseases or non-cancer multimodal tasks.
  • If the prototype-to-pathway direction matches the actual causal influence from molecules to morphology, the attributions could be used to prioritize genes for functional experiments in high-risk morphological patterns.
  • Adding explicit spatial coordinates to the prototypes might further sharpen localization of which tissue regions are most influenced by particular pathways.
  • The overall structure suggests a template for other multimodal settings where one modality has a natural hierarchical ontology and the other has compressible visual units.

Load-bearing premise

The attention weights between prototypes and pathways correctly reflect the biological direction in which molecular programs shape tissue morphology instead of only tracking statistical associations with survival labels.

What would settle it

A systematic mismatch between the highest-ranked pathways for each prototype and independent experimental data on which genes actually alter tissue morphology in the same cancer types would show the attributions do not capture the claimed biological flow.

Figures

Figures reproduced from arXiv: 2605.21454 by Amaya Gallagher-Syed, Costantino Pitzalis, Gregory Slabaugh, Michael R. Barnes, Myles J. Lewis.

Figure 1: ProtoPathway architecture. I. Prototype Encoding …
Figure 2: Kaplan-Meier survival curves for ProtoPathway across five TCGA cohorts. Patients stratified into high-risk (red) and low-risk (green) groups by median predicted risk score. All separations significant at p < 0.005 (log-rank test). The unimodal ablations show an instructive asymmetry: ProtoPathgene (0.597) falls below both MLP (0.630) and SNN (0.629), yet the full model recovers to the best overall score …
Figure 3: Prototype analysis in BLCA. Gating shift analysis reveals which morphological prototypes the model attends to before (WSI gate, left) and after (fusion gate, right) molecular pathway context is incorporated, with the rank shift between the two (center). Muscle and connective tissue prototypes (Proto 5, 11) gain importance after fusion (green), while tumor prototypes (Proto 2, 8) lose rank (grey), indicati…
Figure 4: Spatial attribution for a high-risk BLCA case (TCGA-4Z-AA84). Top row: …
Figure 5: Pathway and gene-level attribution in BLCA. (A) …
Abstract

We introduce ProtoPathway, an interpretable-by-design multimodal framework for cancer survival prediction that unifies whole slide imaging and transcriptomics through encoders producing biologically grounded representations on both sides of the fusion. On the histopathology side, $K$ learnable morphological prototypes, trained end-to-end with the survival objective, serve as the slide representation itself: patches flow into prototype tokens via soft assignment, compressing variable-length patch sets into fixed task-adaptive tokens. On the genomic side, a bipartite graph neural network encodes gene expression within the Reactome pathway hierarchy, producing pathway embeddings that reflect both constituent genes and their broader biological context through bidirectional message passing over a shared gene--pathway graph. Cross-modal attention then operates over a compact prototype $\times$ pathway matrix in which prototypes query pathways, modeling the biological direction in which molecular programs give rise to tissue morphology. Because both axes carry stable task-learned identity, the attention matrix is itself an interpretability output, yielding native inference-time attribution across the full biological hierarchy, from genes through pathways and prototypes to spatial tissue maps. We evaluate on five TCGA cancer cohorts, demonstrating competitive or superior survival prediction with substantially improved biological interpretability and reduced computational cost, with interpretability claims validated through fold-stratified rank-based population-level analysis. Our source code, model weights, and Reactome pathways, together with a unified codebase reimplementing all multimodal survival baselines under identical preprocessing and evaluation, are available at: https://github.com/AmayaGS/ProtoPathway.
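
The soft-assignment step described in the abstract can be sketched in a few lines. This is a hedged illustration only: the dot-product similarity, the `temperature` parameter, the assignment-weighted mean pooling, and the function name `soft_assign` are assumptions made here, and the paper may define the assignment differently.

```python
import math


def soft_assign(patches, prototypes, temperature=1.0):
    """Soft-assign N patch features to K prototypes and pool into K tokens.

    Assumption: similarity is a dot product and each prototype token is the
    assignment-weighted mean of the patch features.
    Returns (weights, tokens): weights is N x K, tokens is K x d.
    """
    k = len(prototypes)
    d = len(patches[0])
    weights = []
    for x in patches:
        sims = [sum(a * b for a, b in zip(x, c)) / temperature for c in prototypes]
        m = max(sims)
        exps = [math.exp(s - m) for s in sims]
        z = sum(exps)
        weights.append([e / z for e in exps])
    tokens = []
    for j in range(k):
        mass = sum(w[j] for w in weights)  # strictly positive under softmax
        tokens.append(
            [sum(w[j] * x[i] for w, x in zip(weights, patches)) / mass for i in range(d)]
        )
    return weights, tokens


# Toy data (hypothetical): 4 patches, 2 prototypes, feature dimension 2.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
prototype_vectors = [[1.0, 0.0], [0.0, 1.0]]
weights, tokens = soft_assign(patches, prototype_vectors)
```

Whatever the number of patches, `tokens` always has one row per prototype, which is the sense in which a variable-length patch set is compressed into a fixed-size representation.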

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes ProtoPathway, a multimodal framework for cancer survival prediction that unifies whole-slide imaging (WSI) and transcriptomics. WSI patches are compressed into K learnable morphological prototypes via soft assignment, serving as the slide representation. Gene expression is encoded via a bipartite graph neural network over the Reactome pathway hierarchy, producing pathway embeddings. Cross-modal attention is performed with prototypes querying pathways to model the direction in which molecular programs give rise to tissue morphology. The resulting attention matrix provides native interpretability across genes, pathways, prototypes, and spatial maps. The method is evaluated on five TCGA cohorts, claiming competitive or superior survival prediction, substantially improved biological interpretability, and reduced computational cost, with interpretability validated via fold-stratified rank-based population-level analysis. Source code, weights, and a unified baseline reimplementation are released.

Significance. If the central interpretability claims hold, the work would be significant for advancing biologically grounded multimodal survival models that directly link morphology to molecular pathways while maintaining competitive predictive performance. The provision of a unified codebase reimplementing all multimodal survival baselines under identical preprocessing is a clear strength for reproducibility and fair comparison.

major comments (1)
  1. [Abstract / Cross-modal attention description] Abstract and cross-modal attention section: The claim that having prototypes query pathways amounts to 'modeling the biological direction in which molecular programs give rise to tissue morphology' is not supported by the described training procedure. Both the prototype and pathway encoders are trained end-to-end solely on the survival objective; no auxiliary loss, causal constraint, or directional supervision is mentioned that would enforce this specific hierarchy over symmetric correlations or task-driven co-occurrences. The fold-stratified rank-based population-level analysis can identify consistent attention patterns across patients but cannot validate that the attention reflects the claimed causal biological direction rather than predictive associations useful for survival.
minor comments (2)
  1. [Abstract] The abstract claims 'competitive or superior survival prediction' without reporting specific concordance index values, hazard ratios, baseline comparisons, or statistical tests; these details should be summarized in the main text or a results table for immediate assessment.
  2. [Method] Clarify the precise construction of the prototype × pathway matrix and the attention formulation (e.g., query/key/value definitions and any masking) to ensure the interpretability output is fully reproducible from the provided description.
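
For readers less familiar with the concordance index requested in minor comment 1, the following is a minimal sketch of Harrell's C-index. It is a textbook-style illustration, not code from the paper's repository; the function name, the toy data, and the convention of scoring risk ties as 0.5 are assumptions made here.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs that the risk score orders correctly.

    A pair (i, j) is comparable when patient i has an observed event and a
    strictly shorter time than patient j. A higher risk should go with the
    shorter survival time. Risk ties count as 0.5 (a common convention).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical): survival times, event flags (1 = death observed), risks.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.2, 0.4]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why reporting the raw values alongside baselines makes a "competitive or superior" claim checkable.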

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review. We appreciate the positive assessment of the work's potential significance and the emphasis on reproducibility through the released codebase. We address the major comment below.

  1. Referee: [Abstract / Cross-modal attention description] Abstract and cross-modal attention section: The claim that having prototypes query pathways amounts to 'modeling the biological direction in which molecular programs give rise to tissue morphology' is not supported by the described training procedure. Both the prototype and pathway encoders are trained end-to-end solely on the survival objective; no auxiliary loss, causal constraint, or directional supervision is mentioned that would enforce this specific hierarchy over symmetric correlations or task-driven co-occurrences. The fold-stratified rank-based population-level analysis can identify consistent attention patterns across patients but cannot validate that the attention reflects the claimed causal biological direction rather than predictive associations useful for survival.

    Authors: We agree that the model is trained end-to-end solely on the survival objective with no auxiliary losses, causal constraints, or explicit directional supervision. The reference to 'modeling the biological direction' describes the deliberate architectural asymmetry in the cross-modal attention module, where morphological prototypes serve as queries attending to pathway embeddings. This design choice is motivated by the biological hypothesis that molecular programs (captured via Reactome pathways) give rise to observable tissue morphology (captured via prototypes), rather than the reverse; the resulting attention matrix therefore supports directed attribution along this axis. The fold-stratified rank-based analysis demonstrates that the learned attention patterns are stable and rank consistently across patients and folds, providing evidence that the structured fusion captures biologically coherent associations useful for survival prediction. We do not claim to perform causal inference. We will revise the abstract and the cross-modal attention section to clarify that the directionality is imposed by the query-key structure to reflect a hypothesized biological flow, while the learned relationships remain predictive associations.

    revision: yes

Circularity Check

0 steps flagged

Derivation chain is self-contained with no circular reductions

The paper trains morphological prototypes and pathway embeddings end-to-end solely via the survival objective, which constitutes standard supervised multimodal learning rather than any reduction of outputs to inputs by construction. The cross-modal attention direction (prototypes querying pathways) is an explicit architectural choice justified by the claimed biological hierarchy, not derived from or equivalent to the fitted parameters themselves. Interpretability outputs such as the attention matrix are presented as a direct consequence of the stable task-learned identities, with validation performed separately via fold-stratified rank-based population-level analysis across cross-validation folds of the five TCGA cohorts. No self-citations, uniqueness theorems, or ansatzes imported from prior author work appear load-bearing in the abstract or described framework; the Reactome graph structure and prototype soft-assignment are independent external or learnable components. The central survival prediction performance is therefore independently falsifiable and not tautological.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the Reactome hierarchy supplying biologically meaningful structure and on the assumption that end-to-end training with survival loss produces prototypes and pathway embeddings that align with real tissue-molecular relationships.

free parameters (1)
  • K (number of morphological prototypes)
    Hyperparameter controlling the number of learnable prototype tokens; chosen or tuned to compress patch sets into fixed representations.
axioms (1)
  • domain assumption: The Reactome pathway hierarchy provides a biologically valid structure for encoding gene expression via bidirectional message passing on a shared gene-pathway graph.
    Invoked in the genomic encoder description to produce pathway embeddings that reflect both genes and broader context.
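
As a rough picture of what bidirectional message passing on a gene-pathway bipartite graph involves, the sketch below runs one gene-to-pathway update followed by one pathway-to-gene update. It is a simplified stand-in, not the paper's encoder: mean aggregation with a residual sum, the absence of learned weights, and all gene and pathway names are hypothetical choices made here.

```python
def mean_aggregate(neighbors, features, dim):
    """Mean of neighbor feature vectors; zeros if a node has no neighbors."""
    if not neighbors:
        return [0.0] * dim
    return [sum(features[n][i] for n in neighbors) / len(neighbors) for i in range(dim)]


def bipartite_round(gene_feats, pathway_feats, membership):
    """One bidirectional round on a gene-pathway bipartite graph.

    membership maps each pathway to the list of its member genes.
    Step 1 (gene -> pathway): each pathway adds the mean of its member genes.
    Step 2 (pathway -> gene): each gene adds the mean of its updated pathways.
    Assumption: no learned weights or nonlinearities, only mean + residual.
    """
    dim = len(next(iter(gene_feats.values())))
    new_pathways = {}
    for p, genes in membership.items():
        msg = mean_aggregate(genes, gene_feats, dim)
        new_pathways[p] = [a + b for a, b in zip(pathway_feats[p], msg)]
    gene_to_pathways = {g: [] for g in gene_feats}
    for p, genes in membership.items():
        for g in genes:
            gene_to_pathways[g].append(p)
    new_genes = {}
    for g, feat in gene_feats.items():
        msg = mean_aggregate(gene_to_pathways[g], new_pathways, dim)
        new_genes[g] = [a + b for a, b in zip(feat, msg)]
    return new_genes, new_pathways


# Toy graph (hypothetical names): gene G2 belongs to both pathways.
gene_feats = {"G1": [1.0], "G2": [3.0], "G3": [5.0]}
pathway_feats = {"P_A": [0.0], "P_B": [0.0]}
membership = {"P_A": ["G1", "G2"], "P_B": ["G2", "G3"]}
new_genes, new_pathways = bipartite_round(gene_feats, pathway_feats, membership)
```

After one round, the representation of `G1` already depends on `G2` through their shared pathway, which is one way to read the ledger's phrase about embeddings reflecting "both genes and broader context".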


Supplementary excerpt: fold-stratified combination of per-fold statistics

  1. Each per-fold p-value $p_{jk}$ is converted to a directional Z-score:
     $$z_{jk} = \Phi^{-1}\!\left(1 - \frac{p_{jk}}{2}\right) \cdot \operatorname{sign}(r_{jk}),$$
     where $\Phi^{-1}$ is the standard normal quantile function and $r_{jk}$ is the rank-biserial correlation for entity $j$ in fold $k$. The sign ensures that consistent directional effects across folds reinforce each other, while inconsistent directions cancel. The p-value is clamped to $[10^{-15}, 1 - 10^{-15}]$ before inversion for numerical stability.
  2. Fold weights are set proportional to the square root of the fold's total patient count: $w_k = \sqrt{n_k}$, giving more influence to larger folds.
  3. The combined Z-score is
     $$Z_j = \frac{\sum_{k=1}^{F} w_k z_{jk}}{\sqrt{\sum_{k=1}^{F} w_k^2}}, \tag{S5}$$
     and the combined p-value is $p_j = 2\,(1 - \Phi(|Z_j|))$.
  4. The combined effect size is the weighted mean $\bar{r}_j = \sum_k w_k r_{jk} / \sum_k w_k$. An entity requires results from at least two folds for meaningful combination. Multiple testing across all entities is corrected via the Benjamini-Hochberg procedure at $\alpha = 0.05$.

D.5 Application to Specific Signals

The fold-stratified analysis pipeline is applied to the following interpr...
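
The fold-combination steps in the supplementary text (directional Z-scores, square-root fold weights, the weighted combined Z-score, and Benjamini-Hochberg correction) can be sketched with the Python standard library. This is an illustrative reading of those formulas, not the authors' code: the function names and the toy inputs are assumptions made here.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def directional_z(p, r, eps=1e-15):
    """z = Phi^{-1}(1 - p/2) * sign(r), with p clamped to [eps, 1 - eps]."""
    p = min(max(p, eps), 1.0 - eps)
    sign = (r > 0) - (r < 0)
    return _STD.inv_cdf(1.0 - p / 2.0) * sign


def combine_folds(p_values, effects, fold_sizes):
    """Weighted Stouffer combination for one entity across folds.

    Weights are sqrt(n_k). Returns (Z, combined p-value, weighted mean effect).
    """
    if len(p_values) < 2:
        raise ValueError("need results from at least two folds")
    weights = [math.sqrt(n) for n in fold_sizes]
    zs = [directional_z(p, r) for p, r in zip(p_values, effects)]
    z = sum(w * zk for w, zk in zip(weights, zs)) / math.sqrt(sum(w * w for w in weights))
    p = 2.0 * (1.0 - _STD.cdf(abs(z)))
    r_bar = sum(w * r for w, r in zip(weights, effects)) / sum(weights)
    return z, p, r_bar


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[i] = True
    return rejected


# Toy inputs (hypothetical): three folds agreeing in direction, then two folds disagreeing.
z_same, p_same, r_same = combine_folds([0.01, 0.02, 0.03], [0.4, 0.5, 0.3], [100, 100, 100])
z_opp, p_opp, r_opp = combine_folds([0.01, 0.01], [0.4, -0.4], [100, 100])
```

In the toy inputs, folds that agree in direction reinforce each other, while the two folds with opposite signs cancel to a combined Z-score of zero, which is the behaviour the sign term is meant to produce.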