
arxiv: 2605.16392 · v1 · pith:WQ6PSOWVnew · submitted 2026-05-12 · 🧬 q-bio.QM · cs.CV · cs.LG

Bridging the Modality Bottleneck in Pathology MIL through Virtual Molecular Staining

Pith reviewed 2026-05-20 22:14 UTC · model grok-4.3

classification 🧬 q-bio.QM · cs.CV · cs.LG
keywords multiple instance learning · computational pathology · spatial transcriptomics · virtual staining · whole slide images · molecular prototypes · biomarker prediction · survival prediction

The pith

MIST replaces the morphology-only projection layer in pathology MIL with virtual molecular stains derived from spatial transcriptomics prototypes during training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MIST as a plug-in replacement for the projection layer inside standard multiple instance learning pipelines for whole-slide pathology images. Typical MIL setups freeze a patch encoder and then project its features through a morphology-only layer before aggregation, which leaves out molecular state information needed for accurate survival, subtyping, and biomarker endpoints. MIST instead clusters gene expression profiles from paired spatial transcriptomics data into cross-modal prototypes, anchors those prototypes in the encoder feature space, and re-organizes the H&E patch features along the resulting molecular axes. The transformed features are then fed to any standard aggregator, and no transcriptomics data is required at inference time. A sympathetic reader would care because the approach keeps the practical advantages of frozen encoders and off-the-shelf aggregators while addressing the molecular information gap that currently limits clinical utility.

Core claim

MIST clusters gene expression profiles into cross-modal prototypes, anchors them in the frozen foundation model feature space, and uses them to reorganize H&E patch features along molecularly guided axes, serving as a plug-in replacement for the standard projection layer in MIL frameworks that improves performance across 23 downstream tasks and 8 aggregators.
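The prototype construction named in the core claim can be sketched as follows. This is a minimal illustration, not the authors' implementation: plain k-means stands in for whatever clustering the paper uses, and "anchoring" is assumed here to mean averaging the frozen-encoder features of the patches paired with each expression cluster. The names `kmeans_labels` and `build_prototypes` are hypothetical.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans_labels(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean_vector(members)
    return labels

def build_prototypes(expression, features, k, seed=0):
    """Cluster spots by gene expression, then anchor each cluster in the
    encoder feature space (ASSUMPTION: anchor = mean of paired features)."""
    labels = kmeans_labels(expression, k, seed=seed)
    prototypes = []
    for c in range(k):
        members = [f for f, l in zip(features, labels) if l == c]
        if members:
            prototypes.append(mean_vector(members))
    return prototypes, labels
```

Here `expression[i]` and `features[i]` are assumed to describe the same spot and patch pair, so the clustering sees only transcriptomics while the returned prototypes live in the H&E encoder space.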

What carries the argument

Molecularly Informed Staining Transform (MIST), which constructs virtual molecular stains from gene-derived cross-modal prototypes to guide feature reorganization.
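A rough sketch of how such a transform could act on a single patch feature, following the paper's Figure 2 description (soft prototype affinities, prototype-conditioned low-rank residual transforms over a shared basis). Several details are assumptions made for illustration: cosine similarity with a softmax temperature for the affinities, a single shared down-projection `down` as the "shared basis", and one up-projection per prototype in `ups`. The function name `mist_like_projection` is hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def matvec(matrix, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in matrix]

def mist_like_projection(h, prototypes, down, ups, temperature=0.1):
    """h: patch feature (length d); prototypes: K anchors (length d each);
    down: shared r x d matrix; ups: K matrices of shape d x r.
    Returns h plus an affinity-weighted sum of low-rank 'stained views'."""
    # Soft assignment of the patch to each molecular prototype.
    affinities = softmax([cosine(h, p) / temperature for p in prototypes])
    z = matvec(down, h)  # shared low-rank code (length r)
    out = list(h)        # residual connection keeps the original feature
    for a_k, up_k in zip(affinities, ups):
        view = matvec(up_k, z)  # prototype-specific view (length d)
        out = [o + a_k * v for o, v in zip(out, view)]
    return out, affinities
```

The output has the same length as the input, which is what would let such a layer sit in front of an unchanged aggregator. No transcriptomics input appears in the signature, matching the claim that only H&E features are needed at inference.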

If this is right

  • MIST improves 240 of 256 configurations over the standard projection layer with an average gain of +3.5%.
  • Gains appear consistently across endpoint types: +5.2% on survival prediction, +3.3% on tissue subtyping, and +2.6% on biomarker prediction.
  • The method integrates with 8 different MIL aggregators and requires no transcriptomics input at inference.
  • Ablations identify gene-derived prototypes as the primary source of the observed improvements.
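As a quick check on the headline count, 240 of 256 configurations is a win rate of 93.75%. The sketch below also computes an exact two-sided sign test for that count. It rests on assumptions the paper does not state: that the 16 remaining configurations are losses rather than ties, and that configurations are independent, which they are not since they share tasks and aggregators. The resulting p-value is therefore optimistic and only illustrative.

```python
from math import comb

def sign_test_p_value(wins, total):
    """Exact two-sided sign test against a 50/50 null hypothesis."""
    tail = min(wins, total - wins)
    # Probability of a result at least this lopsided in one direction.
    one_sided = sum(comb(total, k) for k in range(tail + 1)) / 2 ** total
    return min(1.0, 2 * one_sided)

wins, total = 240, 256
win_rate = wins / total
p_value = sign_test_p_value(wins, total)
print(f"win rate = {win_rate:.4f}, sign-test p = {p_value:.3g}")
```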

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the virtual stains recover spatially coherent molecular programs, the same training-time pairing strategy could be tested on other paired imaging-molecular datasets outside pathology.
  • The approach opens a route to inspect which molecular axes are recoverable from routine H&E morphology alone by examining the learned prototype affinities.
  • Extending the prototype construction to multi-center cohorts would test whether the gains persist when the training distribution of molecular states shifts.

Load-bearing premise

The cross-modal prototypes learned from spatial transcriptomics during training capture molecular programs that are both missing from H&E morphology and causally relevant to the downstream clinical endpoints.

What would settle it

On a held-out set of slides with paired spatial transcriptomics, replace the learned prototypes with random vectors of the same dimension and check whether the reported gains over the baseline projection layer disappear.
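One way to build the random-vector control described above is sketched here. Matching each random vector's norm to the prototype it replaces is an added assumption (the text only requires the same dimension), chosen so that any change in downstream performance is not driven by vector scale alone. The function name `random_prototype_control` is hypothetical.

```python
import math
import random

def random_prototype_control(prototypes, seed=0):
    """Replace each prototype with a Gaussian random vector of the same
    dimension, rescaled to the same Euclidean norm (ASSUMPTION)."""
    rng = random.Random(seed)
    control = []
    for proto in prototypes:
        target_norm = math.sqrt(sum(x * x for x in proto))
        noise = [rng.gauss(0.0, 1.0) for _ in proto]
        noise_norm = math.sqrt(sum(x * x for x in noise)) or 1.0
        control.append([x * target_norm / noise_norm for x in noise])
    return control
```

Running the downstream pipeline once with the learned prototypes and once with `random_prototype_control(prototypes)` would give the two arms of the comparison. Repeating over several seeds would show how much of the gap is noise.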

Figures

Figures reproduced from arXiv: 2605.16392 by Jiangdong Qiu, Jingying Ma, Kai He, Ling Huang, Mengling Feng, Pei Liu, Ruping Hong, Tianyu Liu, Yucheng Xing.

Figure 1: (A) MIST separates patches with similar morphology but distinct molecular profiles. (B) Mean performance gains across downstream tasks for eight MIL aggregators. MIST realizes feature-level virtual molecular staining in three steps. First, it derives molecular prototypes by clustering gene-expression profiles and anchoring the resulting prototypes in the pathology FM feature space, allowing each prototype…
Figure 2: Overview of MIST. Each patch is encoded by a frozen foundation model and matched to organ-specific molecular prototypes through soft affinities. Prototype-conditioned low-rank residual transforms generate stained views over a shared basis, which are combined to produce a patch representation consumable by any MIL aggregator. The prototype bank is constructed once per organ from paired ST and H&E data, with…
Figure 3: Interpretability analysis of MIST. (A) Prototype affinity maps on a held-out HEST slide with paired ST. GT and P denote transcriptomics ground truth and H&E-based prediction, respectively. (B) Top upregulated genes associated with representative molecular prototypes. (C) Patch-level affinity distributions across representative tissue regions annotated by board-certified pathologists, including normal tissu…
Figure 4: Additional interpretability results on a held-out brain HEST slide. Whole-slide prototype affinity visualizations generated from H&E images without transcriptomics at inference. …it may also broaden access to molecularly-informed computational pathology systems in settings where transcriptomics is unavailable. At the same time, MIST relies on molecular supervision derived from existing ST datasets, which re…
Figure 5: Additional interpretability results on a held-out breast HEST slide. Whole-slide prototype affinity visualizations generated from H&E images without transcriptomics at inference.
Figure 6: Additional interpretability results on a held-out lung HEST slide. Whole-slide prototype affinity visualizations generated from H&E images without transcriptomics at inference.
Figure 7: Parameter comparison before and after inserting MIST. Parameter counts (millions) across eight MIL aggregators with and without MIST (K=8, r=16). MIST introduces minimal overhead for most architectures and reduces the parameter count for ILRA through shared feature compression.
Original abstract

Multiple instance learning (MIL) is the dominant framework for whole-slide image analysis in computational pathology, typically combining a frozen patch encoder, a projection layer, and a slide-level aggregator. While encoders and aggregators have been extensively studied, the projection layer remains a largely morphology-only bottleneck. This limits endpoints such as biomarker status and survival, which are governed by a molecular state that is not fully captured by H&E morphology. We introduce Molecularly Informed Staining Transform (MIST), a plug-in replacement for the MIL projection layer that uses paired spatial transcriptomics only during training to construct virtual molecular stains. MIST clusters gene expression profiles into cross-modal prototypes, anchors them in the frozen foundation model feature space, and uses them to reorganize H&E patch features along molecularly guided axes. It requires no transcriptomics at inference and can be inserted before standard MIL aggregators. We evaluate MIST across 23 downstream tasks and 8 MIL aggregators. MIST improves 240 of 256 configurations over the standard projection layer, with an average gain of +3.5%, observed consistently across endpoint types: +5.2% on survival prediction, +3.3% on tissue subtyping, and +2.6% on biomarker prediction. Ablations confirm that gene-derived prototypes are the primary source of the gains, while spatial, biological, and pathological analyses show that cross-modal prototype affinities capture spatially coherent molecular programs from H&E alone.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Molecularly Informed Staining Transform (MIST), a plug-in replacement for the projection layer in multiple instance learning (MIL) pipelines for whole-slide pathology images. MIST uses paired spatial transcriptomics data exclusively during training to cluster gene expression profiles into cross-modal prototypes, anchors these in a frozen foundation model feature space, and reorganizes H&E patch features along molecularly guided axes to produce virtual molecular stains. No transcriptomics is required at inference. The method is evaluated across 23 downstream tasks and 8 MIL aggregators, reporting improvement in 240 of 256 configurations with an average gain of +3.5% (higher on survival prediction at +5.2%). Ablations attribute gains primarily to the gene-derived prototypes, with supporting spatial and biological analyses of prototype affinities.

Significance. If the central claim holds, MIST offers a practical way to inject molecular information into morphology-only MIL models without changing inference requirements, potentially improving performance on endpoints like survival and biomarker prediction that are only partially reflected in H&E. The scale of the evaluation (256 configurations, multiple endpoint types, and ablations) is a strength, as is the demonstration that gains persist across different aggregators. However, the significance depends on whether the observed improvements reflect genuinely new, causally relevant molecular programs absent from H&E morphology rather than dataset-specific correlations learned from the paired training distribution.

major comments (2)
  1. [Methods and §4] Methods and §4 Experiments: the manuscript provides no details on how data splits are constructed to prevent leakage between slides with paired transcriptomics and those used only for downstream evaluation, nor any statistical significance tests, confidence intervals, or multiple-testing correction for the 256 configurations. This is load-bearing for the claim of consistent +3.5% average improvement.
  2. [§5.3 and §5.4] §5.3 Ablations and §5.4 Biological analyses: while ablations show that removing gene-derived prototypes reduces gains, this does not rule out the possibility that prototypes encode training-set morphology–molecular correlations rather than molecular programs that are both missing from H&E and causally relevant to the clinical endpoints. No external validation cohort, orthogonal assay, or perturbation experiment is presented to distinguish these alternatives.
minor comments (2)
  1. [Figure 3] Figure 3 and associated text: the visualization of prototype affinities would benefit from explicit scale bars or quantitative measures of spatial coherence to allow readers to assess the claimed biological interpretability.
  2. [§3] Notation in §3: the definition of cross-modal prototype anchoring could be clarified with a short equation or pseudocode to make the projection-layer replacement fully reproducible from the text alone.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed comments. We have addressed each major point below, incorporating revisions to improve methodological transparency and statistical reporting. We provide honest responses on the scope of what can be addressed within the current study.

Point-by-point responses
  1. Referee: [Methods and §4] Methods and §4 Experiments: the manuscript provides no details on how data splits are constructed to prevent leakage between slides with paired transcriptomics and those used only for downstream evaluation, nor any statistical significance tests, confidence intervals, or multiple-testing correction for the 256 configurations. This is load-bearing for the claim of consistent +3.5% average improvement.

    Authors: We agree that explicit documentation of data splits and statistical analyses is essential to support the reported gains. In the revised manuscript, we have added a dedicated subsection in Methods detailing the patient-level splitting protocol: all slides with paired spatial transcriptomics data were reserved exclusively for cross-modal prototype construction and were strictly excluded from all downstream evaluation sets, with no patient overlap. For the 256 configurations, we now report 95% bootstrap confidence intervals around the mean performance deltas and include a sign test assessing the consistency of improvements. We have also added a brief discussion of multiple comparisons in §4, noting that while formal correction (e.g., Bonferroni) was not applied given the multi-task exploratory design, the gains remain consistent across independent endpoint families (survival, subtyping, biomarkers) and across all eight aggregators. These changes are incorporated in the revised §4.

    revision: yes

  2. Referee: [§5.3 and §5.4] §5.3 Ablations and §5.4 Biological analyses: while ablations show that removing gene-derived prototypes reduces gains, this does not rule out the possibility that prototypes encode training-set morphology–molecular correlations rather than molecular programs that are both missing from H&E and causally relevant to the clinical endpoints. No external validation cohort, orthogonal assay, or perturbation experiment is presented to distinguish these alternatives.

    Authors: We appreciate the referee’s careful distinction between correlation and causal relevance. The ablations in §5.3 show that substituting gene-derived prototypes with random or purely morphological clusters substantially attenuates the observed gains, indicating that the molecular signal is the primary contributor. The spatial and pathway analyses in §5.4 further demonstrate that prototype affinities recover spatially coherent structures and align with known molecular programs from the transcriptomic data. We have expanded the discussion in both sections to more explicitly frame these results as evidence that MIST reorganizes features along axes that are at least partially orthogonal to standard H&E morphology. However, we acknowledge that internal ablations and analyses cannot fully exclude the possibility of dataset-specific correlations learned from the paired training distribution. External validation cohorts, orthogonal assays, or perturbation experiments would be required to strengthen causal claims; such experiments lie outside the scope and data resources of the present study.

    revision: partial
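The patient-level exclusion and bootstrap intervals described in the first response could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' protocol: slides are represented as hypothetical `(slide_id, patient_id)` pairs, the interval is a simple percentile bootstrap over per-configuration deltas, and resampling treats configurations as exchangeable.

```python
import random
from statistics import mean

def exclude_st_patients(slides, st_patient_ids):
    """Drop every slide whose patient contributed paired ST data, so that
    prototype construction and downstream evaluation share no patients."""
    blocked = set(st_patient_ids)
    return [s for s in slides if s[1] not in blocked]

def bootstrap_mean_ci(deltas, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean performance delta."""
    rng = random.Random(seed)
    n = len(deltas)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(deltas, k=n)) for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(deltas), lo, hi
```

Filtering by patient rather than by slide matters because one patient can contribute several slides. A slide-level filter could leave a sibling slide of an ST patient in the evaluation set.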

standing simulated objections not resolved
  • Distinguishing training-set morphology–molecular correlations from causally relevant molecular programs via external validation cohorts, orthogonal assays, or perturbation experiments

Circularity Check

0 steps flagged

No significant circularity; empirical gains are measured against independent baseline

Full rationale

The paper's central claim consists of observed performance improvements (+3.5% average across 240/256 configurations) when replacing the standard MIL projection layer with MIST. MIST learns cross-modal prototypes from paired spatial transcriptomics data exclusively during training and applies the resulting reorganization to H&E features at inference. The reported gains are quantified on held-out downstream tasks (survival, subtyping, biomarker prediction) against a morphology-only projection baseline rather than being defined by or forced to equal the training objective. No equations, self-citations, or uniqueness theorems are invoked that would reduce the empirical result to a tautology or fitted input. The derivation chain remains self-contained with external paired data and falsifiable evaluation metrics.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that spatial transcriptomics provides an independent molecular signal that can be transferred to H&E morphology via prototype anchoring; no free parameters are explicitly named in the abstract, but the number of prototypes and clustering method are implicit modeling choices.

axioms (1)
  • [domain assumption] Spatial transcriptomics paired with H&E provides a faithful proxy for the molecular state that governs the clinical endpoints of interest.
    Invoked when the method is presented as bridging the modality bottleneck for biomarker and survival prediction.
invented entities (1)
  • Cross-modal prototypes [no independent evidence]
    purpose: To serve as anchors that reorganize H&E features along molecular axes.
    New construct introduced to link transcriptomic clusters to the frozen encoder feature space.

pith-pipeline@v0.9.0 · 5823 in / 1294 out tokens · 27189 ms · 2026-05-20T22:14:39.919059+00:00 · methodology

