Cross-Modal Knowledge Distillation from Spatial Transcriptomics to Histology

Arbel Hizmi; Artemii Bakulin; Nir Yosef; Shai Bagon

arxiv: 2604.09076 · v1 · submitted 2026-04-10 · 💻 cs.CV

Pith reviewed 2026-05-10 16:46 UTC · model grok-4.3

classification 💻 cs.CV

keywords spatial transcriptomics · histology · knowledge distillation · tissue niches · cross-modal learning · H&E images · cell-type composition · tissue organization

The pith

A distilled histology model identifies tissue niches matching transcriptomics better than image-only baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes distilling niche structure from paired spatial transcriptomics and H&E histology into a model that operates on histology images alone. This distilled model shows substantially higher agreement with transcriptomics-derived niches than unsupervised morphology baselines using the same image features, and the niches align with actual cell-type neighborhoods. The approach trains on paired data but applies to new H&E slides without any transcriptomics input. It works across tissue types and disease contexts, making molecular tissue organization accessible from abundant histology data.

Core claim

Cross-modal knowledge distillation from paired spatial transcriptomics and H&E data enables a histology-only model to achieve substantially higher agreement with transcriptomics-derived niche structure than unsupervised morphology-based baselines trained on identical image features, while recovering biologically meaningful neighborhood composition as confirmed by cell-type analysis.

What carries the argument

Cross-modal knowledge distillation that transfers transcriptomics-derived niche labels to supervise training of a histology image model.
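
As a rough illustration of what label-based distillation can look like, the sketch below fits a linear softmax student on toy image features, supervised by hard niche labels that stand in for a teacher's output. It is a minimal sketch under stated assumptions, not the paper's method: this summary does not specify the student architecture, loss, or feature extractor, and `train_student`, `predict`, `make_patches`, and the synthetic two-niche data are all hypothetical.

```python
import math
import random


def class_scores(W, b, x):
    """Linear score of feature vector x for each niche."""
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(W))]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def train_student(features, teacher_labels, n_niches, epochs=100, lr=0.1):
    """Fit a softmax student on image features with cross-entropy against teacher labels."""
    dim, n = len(features[0]), len(features)
    W = [[0.0] * dim for _ in range(n_niches)]
    b = [0.0] * n_niches
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_niches)]
        gb = [0.0] * n_niches
        for x, y in zip(features, teacher_labels):
            p = softmax(class_scores(W, b, x))
            for k in range(n_niches):
                err = p[k] - (1.0 if k == y else 0.0)  # gradient of cross-entropy
                gb[k] += err
                for j in range(dim):
                    gW[k][j] += err * x[j]
        for k in range(n_niches):
            b[k] -= lr * gb[k] / n
            for j in range(dim):
                W[k][j] -= lr * gW[k][j] / n
    return W, b


def predict(W, b, x):
    s = class_scores(W, b, x)
    return max(range(len(s)), key=s.__getitem__)


rng = random.Random(0)


def make_patches(n_per_niche):
    """Toy 2-D 'image features'; labels play the role of teacher niche labels."""
    feats, labels = [], []
    for niche, (cx, cy) in enumerate([(-2.0, -2.0), (2.0, 2.0)]):
        for _ in range(n_per_niche):
            feats.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(niche)
    return feats, labels


train_x, train_y = make_patches(30)   # paired region: teacher labels available
test_x, test_y = make_patches(10)     # held-out region: histology features only
W, b = train_student(train_x, train_y, n_niches=2)
accuracy = sum(predict(W, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out agreement with teacher labels: {accuracy:.2f}")
```

At inference time only `predict` is needed, which mirrors the claim that no transcriptomic input is required once training is done.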

If this is right

The model recovers biologically meaningful neighborhood composition verified by cell-type analysis.
It achieves higher agreement with transcriptomics niche structure than morphology-based unsupervised baselines.
It applies to held-out tissue regions using histology alone at inference time.
The gains hold across multiple tissue types and disease contexts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Large archives of existing H&E slides could be reanalyzed for molecularly defined niches without new transcriptomics experiments.
The approach offers a route to combine morphological and molecular views of tissue in routine pathology workflows.
Similar distillation might be explored with other abundant imaging modalities paired to transcriptomics.

Load-bearing premise

Paired spatial transcriptomics and H&E data supply a consistent signal that transfers to held-out histology images without transcriptomics input.

What would settle it

On independent paired test samples, if the distilled model's niche assignments match transcriptomics no better than unsupervised image-feature clustering, the central claim fails.
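
One way to make this test concrete is to score each method's niche assignments against the transcriptomics-derived labels with the adjusted Rand index (ARI), a metric the referee report below also suggests. The sketch is illustrative only: the label vectors are invented toy data, and `adjusted_rand_index` is a hand-written helper rather than the evaluation code used in the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_a)
    if n < 2 or n != len(labels_b):
        raise ValueError("need two label lists of equal length >= 2")
    index = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. both partitions are one cluster
        return 1.0
    return (index - expected) / (max_index - expected)


# Toy niche labels for nine held-out patches (hypothetical values).
teacher = [0, 0, 0, 1, 1, 1, 2, 2, 2]      # transcriptomics-derived niches
distilled = [1, 1, 1, 2, 2, 2, 0, 0, 0]    # same partition, different label names
baseline = [0, 0, 1, 1, 1, 1, 2, 0, 2]     # unsupervised clustering with errors

ari_distilled = adjusted_rand_index(teacher, distilled)
ari_baseline = adjusted_rand_index(teacher, baseline)
print(f"distilled ARI = {ari_distilled:.3f}, baseline ARI = {ari_baseline:.3f}")
```

Because ARI ignores label names, a student that recovers the same partition under a different numbering still scores 1. The central claim would fail if, on real held-out samples, the distilled score were no higher than the baseline score.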

Figures

Figures reproduced from arXiv: 2604.09076 by Arbel Hizmi, Artemii Bakulin, Nir Yosef, Shai Bagon.

**Figure 1.** Framework overview: Our goal is to transfer spatially resolved molecular niche structure from spatial transcriptomics to histology, so that niche organization can be predicted from H&E alone at inference time. The key idea is that transcriptomics provides a richer view of local tissue state, while histology is abundant and widely available. During training, we therefore use a frozen spatial transcriptomic…

**Figure 2.** Representative H&E slide of ovarian cancer.

**Figure 3.** Niche segmentation of healthy human colon into 20 distinct niches: In our supervised distillation framework, the image-based student model (ours) learns to replicate the teacher’s expression-derived segmentation using histology features alone. Each color corresponds to a distinct niche. For example, in the teacher and student results, purple patches mark B-cell follicles, warm tones (orange, pink) delineat…

**Figure 4.** Niche segmentation in human cervix cancer into 20 distinct niches: Our model (student) closely replicates the teacher’s expression-derived segmentation from histology alone, with both models recovering fine-grained tissue structure including tumor epithelial regions, stromal infiltration, and the precise boundaries of invasive carcinoma nests. Histology-NOLAN and Histology-Leiden produce substantially coar…

Original abstract

Spatial transcriptomics provides a molecularly rich description of tissue organization, enabling unsupervised discovery of tissue niches -- spatially coherent regions of distinct cell-type composition and function that are relevant to both biological research and clinical interpretation. However, spatial transcriptomics remains costly and scarce, while H&E histology is abundant but carries a less granular signal. We propose to leverage paired spatial transcriptomics and H&E data to transfer transcriptomics-derived niche structure to a histology-only model via cross-modal distillation. Across multiple tissue types and disease contexts, the distilled model achieves substantially higher agreement with transcriptomics-derived niche structure than unsupervised morphology-based baselines trained on identical image features, and recovers biologically meaningful neighborhood composition as confirmed by cell-type analysis. The resulting framework leverages paired spatial transcriptomic and H&E data during training, and can then be applied to held-out tissue regions using histology alone, without any transcriptomic input at inference time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper distills transcriptomics niches into an H&E-only model via paired data and beats unsupervised image baselines, but the gains may reflect supervised label fitting more than robust cross-modal transfer.

The main point is that the authors derive niches from spatial transcriptomics, then use cross-modal distillation on paired H&E images to train a model that predicts those niches from histology alone. The paper reports higher agreement with the transcriptomics-derived niches than unsupervised morphology baselines achieve on the same image features, plus cell-type checks suggesting that the outputs are biologically plausible across a few tissue types and disease settings.

Referee Report

2 major / 1 minor

Summary. The paper proposes cross-modal knowledge distillation to transfer unsupervised niche structures (spatially coherent regions defined by cell-type composition) discovered from paired spatial transcriptomics to a histology-only model operating on H&E images. Training uses paired data; inference requires only histology. The central empirical claim is that the distilled model achieves substantially higher agreement with transcriptomics-derived niches than unsupervised morphology baselines trained on identical image features, across multiple tissue types and disease contexts, while also recovering biologically meaningful neighborhood composition via cell-type analysis.

Significance. If the claims are substantiated with rigorous controls, the work would provide a practical bridge between scarce molecular data and abundant H&E slides, enabling transcriptomics-informed niche analysis at scale in computational pathology. The framework's use of paired data only at training time and its reported outperformance over same-feature unsupervised baselines are strengths that, if verified, would constitute a clear advance over purely morphology-driven methods.

major comments (2)

[Evaluation] Evaluation section: the abstract states that the distilled model achieves 'substantially higher agreement' than unsupervised morphology baselines on identical image features, yet provides no quantitative metrics (e.g., adjusted Rand index, normalized mutual information), no description of baseline implementations, no data-split protocol, and no explicit controls for paired-sample leakage. Without these, it is impossible to determine whether reported gains reflect genuine cross-modal signal transfer or supervised fitting to training-pair idiosyncrasies.
[Methods/Results] Methods and Results: the central premise requires that transcriptomics-derived niches remain sufficiently aligned with H&E morphology for distillation to be meaningful and generalizable. No direct test of this alignment (e.g., mutual information between niche labels and image features, or performance drop on completely unpaired external slides) is described. The comparison to unsupervised baselines on the same features does not isolate whether gains arise from transferable histology signals or from the supervised objective simply memorizing niche labels derived from the paired transcriptomics.

minor comments (1)

[Abstract] Abstract: the phrases 'substantially higher agreement' and 'biologically meaningful neighborhood composition' are not accompanied by any numerical values or specific cell-type findings; adding one or two key quantitative results would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of evaluation rigor and the need to better isolate the source of performance gains. We agree that additional quantitative details and controls will strengthen the manuscript and will incorporate them in the revision.

Referee: [Evaluation] Evaluation section: the abstract states that the distilled model achieves 'substantially higher agreement' than unsupervised morphology baselines on identical image features, yet provides no quantitative metrics (e.g., adjusted Rand index, normalized mutual information), no description of baseline implementations, no data-split protocol, and no explicit controls for paired-sample leakage. Without these, it is impossible to determine whether reported gains reflect genuine cross-modal signal transfer or supervised fitting to training-pair idiosyncrasies.

Authors: We agree that explicit quantitative metrics, baseline details, and leakage controls are necessary for full transparency. In the revised manuscript we will report adjusted Rand index (ARI) and normalized mutual information (NMI) between model predictions and transcriptomics-derived niches for both the distilled model and the unsupervised baselines. We will add a dedicated subsection describing baseline implementations (including exact clustering algorithms, feature extractors, and hyperparameters), the data-split protocol (patient- or region-level hold-out), and controls such as performance on held-out paired regions to rule out leakage. These additions will allow readers to verify that gains reflect cross-modal transfer. (Revision: yes)
Referee: [Methods/Results] Methods and Results: the central premise requires that transcriptomics-derived niches remain sufficiently aligned with H&E morphology for distillation to be meaningful and generalizable. No direct test of this alignment (e.g., mutual information between niche labels and image features, or performance drop on completely unpaired external slides) is described. The comparison to unsupervised baselines on the same features does not isolate whether gains arise from transferable histology signals or from the supervised objective simply memorizing niche labels derived from the paired transcriptomics.

Authors: We acknowledge the value of a direct alignment test. The revised manuscript will include mutual information between the transcriptomics-derived niche labels and the image features to quantify alignment. We will also evaluate the distilled model on external unpaired H&E slides (where available in our datasets) and report any performance drop to demonstrate generalizability beyond paired training samples. To further isolate transferable signals from memorization, we will add an ablation study training the same architecture with randomly shuffled niche labels; the resulting performance drop relative to the original supervision will support that the gains derive from biologically meaningful cross-modal information rather than label memorization. (Revision: yes)
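
The shuffled-label ablation proposed in the rebuttal can be sketched on toy data as follows. This is only an illustration of the control's logic, under loud assumptions: a 1-nearest-neighbour classifier stands in for the unspecified student architecture, the two-niche features are synthetic, and the helper names are hypothetical.

```python
import random


def nearest_neighbor_predict(train_x, train_y, x):
    """Return the label of the training patch closest to x (squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - c) ** 2 for a, c in zip(train_x[i], x)))
    return train_y[best]


def heldout_accuracy(train_x, train_y, test_x, test_y):
    hits = sum(nearest_neighbor_predict(train_x, train_y, x) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)


rng = random.Random(0)


def make_patches(n_per_niche):
    """Toy 2-D image features for two well-separated niches."""
    feats, labels = [], []
    for niche, (cx, cy) in enumerate([(-2.0, -2.0), (2.0, 2.0)]):
        for _ in range(n_per_niche):
            feats.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(niche)
    return feats, labels


train_x, train_y = make_patches(30)
test_x, test_y = make_patches(20)

# Supervision with the real teacher labels.
acc_true = heldout_accuracy(train_x, train_y, test_x, test_y)

# Control: break the feature-label link by permuting training labels,
# averaged over several permutations to reduce noise.
shuffled_accs = []
for _ in range(20):
    shuffled = train_y[:]
    rng.shuffle(shuffled)
    shuffled_accs.append(heldout_accuracy(train_x, shuffled, test_x, test_y))
acc_shuffled = sum(shuffled_accs) / len(shuffled_accs)

print(f"true labels: {acc_true:.2f}, shuffled labels: {acc_shuffled:.2f}")
```

A large gap between the two numbers indicates that held-out performance depends on a real relationship between features and labels. Note that this control alone does not show the relationship is biological rather than, say, a slide-level batch effect, which is why the external-slide evaluation is proposed alongside it.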

Circularity Check

0 steps flagged

No circularity: claims rest on held-out external evaluation against independent transcriptomic ground truth

The paper's core derivation trains a cross-modal distillation model on paired spatial transcriptomics + H&E to produce a histology-only predictor of transcriptomics-derived niches. Evaluation compares this predictor's output on held-out histology regions directly to niches obtained from the (unseen) transcriptomic profiles of those same regions, with performance measured against separate unsupervised morphology baselines that use identical image features but no transcriptomic supervision. No equation or step reduces a claimed prediction to a fitted parameter by construction, no niche definition is shown to be derived from the model itself, and no load-bearing premise collapses to a self-citation whose content is unverified. The evaluation protocol therefore remains externally falsifiable and independent of the training objective, satisfying the criteria for a self-contained derivation.
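
The held-out protocol described here depends on splitting by region (or patient) rather than by individual patch, so that no region contributes to both training and evaluation. A minimal sketch of such a split is shown below; the function name, the `test_fraction` parameter, and the region identifiers are hypothetical, and the paper's actual split protocol is not given in this summary.

```python
import random


def region_holdout_split(region_ids, test_fraction=0.25, seed=0):
    """Split patch indices so that every region is entirely train or entirely test."""
    regions = sorted(set(region_ids))
    rng = random.Random(seed)
    rng.shuffle(regions)
    n_test = max(1, round(len(regions) * test_fraction))
    test_regions = set(regions[:n_test])
    train_idx = [i for i, r in enumerate(region_ids) if r not in test_regions]
    test_idx = [i for i, r in enumerate(region_ids) if r in test_regions]
    return train_idx, test_idx


# Twelve patches from four tissue regions (toy identifiers).
region_ids = ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3
train_idx, test_idx = region_holdout_split(region_ids)

train_regions = {region_ids[i] for i in train_idx}
test_regions = {region_ids[i] for i in test_idx}
print("train regions:", sorted(train_regions), "test regions:", sorted(test_regions))
```

Splitting at the patch level instead would let spatially adjacent, near-identical patches land on both sides of the split, which is the paired-sample leakage the referee asks the authors to rule out.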

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review limits visibility into training details; the framework rests on standard supervised learning assumptions plus the domain premise of paired data availability.

axioms (1)

Domain assumption: Paired spatial transcriptomics and H&E samples exist and are representative of the target distribution.
The distillation process requires such pairs for training, and the claim of generalization depends on this pairing being informative.

pith-pipeline@v0.9.0 · 5457 in / 1176 out tokens · 29572 ms · 2026-05-10T16:46:14.285983+00:00 · methodology

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 1 internal anchor

[1] 10x Genomics. Human colon preview data (xenium human colon gene expression panel). https://www.10xgenomics.com/datasets/human-colon-preview-data-xenium-human-colon-gene-expression-panel-1-standard, 2023. In Situ Gene Expression (preview); Xenium Onboard Analysis 1.6.0; Date Published 2023-08-29; healthy sample. 4

[2] 10x Genomics. Ffpe human pancreatic cancer with xenium human multi-tissue and cancer panel. https://www.10xgenomics.com/datasets/pancreatic-cancer-with-xenium-human-multi-tissue-and-cancer-panel-1-standard, 2023. In Situ Gene Expression; Xenium Onboard Analysis 1.6.0; Date Published 2023-11-16. 4, 7

[3] 10x Genomics. Ffpe human brain cancer data with human immuno-oncology profiling panel and custom add-on. https://www.10xgenomics.com/datasets/ffpe-human-brain-cancer-data-with-human-immuno-oncology-profiling-panel-and-custom-add-on-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published...

[4] 10x Genomics. Ffpe human colorectal cancer data with human immuno-oncology profiling panel and custom add-on. https://www.10xgenomics.com/datasets/ffpe-human-colorectal-cancer-data-with-human-immuno-oncology-profiling-panel-and-custom-add-on-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published 2024-05-14. 4

[5] 10x Genomics. Xenium human lung cancer. https://www.10xgenomics.com/datasets/xenium-human-lung-cancer-post-xenium-technote,

[6] In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-06-11. 4

[7] 10x Genomics. Human liver data with xenium human multi-tissue and cancer panel. https://www.10xgenomics.com/datasets/human-liver-data-xenium-human-multi-tissue-and-cancer-panel-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 1.9.0; Date Published 2024-03-05; healthy sample. 4

[8] 10x Genomics. Fresh frozen mouse colon with xenium multimodal cell segmentation. https://www.10xgenomics.com/datasets/fresh-frozen-mouse-colon-with-xenium-multimodal-cell-segmentation-1-standard,

[9] In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published 2024-03-19. 4

[10] 10x Genomics. Mouse pup preview data with xenium mouse tissue atlassing panel. https://www.10xgenomics.com/datasets/mouse-pup-preview-data-xenium-mouse-tissue-atlassing-panel-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-06-11. 4

[11] 10x Genomics. FFPE Human Ovarian Cancer with 5K Human Pan Tissue and Pathways Panel plus 100 Custom Genes. https://www.10xgenomics.com/datasets/xenium-prime-ffpe-human-ovarian-cancer,

[12] Xenium In Situ Gene Expression; analyzed with Xenium Onboard Analysis 3.0.0; published 2024-12-17. 4, 7

[13] 10x Genomics. Ffpe human pancreatic ductal adenocarcinoma data with human immuno-oncology profiling panel. https://www.10xgenomics.com/datasets/ffpe-human-ductal-adenocarcinoma-data-with-human-immuno-oncology-profiling-panel-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published 2024-05-

[14] 10x Genomics. Ffpe human breast cancer with 5k human pan tissue and pathways panel plus 100 custom genes. https://www.10xgenomics.com/datasets/xenium-prime-ffpe-human-breast-cancer, 2024. In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-10-24. 4

[15] 10x Genomics. Ffpe human cervical cancer with 5k human pan tissue and pathways panel plus 100 custom genes. https://www.10xgenomics.com/datasets/xenium-prime-ffpe-human-cervical-cancer,

[16] In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-09-04. 4

[17] 10x Genomics. Preview data: Ffpe human lymph node with 5k pan tissue and pathways panel. https://www.10xgenomics.com/datasets/preview-data-xenium-prime-gene-expression, 2024. In Situ Gene Expression (preview); Xenium Onboard Analysis 3.0.0; Date Published 2024-05-28. 4

[18] 10x Genomics. Xenium v1 human breast ffpe with biomarkers & housekeeping genes custom panel: Sample s2-middle. https://www.10xgenomics.com/datasets/xenium-ffpe-human-breast-biomarkers, 2025. In Situ Gene Expression; Xenium Onboard Analysis; Date Published 2025-12-11. 4, 7

[19] 10x Genomics. Xenium in situ gene and protein expression data for ffpe human renal cell carcinoma. https://www.10xgenomics.com/datasets/xenium-protein-ffpe-human-renal-carcinoma, 2025. In Situ Gene and Protein Expression; Xenium Onboard Analysis 4.0.0; Date Published 2025-09-26. 4

[20] 10x Genomics. Datasets. https://www.10xgenomics.com/datasets, 2026. 4

[21] Artemy Bakulin, Nathan Levy, Can Ergen, Jonas Maaskola, and Nir Yosef. NOLAN: SELF-SUPERVISED FRAMEWORK FOR MAPPING CONTINUOUS TISSUE ORGANIZATION. In ICLR 2025 Workshop on Machine Learning for Genomics Explorations, 2025. 2, 3, 4, 5

[22] Salil S. Bhate, Graham L. Barlow, Christian M. Schürch, and Garry P. Nolan. Tissue schematics map the specialization of immune tissue motifs and their appropriation by tumors. Cell Systems, 13(2):109–130, 2022. 2

[23] Cristian Buciluǎ, Rich Caruana, and Alexandru Niculescu-Mizil. Model compression. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 535–541. ACM, 2006. 3

[24] Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In Proceedings of the International Conference on Computer Vision (ICCV), 2021. 3

[25] Richard J Chen, Tong Ding, Ming Y Lu, Drew FK Williamson, Guillaume Jaume, Bowen Chen, Andrew Zhang, Daniel Shao, Andrew H Song, Muhammad Shaban, et al. Towards a general-purpose foundation model for computational pathology. Nature Medicine, 2024. 2, 3, 4, 5

[26] Weiqing Chen, Pengzhi Zhang, Tu N Tran, Yiwei Xiao, Shengyu Li, Vrutant V Shah, Hao Cheng, Kristopher W Brannan, Keith Youker, and Li Lai. A visual–omics foundation model to bridge histopathology with spatial transcriptomics. Nature Methods, pages 1–15, 2025. 3

[27] Rushin H. Gindra, Giovanni Palla, Mathias Nguyen, Sophia J. Wagner, Manuel Tran, Fabian J Theis, Dieter Saur, Lorin Crawford, and Tingying Peng. A large-scale benchmark of cross-modal learning for histology and gene expression in spatial transcriptomics, 2025. 3

[28] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015. 2

[29] Jian Hu, Xiangjie Li, Kyle Coleman, Amelia Schroeder, Nan Ma, David J Irwin, Edward B Lee, Russell T Shinohara, and Mingyao Li. Spagcn: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nature Methods, 18(11):1342–1351, 2021. 2

[30] Tianyu Huang, Tianze Liu, and Mehrtash Babadi. STPath: a generative foundation model for integrating spatial transcriptomics and whole-slide images. npj Digital Medicine, 8(1):659, 2025. 3

[31] Guillaume Jaume, Lukas Oldenburg, Anurag Jayant Vaidya, Richard J. Chen, Drew FK Williamson, Thomas Peeters, Andrew H. Song, and Faisal Mahmood. Transcriptomics-guided slide representation learning in computational pathology. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 3

[32] Romain Lopez, Jeffrey Regier, Michael B Cole, Michael I Jordan, and Nir Yosef. Deep generative modeling for single-cell transcriptomics. Nature Methods, 15(12):1053–1058,

[33] Maxime Oquab, Timothée Darcet, Theo Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Russell Howes, Po-Yao Huang, Hu Xu, Vasu Sharma, Shang-Wen Li, Wojciech Galuba, Mike Rabbat, Mido Assran, Nicolas Ballas, Gabriel Synnaeve, Ishan Misra, Herve Jegou, Julien Mairal, Patri...

[34] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021. 3

[35] Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio. Fitnets: Hints for thin deep nets. In International Conference on Learning Representations (ICLR), 2015. 3

[36] Charlie Saillard, Rodolphe Jenatton, Felipe Llinares-López, Zelda Mariet, David Cahané, Eric Durand, and Jean-Philippe Vert. H-optimus-0, 2024. 2

[37] Benoît Schmauch, Alberto Romagnoni, Elodie Pronier, et al. A deep learning model to predict RNA-seq expression of tumours from whole slide images. Nature Communications, 11(1):3877, 2020. 3

[38] Vipul Singhal, Nigel Chou, Jinyue Lee, Monica Dhar, Milly M Chang, Yan Ru Poh, Meret Geuenich, Kok Hao Chen, Sang-In Choi, Na Shao, et al. Banksy: a unified framework for clustering and spatial segmentation of heterogeneous tissue datasets. Nature Genetics, 56(2):334–344,

[39] Yonglong Tian, Dilip Krishnan, and Phillip Isola. Contrastive representation distillation. In International Conference on Learning Representations (ICLR), 2020. 3

[40] Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Kristen Severson, Eric Zimmermann, James Hall, Neil Tenenholtz, Nicolo Fusi, Ellen Yang, Philippe Mathieu, Alexander van Eck, Donghun Lee, Julian Viret, Eric Robert, Yi Kan Wang, Jeremy D. Kunz, Matthew C. H. Lee, Jan H. Bernhard, Ran A. Godrich, Gerard Oakley, Ewan Mil...

[41] Ronald Xie, Kuan Pang, Sai Chung, Catia Perciani, Sonya MacParland, Bo Wang, and Gary Bader. Spatially resolved gene expression prediction from histology images via bi-modal contrastive learning. In Advances in Neural Information Processing Systems, pages 70626–70637. Curran Associates, Inc., 2023. 3

[42] Hanwen Xu, Naoto Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, Tristan Naumann, Cliff Wong, Zelalem Gero, Javier González, Yu Gu, Yanbo Xu, Mu Wei, Wenhui Wang, Shuming Ma, Furu Wei, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Jaylen Rosemon, Tucker Bower, Soohee Lee, Roshanthi Weerasinghe, Bill J. Wright, Ari Robicsek, Brian Piening, Carlo Bifulco, S...

[43] Zhiyuan Yuan. Mender: fast and scalable tissue structure identification in spatial omics data. Nature Communications, 15(1):207, 2024. 2

[1] [1]

Human colon preview data (xenium hu- man colon gene expression panel).https : / / www

10x Genomics. Human colon preview data (xenium hu- man colon gene expression panel).https : / / www . 10xgenomics . com / datasets / human - colon - preview - data - xenium - human - colon - gene - expression - panel - 1 - standard, 2023. In Situ Gene Expression (preview); Xenium Onboard Analysis 1.6.0; Date Published 2023-08-29; healthy sample. 4

work page 2023

[2] [2]

10x Genomics. Ffpe human pancreatic cancer with xe- nium human multi-tissue and cancer panel.https:// www.10xgenomics.com/datasets/pancreatic- cancer- with- xenium- human- multi- tissue- and- cancer- panel- 1- standard, 2023. In Situ Gene Expression; Xenium Onboard Analysis 1.6.0; Date Published 2023-11-16. 4, 7

work page 2023

[3] [3]

Ffpe human brain cancer data with human immuno-oncology profiling panel and custom add-on.https : / / www

10x Genomics. Ffpe human brain cancer data with human immuno-oncology profiling panel and custom add-on.https : / / www . 10xgenomics . com / datasets / ffpe - human - brain - cancer - data - with - human - immuno - oncology - profiling - panel- and- custom- add- on- 1- standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published...

work page 2024

[4] [4]

10x Genomics. Ffpe human colorectal cancer data with human immuno-oncology profiling panel and custom add- on.https://www.10xgenomics.com/datasets/ ffpe-human-colorectal-cancer-data-with- human - immuno - oncology - profiling - panel - and- custom- add- on- 1- standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published 2024-05-14. 4

work page 2024

[5] [5]

Xenium human lung cancer.https: / / www

10x Genomics. Xenium human lung cancer.https: / / www . 10xgenomics . com / datasets / xenium - human- lung- cancer- post- xenium- technote,

work page

[6] [6]

In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-06-11. 4

work page 2024

[7] [7]

Human liver data with xenium human multi- tissue and cancer panel.https://www.10xgenomics

10x Genomics. Human liver data with xenium human multi- tissue and cancer panel.https://www.10xgenomics. com / datasets / human - liver - data - xenium - human- multi- tissue- and- cancer- panel- 1- standard, 2024. In Situ Gene Expression; Xenium On- board Analysis 1.9.0; Date Published 2024-03-05; healthy sample. 4

work page 2024

[8] [8]

Fresh frozen mouse colon with xenium multimodal cell segmentation.https : / / www

10x Genomics. Fresh frozen mouse colon with xenium multimodal cell segmentation.https : / / www . 10xgenomics . com / datasets / fresh - frozen - mouse - colon - with - xenium - multimodal-cell-segmentation-1-standard,

work page

[9] [9]

In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published 2024-03-19. 4

work page 2024

[10]

10x Genomics. Mouse pup preview data with Xenium mouse tissue atlassing panel. https://www.10xgenomics.com/datasets/mouse-pup-preview-data-xenium-mouse-tissue-atlassing-panel-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-06-11. 4

[11]

10x Genomics. FFPE Human Ovarian Cancer with 5K Human Pan Tissue and Pathways Panel plus 100 Custom Genes. https://www.10xgenomics.com/datasets/xenium-prime-ffpe-human-ovarian-cancer, Xenium In Situ Gene Expression; analyzed with Xenium Onboard Analysis 3.0.0; published 2024-12-17. 4, 7

[13]

10x Genomics. FFPE human pancreatic ductal adenocarcinoma data with human immuno-oncology profiling panel. https://www.10xgenomics.com/datasets/ffpe-human-ductal-adenocarcinoma-data-with-human-immuno-oncology-profiling-panel-1-standard, 2024. In Situ Gene Expression; Xenium Onboard Analysis 2.0.0; Date Published 2024-05-

[14]

10x Genomics. FFPE human breast cancer with 5k human pan tissue and pathways panel plus 100 custom genes. https://www.10xgenomics.com/datasets/xenium-prime-ffpe-human-breast-cancer, 2024. In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-10-24. 4

[15]

10x Genomics. FFPE human cervical cancer with 5k human pan tissue and pathways panel plus 100 custom genes. https://www.10xgenomics.com/datasets/xenium-prime-ffpe-human-cervical-cancer, In Situ Gene Expression; Xenium Onboard Analysis 3.0.0; Date Published 2024-09-04. 4

[17]

10x Genomics. Preview data: FFPE human lymph node with 5k pan tissue and pathways panel. https://www.10xgenomics.com/datasets/preview-data-xenium-prime-gene-expression, 2024. In Situ Gene Expression (preview); Xenium Onboard Analysis 3.0.0; Date Published 2024-05-28. 4

[18]

10x Genomics. Xenium v1 human breast FFPE with biomarkers & housekeeping genes custom panel: Sample s2-middle. https://www.10xgenomics.com/datasets/xenium-ffpe-human-breast-biomarkers, 2025. In Situ Gene Expression; Xenium Onboard Analysis; Date Published 2025-12-11. 4, 7

[19]

10x Genomics. Xenium in situ gene and protein expression data for FFPE human renal cell carcinoma. https://www.10xgenomics.com/datasets/xenium-protein-ffpe-human-renal-carcinoma, 2025. In Situ Gene and Protein Expression; Xenium Onboard Analysis 4.0.0; Date Published 2025-09-26. 4

[20]

10x Genomics. Datasets. https://www.10xgenomics.com/datasets, 2026. 4

[21]

Artemy Bakulin, Nathan Levy, Can Ergen, Jonas Maaskola, and Nir Yosef. NOLAN: SELF-SUPERVISED FRAMEWORK FOR MAPPING CONTINUOUS TISSUE ORGANIZATION. In ICLR 2025 Workshop on Machine Learning for Genomics Explorations, 2025. 2, 3, 4, 5

[22]

Salil S. Bhate, Graham L. Barlow, Christian M. Schürch, and Garry P. Nolan. Tissue schematics map the specialization of immune tissue motifs and their appropriation by tumors. Cell Systems, 13(2):109–130, 2022. 2

[23]

Cristian Buciluǎ, Rich Caruana, and Alexandru Niculescu-Mizil. Model compression. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 535–541. ACM, 2006. 3

[24]

Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In Proceedings of the International Conference on Computer Vision (ICCV), 2021. 3

[25]

Richard J Chen, Tong Ding, Ming Y Lu, Drew FK Williamson, Guillaume Jaume, Bowen Chen, Andrew Zhang, Daniel Shao, Andrew H Song, Muhammad Shaban, et al. Towards a general-purpose foundation model for computational pathology. Nature Medicine, 2024. 2, 3, 4, 5

[26]

Weiqing Chen, Pengzhi Zhang, Tu N Tran, Yiwei Xiao, Shengyu Li, Vrutant V Shah, Hao Cheng, Kristopher W Brannan, Keith Youker, and Li Lai. A visual–omics foundation model to bridge histopathology with spatial transcriptomics. Nature Methods, pages 1–15, 2025. 3

[27]

Rushin H. Gindra, Giovanni Palla, Mathias Nguyen, Sophia J. Wagner, Manuel Tran, Fabian J Theis, Dieter Saur, Lorin Crawford, and Tingying Peng. A large-scale benchmark of cross-modal learning for histology and gene expression in spatial transcriptomics, 2025. 3

[28]

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015. 2

[29]

Jian Hu, Xiangjie Li, Kyle Coleman, Amelia Schroeder, Nan Ma, David J Irwin, Edward B Lee, Russell T Shinohara, and Mingyao Li. SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nature Methods, 18(11):1342–1351, 2021. 2

[30]

Tianyu Huang, Tianze Liu, and Mehrtash Babadi. STPath: a generative foundation model for integrating spatial transcriptomics and whole-slide images. npj Digital Medicine, 8(1):659, 2025. 3

[31]

Guillaume Jaume, Lukas Oldenburg, Anurag Jayant Vaidya, Richard J. Chen, Drew FK Williamson, Thomas Peeters, Andrew H. Song, and Faisal Mahmood. Transcriptomics-guided slide representation learning in computational pathology. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 3

[32]

Romain Lopez, Jeffrey Regier, Michael B Cole, Michael I Jordan, and Nir Yosef. Deep generative modeling for single-cell transcriptomics. Nature Methods, 15(12):1053–1058,

[33]

Maxime Oquab, Timothée Darcet, Theo Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Russell Howes, Po-Yao Huang, Hu Xu, Vasu Sharma, Shang-Wen Li, Wojciech Galuba, Mike Rabbat, Mido Assran, Nicolas Ballas, Gabriel Synnaeve, Ishan Misra, Herve Jegou, Julien Mairal, Patri...

[34]

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021. 3

[35]

Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio. FitNets: Hints for thin deep nets. In International Conference on Learning Representations (ICLR), 2015. 3

[36]

Charlie Saillard, Rodolphe Jenatton, Felipe Llinares-López, Zelda Mariet, David Cahané, Eric Durand, and Jean-Philippe Vert. H-optimus-0, 2024. 2

[37]

Benoît Schmauch, Alberto Romagnoni, Elodie Pronier, et al. A deep learning model to predict RNA-seq expression of tumours from whole slide images. Nature Communications, 11(1):3877, 2020. 3

[38]

Vipul Singhal, Nigel Chou, Jinyue Lee, Monica Dhar, Milly M Chang, Yan Ru Poh, Meret Geuenich, Kok Hao Chen, Sang-In Choi, Na Shao, et al. Banksy: a unified framework for clustering and spatial segmentation of heterogeneous tissue datasets. Nature Genetics, 56(2):334–344,

[39]

Yonglong Tian, Dilip Krishnan, and Phillip Isola. Contrastive representation distillation. In International Conference on Learning Representations (ICLR), 2020. 3

[40]

Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Kristen Severson, Eric Zimmermann, James Hall, Neil Tenenholtz, Nicolo Fusi, Ellen Yang, Philippe Mathieu, Alexander van Eck, Donghun Lee, Julian Viret, Eric Robert, Yi Kan Wang, Jeremy D. Kunz, Matthew C. H. Lee, Jan H. Bernhard, Ran A. Godrich, Gerard Oakley, Ewan Mil...

[41]

Ronald Xie, Kuan Pang, Sai Chung, Catia Perciani, Sonya MacParland, Bo Wang, and Gary Bader. Spatially resolved gene expression prediction from histology images via bi-modal contrastive learning. In Advances in Neural Information Processing Systems, pages 70626–70637. Curran Associates, Inc., 2023. 3

[42]

Hanwen Xu, Naoto Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, Tristan Naumann, Cliff Wong, Zelalem Gero, Javier González, Yu Gu, Yanbo Xu, Mu Wei, Wenhui Wang, Shuming Ma, Furu Wei, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Jaylen Rosemon, Tucker Bower, Soohee Lee, Roshanthi Weerasinghe, Bill J. Wright, Ari Robicsek, Brian Piening, Carlo Bifulco, S...

[43]

Zhiyuan Yuan. Mender: fast and scalable tissue structure identification in spatial omics data. Nature Communications, 15(1):207, 2024. 2