pith. sign in

arxiv: 2604.03630 · v1 · submitted 2026-04-04 · 💻 cs.AI · q-bio.QM

A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery and Clinical Prediction

classification 💻 cs.AI q-bio.QM
keywords textbfspatialacrossdiscoveryexpressiongenemodelstorm
0
0 comments X
read the original abstract

Spatial transcriptomics (ST) enables gene expression mapping within anatomical context but remains costly and low-throughput. Hematoxylin and eosin (H\&E) staining offers rich morphology yet lacks molecular resolution. We present \textbf{\ours} (\textbf{S}patial \textbf{T}ranscriptomics and hist\textbf{O}logy \textbf{R}epresentation \textbf{M}odel), a foundation model trained on 1.2 million spatially resolved transcriptomic profiles with matched histology across 18 organs. Using a hierarchical architecture integrating morphological features, gene expression, and spatial context, STORM bridges imaging and omics through robust molecular--morphological representations. STORM enhances spatial domain discovery, producing biologically coherent tissue maps, and outperforms existing methods in predicting spatial gene expression from H\&E images across 11 tumor types. The model is platform-agnostic, performing consistently across Visium, Xenium, Visium HD, and CosMx. Applied to 23 independent cohorts comprising 7,245 patients, STORM significantly improves immunotherapy response prediction and prognostication over established biomarkers, providing a scalable framework for spatially informed discovery and clinical precision medicine.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Bridging the Modality Bottleneck in Pathology MIL through Virtual Molecular Staining

    q-bio.QM 2026-05 unverdicted novelty 6.0

    MIST augments MIL projection layers with cross-modal gene-expression prototypes derived from spatial transcriptomics, yielding consistent gains on survival, subtyping, and biomarker tasks across 23 endpoints and 8 agg...