Virchow: A Million-Slide Digital Pathology Foundation Model

Adam Casson; Alexander van Eck; Alican Bozkurt; Brandon Rothrock; Christopher Kanan; David Klimstra; Donghun Lee; Eric Robert; Eric Zimmermann; Eugene Vorontsov

arxiv: 2309.07778 · v6 · pith:5TSY5MABnew · submitted 2023-09-14 · 📡 eess.IV · cs.CV · cs.LG · q-bio.TO

Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Siqi Liu, Kristen Severson, Eric Zimmermann, James Hall, Neil Tenenholtz, Nicolo Fusi, Philippe Mathieu, Alexander van Eck, Donghun Lee, Julian Viret, Eric Robert, Yi Kan Wang, Jeremy D. Kunz, Matthew C. H. Lee, Jan Bernhard, Ran A. Godrich, Gerard Oakley, Ewan Millar, Matthew Hanna, Juan Retamero, William A. Moye, Razik Yousfi, Christopher Kanan, David Klimstra, Brandon Rothrock, Thomas J. Fuchs

classification 📡 eess.IV cs.CV cs.LG q-bio.TO

keywords pathology model virchow cancer data images types applications

The use of artificial intelligence to enable precision medicine and decision support systems through the analysis of pathology images has the potential to revolutionize the diagnosis and treatment of cancer. Such applications will depend on models' abilities to capture the diverse patterns observed in pathology images. To address this challenge, we present Virchow, a foundation model for computational pathology. Virchow is a vision transformer with 632 million parameters, trained with the self-supervised DINOv2 algorithm on 1.5 million hematoxylin and eosin stained whole slide images from diverse tissue and specimen types, which is orders of magnitude more data than previous works. The Virchow model enables the development of a pan-cancer detection system with 0.949 overall specimen-level AUC across 17 different cancer types, while also achieving 0.937 AUC on 7 rare cancer types. The Virchow model sets the state of the art on internal and external tile-level benchmarks and on slide-level biomarker prediction tasks. The gains in performance highlight the importance of training on massive pathology image datasets, suggesting that scaling up the data and network architecture can improve accuracy for many high-impact computational pathology applications where limited amounts of training data are available.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SIMPLER: H&E-Informed Representation Learning for Structured Illumination Microscopy
cs.CV · 2026-04 · unverdicted · novelty 7.0

SIMPLER learns biologically grounded SIM representations by progressively aligning them with H&E images through multiple self-supervised objectives, outperforming scratch-trained or H&E-only models on downstream tasks...

MorphDistill: Distilling Unified Morphological Knowledge from Pathology Foundation Models for Colorectal Cancer Survival Prediction
cs.CV · 2026-04 · unverdicted · novelty 6.0

MorphDistill creates a CRC-specific encoder by distilling inter-sample relationships from multiple pathology foundation models, achieving AUC 0.68 and C-index 0.661 on stage III CRC cohorts with an 8% relative gain ov...