Evaluating Computational Pathology Foundation Models for Prostate Cancer Grading under Distribution Shifts
Pith reviewed 2026-05-23 19:18 UTC · model grok-4.3
The pith
Pathology foundation models for prostate cancer grading lose substantial performance when moved to a new hospital site.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Large-scale pretraining produces strong in-distribution representations for prostate cancer grading from whole-slide images, yet these representations do not transfer robustly across collection sites; cross-site visual shifts dominate label-distribution shifts in both performance loss and feature-space separation.
What carries the argument
Frozen patch-level encoders from pathology foundation models inserted into weakly supervised multiple-instance learning models for slide-level grading, together with t-SNE or similar visualization of site versus grade clustering in the resulting embeddings.
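As a rough illustration of this setup, the sketch below pools frozen patch embeddings into a single slide embedding. The page does not say which aggregator the paper uses, so attention-based MIL pooling (in the style of Ilse et al., 2018) is an assumption here. The names `attention_pool`, `V`, `w`, and the toy dimensions are hypothetical, and random numbers stand in for real encoder outputs.

```python
import math
import random

def attention_pool(patch_feats, V, w):
    """Attention-based MIL pooling: K patch vectors -> one slide vector.

    patch_feats: list of K feature vectors (length D) from a frozen encoder.
    V: HIDDEN x D projection matrix, w: HIDDEN-length vector (trainable in practice).
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in patch_feats:
        # Unnormalized attention score: w^T tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Numerically stable softmax over the patches of this slide.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Slide embedding is the attention-weighted sum of patch features.
    dim = len(patch_feats[0])
    slide = [sum(a * h[d] for a, h in zip(attn, patch_feats)) for d in range(dim)]
    return slide, attn

# Toy usage: random values are placeholders for frozen PFM patch embeddings.
rng = random.Random(0)
D, HIDDEN, K = 8, 4, 5
patches = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[rng.gauss(0, 0.1) for _ in range(D)] for _ in range(HIDDEN)]
w = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]
slide_embedding, attention = attention_pool(patches, V, w)
```

In a real pipeline `V` and `w` would be trained jointly with a grade classifier on top of `slide_embedding`, while the patch encoder stays frozen.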
If this is right
- All evaluated pathology foundation models exhibit clear accuracy drops under the Radboud-to-Karolinska site transfer.
- The same models show smaller degradation when only the label distribution over grade groups is shifted.
- Embeddings from every tested foundation model continue to separate primarily by collection site rather than by cancer grade.
- Generalization remains limited by the diversity of the data used to train the downstream slide-level predictor.
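The label-distribution shift in the second bullet can be emulated by resampling slides so that grade frequencies follow a chosen target. The sketch below is one way to do that under stated assumptions: `subsample_to_distribution` is a hypothetical helper, the toy labels are invented, and the paper's own resampling protocol is not described on this page.

```python
import random
from collections import Counter, defaultdict

def subsample_to_distribution(labels, target_probs, n_total, seed=0):
    """Pick slide indices so that grade frequencies follow target_probs.

    labels: grade label per slide.
    target_probs: dict mapping grade -> desired fraction of the subset.
    Quotas are rounded per grade, so the result may differ slightly from n_total.
    Raises ValueError if a grade has too few slides to fill its quota.
    """
    rng = random.Random(seed)
    by_grade = defaultdict(list)
    for idx, grade in enumerate(labels):
        by_grade[grade].append(idx)
    chosen = []
    for grade, prob in target_probs.items():
        quota = int(round(prob * n_total))
        pool = by_grade.get(grade, [])
        if quota > len(pool):
            raise ValueError(f"grade {grade}: need {quota}, have {len(pool)}")
        # Sample without replacement inside each grade.
        chosen.extend(rng.sample(pool, quota))
    return sorted(chosen)

# Toy usage: a skewed label set resampled to a uniform grade distribution.
toy_labels = [0] * 50 + [1] * 30 + [2] * 20
uniform = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
subset = subsample_to_distribution(toy_labels, uniform, n_total=30)
subset_counts = Counter(toy_labels[i] for i in subset)
```

Training on the original set and testing on such a subset (or the reverse) changes only the grade frequencies, which keeps image appearance fixed while the label distribution moves.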
Where Pith is reading between the lines
- Methods that explicitly align or adapt representations across sites may be required before these models can be deployed across institutions.
- Collecting pretraining data from multiple sites and scanners could reduce the observed domain gaps.
- The same visual-shift problem is likely to appear in other computational pathology tasks that involve different staining batches or scanner vendors.
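One very simple form of the alignment mentioned in the first bullet is to standardize features separately per site. This is a sketch of a generic baseline, not a method evaluated in the paper. `standardize_per_site` is a hypothetical helper, and the approach assumes that unlabeled features from the target site are available at adaptation time.

```python
import statistics
from collections import defaultdict

def standardize_per_site(features, sites, eps=1e-8):
    """Z-score every feature dimension separately within each site.

    features: list of feature vectors, sites: site identifier per vector.
    Returns a new list of vectors in the same order as the input.
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    dim = len(features[0])
    out = [None] * len(features)
    for site, idxs in by_site.items():
        # Per-site mean and population standard deviation for each dimension.
        means = [statistics.fmean([features[i][d] for i in idxs]) for d in range(dim)]
        stds = [statistics.pstdev([features[i][d] for i in idxs]) for d in range(dim)]
        for i in idxs:
            out[i] = [(features[i][d] - means[d]) / (stds[d] + eps) for d in range(dim)]
    return out

# Toy usage: two sites whose features differ by an offset and a scale.
toy_features = [[0.0], [2.0], [10.0], [14.0]]
toy_sites = ["A", "A", "B", "B"]
aligned = standardize_per_site(toy_features, toy_sites)
```

This removes only first- and second-order site differences per dimension; whether that is enough to close the gap reported above is an open question.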
Load-bearing premise
The Radboud-to-Karolinska split of PANDA and the weakly supervised slide-level modeling choices applied to it are representative of the distribution shifts that would appear in real clinical deployment.
What would settle it
Repeating the cross-site evaluation on a third independent collection site that uses similar staining and scanning protocols and finding no large performance drop for any of the tested foundation models.
Original abstract
Pathology foundation models (PFMs) have emerged as powerful pretrained encoders for computational pathology, but their robustness under clinically relevant distribution shifts remains insufficiently understood. We benchmark the robustness of recent PFMs in the setting of prostate cancer grading from whole-slide images (WSIs). Using the PANDA dataset, we evaluate PFMs as frozen patch-level feature extractors within weakly supervised slide-level grading models, and assess robustness to two important forms of distribution shift: shifts in WSI image appearance across collection sites, and shifts in the label distribution over cancer grade groups. Across in-distribution settings, PFMs consistently achieve strong performance and clearly outperform a natural-image baseline. Under cross-site transfer from Radboud to Karolinska, however, performance drops substantially for all models, showing that large-scale pretraining alone does not guarantee robust downstream generalization. In contrast, PFMs are less sensitive to label-distribution shift, indicating that visually grounded domain shift is the dominant challenge. Representation analysis further supports these findings by revealing persistent domain separation between sites across all PFMs. While grade-related structure is present, it is comparatively weak, indicating that domain-related variation dominates in the learned feature space. Together, these results provide a comprehensive benchmark of PFMs under distribution shift and highlight an important practical message: although PFMs provide strong representations, generalizability remains constrained by the quality and diversity of the data used to train downstream prediction models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript benchmarks pathology foundation models (PFMs) as frozen patch-level encoders within weakly supervised slide-level models for prostate cancer grading on the PANDA dataset. It reports strong in-distribution performance that outperforms a natural-image baseline, substantial degradation under cross-site shift (Radboud to Karolinska), comparatively smaller effects from label-distribution shift, and representation analysis showing persistent domain separation that dominates grade-related structure in the feature space. The central claim is that large-scale pretraining alone does not guarantee robust generalization and that visually grounded domain shift is the dominant practical challenge.
Significance. If the empirical patterns hold, the work supplies a useful benchmark that demonstrates concrete limits of current PFMs under site-level appearance shifts, with the practical takeaway that downstream training data diversity matters more than pretraining scale alone. The representation analysis adds interpretive value beyond accuracy numbers.
major comments (2)
- [Results, cross-site transfer paragraph] Cross-site transfer results: the claim of a 'substantial' drop for all models is presented without reported confidence intervals, p-values, or paired statistical tests against the in-distribution baselines; this weakens the assertion that the degradation is consistent and load-bearing for the conclusion that pretraining does not guarantee robustness.
- [Representation analysis subsection] Representation analysis: the statement that 'domain-related variation dominates' rests on visual inspection of embeddings; without quantitative support such as domain-classification accuracy on the frozen features or a direct comparison of cluster separation metrics between domain and grade labels, the dominance claim remains qualitative and does not fully substantiate that visual shift is the primary driver.
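The domain-classification check requested in the second major comment could look roughly like the sketch below. To stay dependency-free it swaps in a nearest-centroid classifier for a linear probe, so it is a stand-in rather than the analysis the referee names. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def fit_centroids(features, labels):
    """Return the mean feature vector for every label."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(centroids, x):
    """Assign x to the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

def probe_accuracy(train_x, train_y, test_x, test_y):
    """Fit centroids on the training split and report held-out accuracy."""
    centroids = fit_centroids(train_x, train_y)
    hits = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)

# Toy usage: the same frozen features probed for site and then for grade.
train_feats = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
test_feats = [[0.2, 0.3], [5.1, 5.4]]
site_acc = probe_accuracy(train_feats, ["R", "R", "K", "K"], test_feats, ["R", "K"])
grade_acc = probe_accuracy(train_feats, [0, 1, 0, 1], test_feats, [1, 0])
```

If site is recovered far more accurately than grade from the same frozen features, that would give the dominance claim a number to rest on; each accuracy should be read against its own chance level, since the two tasks have different numbers of classes.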
minor comments (2)
- [Abstract] The abstract states that PFMs are 'less sensitive' to label-distribution shift but does not quantify the relative magnitude of the two shift types (e.g., via delta-AUC or normalized drop); adding a direct side-by-side comparison would improve clarity.
- [Methods] The description of the weakly supervised slide-level modeling choices (MIL aggregator, aggregation function, etc.) is referenced but not fully specified in the provided text; expanding this in the methods would aid reproducibility.
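The side-by-side comparison suggested in the first minor comment reduces to simple arithmetic. The sketch below computes an absolute and a relative drop for each shift type; `performance_drop` is a hypothetical helper and the scores are placeholders, not results from the paper.

```python
def performance_drop(in_dist, shifted):
    """Return (absolute drop, drop relative to the in-distribution score)."""
    if in_dist == 0:
        raise ValueError("in-distribution score must be non-zero")
    absolute = in_dist - shifted
    return absolute, absolute / in_dist

# Placeholder scores for one model; these are NOT values reported in the paper.
in_distribution = 0.90
under_site_shift = 0.60
under_label_shift = 0.85

site_abs, site_rel = performance_drop(in_distribution, under_site_shift)
label_abs, label_rel = performance_drop(in_distribution, under_label_shift)
# Ratio > 1 means the site shift costs more than the label shift.
shift_ratio = site_abs / label_abs
```

Reporting `site_rel`, `label_rel`, and their ratio per model in one table would make the "less sensitive" statement directly checkable.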
Simulated Author's Rebuttal
We thank the referee for these constructive comments, which highlight opportunities to strengthen the statistical rigor and quantitative support in our manuscript. We address each major comment below and will revise the paper accordingly.
Point-by-point responses
- Referee: [Results, cross-site transfer paragraph] Cross-site transfer results: the claim of a 'substantial' drop for all models is presented without reported confidence intervals, p-values, or paired statistical tests against the in-distribution baselines; this weakens the assertion that the degradation is consistent and load-bearing for the conclusion that pretraining does not guarantee robustness.
  Authors: We agree that adding confidence intervals and statistical tests will make the claims more robust. In the revised manuscript we will report 95% confidence intervals (via bootstrap resampling over slides) for all AUC and accuracy metrics. We will also add paired statistical tests (Wilcoxon signed-rank test on per-slide performance scores) comparing in-distribution versus cross-site results for each model, with p-values and effect sizes. These additions will directly support the consistency of the observed drops.
  Revision: yes
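The two statistical additions promised here can be sketched with the standard library alone. The code below is a minimal illustration, not the authors' analysis: `bootstrap_ci` and `wilcoxon_signed_rank` are hypothetical helpers, the Wilcoxon p-value uses a normal approximation without tie or continuity correction, and the test assumes the two score lists can be paired (for example by cross-validation fold; how per-slide scores would be paired across sites is not stated on this page). For real use, `scipy.stats.wilcoxon` offers a tested implementation.

```python
import math
import random

def bootstrap_ci(per_slide_scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-slide scores."""
    rng = random.Random(seed)
    n = len(per_slide_scores)
    means = []
    for _ in range(n_boot):
        # Resample slides with replacement and recompute the mean.
        sample = [per_slide_scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test on paired scores.

    Returns (W_plus, p_value). Zero differences are dropped, tied absolute
    differences get average ranks, and the p-value is a normal approximation.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w_plus, p_value

# Toy usage: per-slide correctness (1 = correct) as a placeholder score list.
toy_correct = [1] * 80 + [0] * 20
ci_low, ci_high = bootstrap_ci(toy_correct)
```

With only a handful of pairs the normal approximation is coarse, so an exact test would be the safer choice for small samples.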
- Referee: [Representation analysis subsection] Representation analysis: the statement that 'domain-related variation dominates' rests on visual inspection of embeddings; without quantitative support such as domain-classification accuracy on the frozen features or a direct comparison of cluster separation metrics between domain and grade labels, the dominance claim remains qualitative and does not fully substantiate that visual shift is the primary driver.
  Authors: We acknowledge that the current dominance claim is supported primarily by t-SNE visualizations. In the revision we will add quantitative analyses: (1) linear probe accuracies for predicting site (domain) versus grade from the frozen PFM features, and (2) silhouette scores and between-cluster variance ratios comparing domain-based versus grade-based clustering on the embeddings. These metrics will provide direct quantitative evidence that domain separation is stronger than grade-related structure.
  Revision: yes
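The silhouette comparison in item (2) can be sketched as follows: compute the mean silhouette coefficient of the same embeddings twice, once with site labels and once with grade labels. `silhouette_score` here is a hypothetical stdlib re-implementation using Euclidean distance, and the four toy points are invented so that site separates cleanly while grade does not.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance) for a given labeling."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, points[j]) for j in idxs) / len(idxs)
            for lab, idxs in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)

# Toy embeddings: the first coordinate tracks site, the second tracks grade.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
site_labels = ["Radboud", "Radboud", "Karolinska", "Karolinska"]
grade_labels = [0, 1, 0, 1]

site_silhouette = silhouette_score(toy_points, site_labels)
grade_silhouette = silhouette_score(toy_points, grade_labels)
```

A site silhouette well above the grade silhouette on the real embeddings would support the claim that domain separation is the stronger structure. The score should be computed on the original features rather than on t-SNE coordinates, since t-SNE does not preserve distances.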
Circularity Check
Empirical benchmark study with no derivations or self-referential reductions
full rationale
The paper is a pure empirical benchmark: it measures slide-level grading performance of frozen PFMs on held-out PANDA splits (in-distribution and Radboud-to-Karolinska cross-site) and reports representation statistics. No equations, ansatzes, uniqueness theorems, or fitted parameters are introduced whose outputs are then relabeled as predictions. All reported numbers are direct evaluations on disjoint data; the central claim that visual domain shift dominates is therefore a measured outcome rather than a quantity forced by the modeling choices themselves. Self-citations, if present, are not load-bearing for any derivation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The PANDA dataset's collection-site and grade-group splits constitute meaningful proxies for clinically relevant distribution shifts.
Supplementary Material
A. Supplementary Tables
Table S1. Raw numerical results for Figure 1. All results are mean±std over 10 random cross-validation folds. Recoverable column headers: PANDA, Karolinska, Radboud, Radboud→Karolinska, Radboud-U, Radboud-U→Karolinska-U, Radboud-...