arxiv: 2510.23807 · v5 · submitted 2025-10-27 · 💻 cs.AI · cs.CV

Beyond the Failures: Rethinking Foundation Models in Pathology

Pith reviewed 2026-05-18 03:55 UTC · model grok-4.3

classification 💻 cs.AI cs.CV
keywords foundation models · pathology · digital pathology · biological imaging · model adaptation · AI limitations · tissue analysis

The pith

Pathology requires foundation models designed explicitly for biological tissue rather than adapted from natural-image systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that foundation models have underperformed in pathology, showing low accuracy, instability, and high computational costs. These problems arise from conceptual mismatches, not mere tuning shortfalls: dense embeddings cannot capture the combinatorial richness of tissue, and architectures carry over flaws in self-supervision, patch design, and noise handling. Biological complexity and limited domain-specific innovation widen the divide. A sympathetic reader cares because this diagnosis explains stalled progress and indicates that simply scaling existing models will not suffice. The paper concludes that pathology needs models built from the ground up for biological images whose assumptions differ from those of natural scenes.

Core claim

The central claim is that the shortcomings of foundation models in pathology stem from deeper conceptual mismatches: dense embeddings cannot represent the combinatorial richness of tissue, current architectures inherit flaws in self-supervision, patch design, and noise-fragile pretraining, and biological complexity plus limited domain innovation keep the gap open. Therefore pathology requires models explicitly designed for biological images rather than adaptations of large-scale natural-image methods whose assumptions do not hold for tissue.
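
The claim that a dense embedding can fall short of a combinatorial target can be made concrete with a toy case. The sketch below is an illustration only and is not taken from the paper: it assumes dot-product retrieval over one-dimensional embeddings, and the item values, the query grid, and the function name `reachable_top2_sets` are all hypothetical. It counts how many of the possible two-item result sets any query can actually retrieve.

```python
from itertools import combinations


def reachable_top2_sets(item_embeddings, queries):
    """Collect every top-2 index pair that some query retrieves by dot product."""
    reached = set()
    for q in queries:
        # In one dimension the dot product is just a scalar multiplication.
        scores = [q * x for x in item_embeddings]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        reached.add(frozenset(ranked[:2]))
    return reached


# Hypothetical 1-D embeddings for four items (distinct values, so no ties).
items = [-2.0, -0.5, 1.0, 3.0]
# Hypothetical query grid covering negative and positive directions, skipping 0.
queries = [i / 10 for i in range(-50, 51) if i != 0]

reached = reachable_top2_sets(items, queries)
all_pairs = {frozenset(p) for p in combinations(range(len(items)), 2)}

print(f"{len(reached)} of {len(all_pairs)} possible top-2 sets are reachable")
```

In one dimension a positive query always returns the two largest values and a negative query the two smallest, so only two of the six pairs can ever be retrieved. Higher dimensions raise the ceiling, but the toy case shows why the number of combinations a fixed-size embedding can express is bounded, which is the kind of limit the paper attributes to dense embeddings of tissue.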

What carries the argument

The conceptual mismatch between natural-image foundation model assumptions and the combinatorial structure of biological tissue images.

If this is right

  • Continued adaptation of natural-image models will keep producing unstable or low-accuracy results on tissue analysis tasks.
  • New architectures must address patch design, self-supervision, and noise sensitivity specifically for biological data.
  • Dense embedding approaches will need replacement by representations that handle combinatorial tissue complexity.
  • Domain innovation focused on pathology will be required before reliable clinical tools emerge.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same mismatch logic may apply to other specialized imaging fields such as radiology or histopathology variants.
  • Sparse or structured representations could replace dense embeddings to better match tissue combinatorics.
  • Hybrid systems that incorporate explicit biological rules alongside learned features might accelerate progress.

Load-bearing premise

The observed failures in pathology originate from fundamental design mismatches rather than from insufficient scale or inadequate optimization of existing natural-image models.

What would settle it

A controlled experiment in which a large natural-image foundation model, after extensive scaling and domain-specific tuning, matches or exceeds the accuracy and stability of custom biological models on standard pathology benchmarks.
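
One way to phrase the settling criterion as a check is sketched below. This is a hedged illustration, not a protocol from the paper: it assumes accuracy is measured per site, treats the spread across sites as a stand-in for stability, and uses made-up placeholder numbers and hypothetical names (`summarize`, `adapted_matches_custom`, `site_a`).

```python
from statistics import mean, pstdev


def summarize(per_site_accuracy):
    """Return (mean accuracy, spread across sites); a lower spread means more stable."""
    values = list(per_site_accuracy.values())
    return mean(values), pstdev(values)


def adapted_matches_custom(adapted, custom, tolerance=0.0):
    """True if the adapted model is at least as accurate and as stable as the custom one."""
    adapted_mean, adapted_spread = summarize(adapted)
    custom_mean, custom_spread = summarize(custom)
    return (adapted_mean >= custom_mean - tolerance
            and adapted_spread <= custom_spread + tolerance)


# Placeholder values for illustration only; these are NOT measured results.
adapted = {"site_a": 0.70, "site_b": 0.60, "site_c": 0.80}
custom = {"site_a": 0.75, "site_b": 0.74, "site_c": 0.76}

print(adapted_matches_custom(adapted, custom))
```

With these placeholder inputs the check prints `False`, since the adapted model is both less accurate on average and more variable across sites. A real test would need agreed benchmarks, matched compute budgets, and a justified tolerance.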

Figures

Figures reproduced from arXiv: 2510.23807 by Hamid R. Tizhoosh.

Figure 1: AI models can recognize dogs and even distin…
Figure 2: Self-supervised learning often rests on the im…
Figure 3: The field of view in light microscopy is tradition…

Original abstract

Despite their successes in vision and language, foundation models have stumbled in pathology, revealing low accuracy, instability, and heavy computational demands. These shortcomings stem not from tuning problems but from deeper conceptual mismatches: dense embeddings cannot represent the combinatorial richness of tissue, and current architectures inherit flaws in self-supervision, patch design, and noise-fragile pretraining. Biological complexity and limited domain innovation further widen the gap. The evidence is clear: pathology requires models explicitly designed for biological images rather than adaptations of large-scale natural-image methods whose assumptions do not hold for tissue.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript is a position paper asserting that foundation models adapted from natural-image domains have failed in pathology due to conceptual mismatches rather than tuning deficiencies. Key issues cited include the inability of dense embeddings to capture tissue combinatorial complexity, inherited flaws in self-supervision, patch design, and noise-fragile pretraining, along with limited domain-specific innovation; the conclusion is that pathology requires models explicitly designed for biological images.

Significance. If the perspective is adopted, it could redirect research effort toward domain-native architectures in computational pathology, potentially yielding more stable and efficient models for tissue analysis. As an opinion piece without new experiments or derivations, its primary contribution would be to frame existing shortcomings in a way that motivates targeted innovation rather than incremental adaptation.

major comments (1)
  1. [Abstract] The assertion that shortcomings 'stem not from tuning problems but from deeper conceptual mismatches' and that 'the evidence is clear' is load-bearing for the central claim yet is advanced without data, controlled comparisons, error analysis, or falsifiable tests, leaving the distinction between tuning and conceptual issues unevaluated.
minor comments (1)
  1. The manuscript would benefit from explicit section headings or a structured outline to separate the critique of current methods from the proposed rethinking, improving readability for readers unfamiliar with the position.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review of our position paper. We address the single major comment below and describe the revisions we will make to improve clarity while preserving the manuscript's perspective character.

Point-by-point responses
  1. Referee: [Abstract] The assertion that shortcomings 'stem not from tuning problems but from deeper conceptual mismatches' and that 'the evidence is clear' is load-bearing for the central claim yet is advanced without data, controlled comparisons, error analysis, or falsifiable tests, leaving the distinction between tuning and conceptual issues unevaluated.

    Authors: We acknowledge the validity of this observation. As a position paper, the manuscript synthesizes existing literature rather than presenting new experiments, controlled comparisons, or falsifiable tests. The central distinction is argued on conceptual grounds: pathology images possess hierarchical combinatorial structure and domain-specific noise characteristics that differ fundamentally from natural-image assumptions underlying current self-supervised pretraining and dense embedding approaches. We cite multiple prior studies in which extensive tuning and architectural adaptation of natural-image foundation models have failed to close performance gaps. To address the referee's point directly, we will revise the abstract to replace 'the evidence is clear' with 'literature review indicates' and add a short clarifying paragraph in the introduction that explicitly frames the argument as conceptual synthesis rather than new empirical demonstration. These changes will be incorporated in the next version.

    Revision: partial

Circularity Check

0 steps flagged

Position paper with no technical derivation or load-bearing steps

The manuscript is a position paper that advances an opinion on conceptual mismatches between natural-image foundation models and pathology requirements. It contains no equations, proofs, fitted parameters, predictions, or derivation chains that could reduce to inputs by construction. Arguments rest on stated observations and assertions about embeddings, self-supervision, and patch design without invoking self-citations as uniqueness theorems or smuggling ansatzes. The central claim is framed explicitly as a call for new designs rather than a derived result, rendering the text self-contained against external benchmarks with no circularity present.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about representation limits and training flaws without new supporting evidence or external validation.

axioms (2)
  • [domain assumption] Dense embeddings cannot represent the combinatorial richness of tissue.
    Invoked directly in the abstract as the root cause of model failures.
  • [domain assumption] Shortcomings stem not from tuning problems but from deeper conceptual mismatches.
    Stated explicitly as the explanation separating this view from standard fine-tuning discussions.

pith-pipeline@v0.9.0 · 5608 in / 1282 out tokens · 33156 ms · 2026-05-18T03:55:18.457551+00:00 · methodology

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SIMPLER: H&E-Informed Representation Learning for Structured Illumination Microscopy

    cs.CV · 2026-04 · unverdicted · novelty 7.0

    SIMPLER learns biologically grounded SIM representations by progressively aligning them with H&E images through multiple self-supervised objectives, outperforming scratch-trained or H&E-only models on downstream tasks...

  2. Validation of Whole-Slide Foundation Models for Image Retrieval in TCGA Data

    cs.CV · 2026-04 · unverdicted · novelty 4.0

    Benchmarking on TCGA shows TITAN foundation model edges out others for whole-slide retrieval but with only ~68% average accuracy, high organ-to-organ variation, and no consistent winner over patch-level baselines.
