SegTME-UNI2: A Foundation Model-Based Framework for Generalisable Multiclass Cell Segmentation and LLM-Driven Tumour Microenvironment Characterisation in Histopathology

Anwar P.P. Abdul Majeed; Faris Syahmi Samidi; Mohammad Badal Ahmmed; Selvam Thavaraj; Vimal Angela Thiviyanathan; Wan Siti Halimatul Munirah Wan Ahmad

arxiv: 2606.17702 · v2 · pith:ZJUUVQHBnew · submitted 2026-06-16 · 💻 cs.CV · cs.AI

Wan Siti Halimatul Munirah Wan Ahmad, Faris Syahmi Samidi, Mohammad Badal Ahmmed, Vimal Angela Thiviyanathan, Selvam Thavaraj, Anwar P.P. Abdul Majeed

Pith reviewed 2026-06-27 01:37 UTC · model grok-4.3

keywords cell segmentation · tumour microenvironment · histopathology · foundation model · pseudo-label curriculum · LLM interpretation · semantic segmentation · instance separation

The pith

A pathology foundation model paired with dual decoders, trained through three stages of progressively refined pseudo-labels, produces generalisable six-class cell segmentation that feeds an LLM to generate tumour microenvironment narratives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to establish that a ViT-Giant pathology foundation model equipped with parallel UperNet heads for semantic classes and boundary gradients can be bootstrapped into reliable multiclass segmentation using only limited human labels followed by entropy-filtered pseudo-labels on large unannotated archives. These segmentations then supply a structured extraction of compositional, morphological, and spatial metrics that a fine-tuned GPT converts into readable clinical text. A sympathetic reader would care because the pipeline removes the need for exhaustive pixel annotations while turning routine slides into quantifiable and narratable TME profiles at scale. The method deliberately retrains a fresh model at each stage rather than fine-tuning, so gains come solely from rising pseudo-label quality across PanNuke and multi-resolution TCGA-UT data.

Core claim

UNI2-UperHoVeR pairs the UNI2-h ViT-Giant foundation model with two parallel UperNet decoders—one producing six-class semantic segmentation and the other regressing horizontal-vertical gradients for watershed-based instance separation—and is trained via a three-stage progressive pseudo-label curriculum: stage 1 on human-annotated PanNuke, stage 2 on entropy-filtered outputs of the stage-1 model applied to 271k TCGA-UT patches, and stage 3 on entropy-filtered outputs of the stage-2 model applied to 1.6M patches across six resolution scales, with no weight transfer between stages; the resulting segmentation outputs drive a TME feature pipeline that encodes more than twenty metrics as JSON for

What carries the argument

The three-stage progressive pseudo-label curriculum, in which a fresh model is trained at every stage: first on human-annotated PanNuke, then exclusively on entropy-filtered pseudo-labels generated by the preceding stage's model, so that label quality is meant to rise without weight transfer or human correction.
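The entropy filter at the centre of this curriculum can be pictured with a minimal sketch. The paper's actual criterion, threshold, and granularity (per pixel or per patch) are not stated on this page, so the patch-level mean-entropy rule, the `max_mean_entropy` value, and all function names below are assumptions for illustration only.

```python
import numpy as np


def pixel_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy in nats per pixel, for softmax probabilities of shape (C, H, W)."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=0)


def keep_patch(probs: np.ndarray, max_mean_entropy: float = 0.5) -> bool:
    """Hypothetical acceptance rule: keep a patch whose mean pixel entropy is low."""
    return float(pixel_entropy(probs).mean()) < max_mean_entropy


def pseudo_label(probs: np.ndarray) -> np.ndarray:
    """Hard pseudo-label: the most probable class at every pixel."""
    return probs.argmax(axis=0)


if __name__ == "__main__":
    num_classes, height, width = 6, 4, 4
    # A confident prediction: class 2 gets 0.95, the other five share 0.05.
    confident = np.full((num_classes, height, width), 0.01)
    confident[2] = 0.95
    # A maximally uncertain prediction: uniform over the six classes.
    uniform = np.full((num_classes, height, width), 1.0 / num_classes)
    print(keep_patch(confident), keep_patch(uniform))
```

Under this rule the confident patch is retained and the uniform one is discarded; where the threshold sits controls the trade-off between label quantity and label quality that the curriculum depends on.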

If this is right

The six-class segmentation and instance separation generalise across the resolution scales and tissue types present in TCGA-UT.
The TME pipeline computes more than twenty per-patch metrics covering composition, morphology, spatial entropy, and intercellular distances.
JSON-encoded features from the pipeline are converted by the fine-tuned BioNeMo GPT into clinically interpretable narratives.
The released pseudo-labelled TCGA-UT dataset and UNI2-UperHoVeR checkpoint directly support large-scale TME profiling.
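To make the metric-to-JSON step concrete, here is a minimal sketch of how a few per-patch descriptors could be derived from an instance mask and a class mask. It is not the paper's pipeline: the class-index ordering in `CLASS_NAMES`, the majority-vote class assignment, the field names, and the use of composition entropy as a stand-in for the paper's spatial entropy are all assumptions.

```python
import json

import numpy as np

# Assumed index-to-class mapping (0 = background); the paper's ordering may differ.
CLASS_NAMES = {1: "neoplastic", 2: "inflammatory", 3: "connective", 4: "dead", 5: "epithelial"}


def nucleus_table(inst_mask: np.ndarray, cls_mask: np.ndarray) -> list:
    """One row per nucleus: (class id, area in pixels, centroid y, centroid x)."""
    rows = []
    for inst_id in np.unique(inst_mask):
        if inst_id == 0:  # 0 is background, not a nucleus
            continue
        ys, xs = np.nonzero(inst_mask == inst_id)
        # Majority vote over the semantic labels inside the instance.
        cls = int(np.bincount(cls_mask[ys, xs], minlength=6).argmax())
        rows.append((cls, ys.size, ys.mean(), xs.mean()))
    return rows


def tme_features(inst_mask: np.ndarray, cls_mask: np.ndarray, mpp: float = 0.5) -> dict:
    """A handful of illustrative descriptors; mpp is microns per pixel."""
    rows = nucleus_table(inst_mask, cls_mask)
    n = len(rows)
    ratios = {
        name: (sum(r[0] == k for r in rows) / n if n else 0.0)
        for k, name in CLASS_NAMES.items()
    }
    p = np.array([v for v in ratios.values() if v > 0])
    entropy = float(-(p * np.log(p)).sum()) if p.size else 0.0
    mean_area = float(np.mean([r[1] for r in rows]) * mpp**2) if n else None
    mean_nn = None
    if n >= 2:
        centroids = np.array([[r[2], r[3]] for r in rows], dtype=float)
        dist = np.linalg.norm(centroids[:, None] - centroids[None], axis=-1)
        np.fill_diagonal(dist, np.inf)  # ignore self-distances
        mean_nn = float(dist.min(axis=1).mean() * mpp)
    return {
        "n_nuclei": n,
        "ratios": ratios,
        "composition_entropy": entropy,
        "mean_area_um2": mean_area,
        "mean_nn_distance_um": mean_nn,
    }


if __name__ == "__main__":
    inst = np.zeros((8, 8), dtype=int)
    cls = np.zeros((8, 8), dtype=int)
    inst[0:2, 0:2], cls[0:2, 0:2] = 1, 1  # one neoplastic nucleus
    inst[0:2, 4:6], cls[0:2, 4:6] = 2, 2  # one inflammatory nucleus
    print(json.dumps(tme_features(inst, cls), indent=2))
```

The brute-force pairwise distance matrix is fine for a single patch but grows quadratically with nucleus count; a spatial index would be the natural replacement at scale.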

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The curriculum may be reusable with other foundation models to bootstrap segmentation tasks that lack dense labels.
Pairing quantitative TME metrics with LLM narratives could support integration of image-derived features into existing clinical reporting systems.
Success across multiple resolution scales indicates the framework may tolerate real-world scanner and staining variation once validated on external cohorts.

Load-bearing premise

Entropy-filtered pseudo-labels produced by each successive model are of measurably higher quality than those of the prior stage, sufficient to drive genuine segmentation improvement across tissue types and resolution scales without any human correction or weight transfer.

What would settle it

Segmentation metrics on held-out TCGA-UT partitions show no improvement or decline from stage-1 to stage-3 models, or the LLM-generated TME narratives show systematic mismatch with pathologist review on the same patches.

Figures

Figures reproduced from arXiv: 2606.17702 by Anwar P.P. Abdul Majeed, Faris Syahmi Samidi, Mohammad Badal Ahmmed, Selvam Thavaraj, Vimal Angela Thiviyanathan, Wan Siti Halimatul Munirah Wan Ahmad.

**Figure 1.** TCGA-UT multi-resolution patches from the same tissue region (TCGA-OR-A5J1). Each column shows the same 128×128 µm tissue area sampled at a different resolution, from Scale 0 (0.5 µm/pixel, finest detail) to Scale 5 (1.0 µm/pixel, broadest tissue context). At finer scales, individual nuclear morphology and staining intensity are clearly resolved; at coarser scales, intercellular spatial organisation and ti…

**Figure 2.** Full SegTME-UNI2 system architecture and inference pipeline. (a) UNI2-UperHoVeR dual-head segmentation model: a shared UNI2-h ViT-Giant backbone extracts a four-scale FPN from transformer blocks 5, 11, 17, and 23; two independent UperNet decoder heads produce six-class semantic logits and two-channel HV maps. (b) End-to-end inference and reporting pipeline: patches are MPP-normalised, tiled with 50% overla…

**Figure 3.** Three-stage progressive pseudo-label training curriculum. Each model is trained independently from a fresh initialisation (UNI2-h pretrained backbone, randomly initialised decoder heads); the only connection between stages is the pseudo-labels generated by the previous model. Stage 1: model 1 is trained on human-annotated PanNuke (0.25 µm/pixel). Stage 2: model 1 generates entropy-filtered pseudo-labels for TCGA-UT …

**Figure 4.** End-to-end pipeline for per-patch TME feature extraction and biological token translation, implemented across two functional modules. The pipeline accepts two segmentation outputs, a 2D instance mask (inst_mask, integer region IDs) and a six-class semantic mask (cls_mask, values 0 to 5), and produces a structured JSON object containing all quantitative TME descriptors and their categorical interpretations. …

**Figure 5.** End-to-end SegTME-UNI2 inference and narrative generation pipeline. From a raw H&E patch, UNI2-UperHoVeR produces a numeric TME feature set (raw_data, green) and categorical interpretation labels (orange). Both are passed as structured JSON to the BioNeMo SFT model, which generates a clinically grounded TME patch description. The representative example shown (Neoplastic ratio: 0.29; Immune Hot; immune cell…

**Figure 6.** Training dynamics across the three curriculum stages (model 1: PanNuke, 249 epochs; model 2: TCGA-UT Scale 0, 250 epochs; model 3: TCGA-UT Scales 0–5, 100 epochs; all seeds identical; single run per stage). Left (a): validation mIoU trajectories. All three models initialise from the UNI2-h pretrained backbone with randomly initialised decoder heads; no weights are inherited between stages. Key observation: the initial…

**Figure 7.** Qualitative TME characterisation on five independent IGNITE NSCLC tiles (Tiles 1 to 5). Each row shows (left to right) the original H&E tile, nucleus outline overlay colour-coded by predicted class (Neoplastic: red; Inflammatory: blue; Connective: green), raw six-class semantic segmentation map, and TME phenotype token output generated by translate_tme_to_bio_tokens (…

**Figure 8.** Qualitative TME characterisation on five independent IGNITE NSCLC tiles (Tiles 4 and 5). Each row follows the same layout as …
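Two numeric details in the captions above lend themselves to a quick check: the fixed 128×128 µm field of Figure 1 sampled at 0.5 to 1.0 µm/pixel, and the 50% overlap tiling mentioned for Figure 2. The sketch below works through both. The 256-pixel tile size, the border handling, and the function names are assumptions, since the truncated caption does not give them.

```python
def patch_side_px(extent_um: float, mpp: float) -> int:
    """Pixels along one side of a field of extent_um microns sampled at mpp microns/pixel."""
    return round(extent_um / mpp)


def tile_origins(length: int, tile: int, overlap: float = 0.5) -> list[int]:
    """Start offsets of tiles along one axis for a given fractional overlap.

    The last tile is shifted back so that it ends exactly at the border
    (an assumed policy; padding would be an equally valid choice).
    """
    if length <= tile:
        return [0]
    stride = max(1, int(tile * (1.0 - overlap)))
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        origins.append(length - tile)
    return origins


if __name__ == "__main__":
    print(patch_side_px(128, 0.5))   # 256 pixels at the finest scale
    print(patch_side_px(128, 1.0))   # 128 pixels at the coarsest scale
    print(tile_origins(1024, 256))   # stride of 128 pixels at 50% overlap
```

With 50% overlap every interior pixel is covered by two tiles per axis, which is what allows overlapping predictions to be blended or cropped before instance separation.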

read the original abstract

Characterising the tumour microenvironment (TME) from routine H&E-stained histology images requires simultaneous cell segmentation, feature extraction, and interpretable clinical reporting. We present SegTME-UNI2, a unified framework addressing these requirements. Its core is UNI2-UperHoVeR, a dual-head segmentation model pairing the UNI2-h pathology foundation model (ViT-Giant, pretrained on >100M tiles from 100K slides) with two parallel UperNet decoders: one for six-class semantic segmentation and one for horizontal-vertical gradient regression enabling watershed-based nuclear instance separation. To address the lack of pixel-level annotations in large real-world repositories, UNI2-UperHoVeR undergoes a three-stage progressive pseudo-label curriculum. Each stage trains a fresh model without weight transfer, driving improvement entirely via increased pseudo-label quality:

Stage 1: Uses human-annotated PanNuke (7,901 images, 189,744 nuclei, 0.25 µm/pixel).
Stage 2: Uses entropy-filtered pseudo-labels from the Stage 1 model on 271,711 TCGA-UT scale-0 patches (0.5 µm/pixel).
Stage 3: Uses pseudo-labels from the Stage 2 model on all 1,608,060 TCGA-UT patches across six resolution scales (0.5–1.0 µm/pixel).

Segmentation outputs feed a structured TME feature extraction pipeline computing 20+ per-patch compositional, morphological, spatial entropy, and intercellular distance metrics. These are encoded as JSON and passed to a fine-tuned NVIDIA BioNeMo GPT model to generate clinically interpretable TME narratives. Preliminary validation on held-out PanNuke and TCGA-UT partitions demonstrates framework feasibility and internal consistency. The pseudo-labelled TCGA-UT dataset and UNI2-UperHoVeR checkpoint are publicly released to support large-scale TME profiling and spatial biology research.
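The horizontal-vertical regression target named in the abstract can be illustrated with a short sketch in the style of HoVer-Net target generation: inside each nucleus, every pixel stores its signed horizontal and vertical offset from the nucleus centre of mass, scaled to lie in [-1, 1]. Whether UNI2-UperHoVeR builds its targets exactly this way is not stated here, so treat the normalisation and the `hv_maps` name as assumptions.

```python
import numpy as np


def hv_maps(inst_mask: np.ndarray) -> np.ndarray:
    """Return a (2, H, W) float array: channel 0 horizontal, channel 1 vertical."""
    hv = np.zeros((2,) + inst_mask.shape, dtype=np.float32)
    for inst_id in np.unique(inst_mask):
        if inst_id == 0:  # background stays at zero
            continue
        ys, xs = np.nonzero(inst_mask == inst_id)
        for channel, coords in ((0, xs), (1, ys)):
            offset = coords - coords.mean()  # signed distance to centre of mass
            scaled = np.zeros(offset.shape, dtype=np.float32)
            neg, pos = offset < 0, offset > 0
            # Scale each side separately so the extremes reach -1 and +1.
            if neg.any():
                scaled[neg] = offset[neg] / -offset[neg].min()
            if pos.any():
                scaled[pos] = offset[pos] / offset[pos].max()
            hv[channel, ys, xs] = scaled
    return hv


if __name__ == "__main__":
    mask = np.zeros((5, 5), dtype=int)
    mask[1:4, 1:4] = 1  # a single 3x3 nucleus
    print(hv_maps(mask)[0])  # horizontal channel: -1, 0, 1 across the nucleus
```

Where two nuclei touch, these maps jump from +1 to -1, and it is that sharp gradient that a watershed step can use as a boundary cue to split the instances.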

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper lays out a full pipeline from UNI2-h segmentation through pseudo-label curriculum to LLM TME narratives but supplies no numbers to show the curriculum actually improves anything.

The main point is a three-stage curriculum that trains fresh UNI2-UperHoVeR models on PanNuke then entropy-filtered TCGA-UT patches, followed by metric extraction and BioNeMo-generated reports. They release the checkpoint and the pseudo-labeled TCGA-UT set.

What is actually new is the no-weight-transfer curriculum and the dual-head attachment for six-class semantic plus HV regression on this foundation model. The end-to-end flow from raw patches to structured JSON to readable narratives is also presented as a single package.

The release itself is useful. Groups working on spatial biology can start from the data and model without having to rerun the labeling steps.

The soft spot is the validation. The abstract only claims preliminary validation and internal consistency on held-out partitions. There are no accuracy figures, no stage-to-stage comparisons, no ablation on the entropy filter, and no external test sets. That leaves the central assumption—that each round of pseudo-labels is meaningfully better—unverified. The stress-test note about missing cross-stage metrics is accurate based on what is described.

This is for digital pathology labs that want reusable assets for TME profiling and are willing to run their own checks. A reader who needs evidence of better segmentation or more reliable narratives will not find it here.

I would bring it to a reading group to discuss the curriculum design choices. I would not cite it yet. It deserves peer review once the authors add the quantitative results and ablations, because the framework is coherent and the release lowers the barrier for follow-up work.

Referee Report

3 major / 2 minor

Summary. The paper introduces SegTME-UNI2, a framework centered on UNI2-UperHoVeR (UNI2-h ViT-Giant backbone with dual UperNet heads for six-class semantic segmentation and H/V gradient regression for watershed instance separation). It trains this model via a three-stage progressive pseudo-label curriculum (human-annotated PanNuke Stage 1; entropy-filtered pseudo-labels on TCGA-UT scale-0 patches Stage 2; full multi-scale TCGA-UT Stage 3, each stage training a fresh model with no weight transfer), then feeds outputs into a 20+ metric TME feature pipeline whose JSON is rendered as clinical narratives by a fine-tuned BioNeMo GPT. Preliminary validation on held-out PanNuke/TCGA-UT partitions is reported as demonstrating feasibility and internal consistency; the pseudo-labeled dataset and checkpoint are released publicly.

Significance. If the curriculum demonstrably improves pseudo-label quality and yields generalisable six-class segmentation plus instance separation across TCGA-UT resolutions, the framework would provide a scalable route to TME profiling on large unannotated H&E repositories and link quantitative spatial features to LLM-generated narratives. The public release of the pseudo-labeled TCGA-UT dataset and UNI2-UperHoVeR checkpoint is a concrete strength supporting reproducibility and downstream spatial-biology research.

major comments (3)

[Abstract] Abstract: the claim that the three-stage curriculum produces generalisable segmentation 'driving improvement entirely via increased pseudo-label quality' is load-bearing for the central contribution, yet no cross-stage quantitative verification (e.g., agreement with Stage-1 PanNuke labels, entropy-distribution shifts, or per-class Dice on held-out TCGA-UT) is supplied; internal consistency on held-out partitions alone does not confirm that entropy-filtered labels from each fresh model are strictly higher quality.
[Abstract / Validation] Validation description (preliminary validation paragraph): only 'preliminary validation' and 'internal consistency' are stated, with no reported quantitative metrics (Dice, PQ, AJI, or instance-level F1), ablation of curriculum stages, or external test sets; this leaves the generalisability claim across 0.5–1.0 µm/pixel scales and tissue types unsupported by evidence.
[TME pipeline] TME feature pipeline section: the 20+ compositional/morphological/spatial metrics and their JSON encoding are asserted to feed clinically interpretable narratives, but no concrete output examples, human evaluation of narrative fidelity, or correlation with known TME biology are provided to substantiate the end-to-end utility.
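For reference, the per-class Dice requested in the first major comment is straightforward to compute once held-out labels exist. The sketch below shows one plain formulation; the function name and the choice to report NaN for classes absent from both masks are assumptions, and instance-level metrics such as PQ or AJI would need additional matching logic not shown here.

```python
import numpy as np


def per_class_dice(pred: np.ndarray, target: np.ndarray, num_classes: int = 6) -> dict:
    """Dice = 2|P ∩ T| / (|P| + |T|) for each class label in 0..num_classes-1."""
    scores = {}
    for c in range(num_classes):
        p, t = pred == c, target == c
        denom = p.sum() + t.sum()
        # A class missing from both masks has no defined Dice score.
        scores[c] = float(2.0 * (p & t).sum() / denom) if denom else float("nan")
    return scores


if __name__ == "__main__":
    pred = np.array([[0, 1], [1, 2]])
    target = np.array([[0, 1], [2, 2]])
    print(per_class_dice(pred, target))
```

Running the same function on the stage-1, stage-2, and stage-3 models against one fixed held-out set would give the cross-stage comparison the report asks for.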

minor comments (2)

[Stage 3 description] Clarify whether the six resolution scales in Stage 3 are handled by a single model or by scale-specific fine-tuning; the current description leaves the multi-scale training procedure ambiguous.
[Curriculum] The abstract states 'no weight transfer' between stages; confirm this is also true for the decoder heads and any auxiliary losses.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We respond point-by-point to the major comments, indicating where revisions will strengthen the manuscript while maintaining honesty about current limitations.

Referee: [Abstract] Abstract: the claim that the three-stage curriculum produces generalisable segmentation 'driving improvement entirely via increased pseudo-label quality' is load-bearing for the central contribution, yet no cross-stage quantitative verification (e.g., agreement with Stage-1 PanNuke labels, entropy-distribution shifts, or per-class Dice on held-out TCGA-UT) is supplied; internal consistency on held-out partitions alone does not confirm that entropy-filtered labels from each fresh model are strictly higher quality.

Authors: We agree that direct cross-stage verification would better substantiate the claim. The manuscript currently supports the progressive curriculum via the design (entropy filtering + scale expansion) and held-out consistency, but lacks explicit metrics like entropy shifts or label agreement. In revision we will add such analyses (e.g., entropy distributions and per-class metrics on held-out data) to provide quantitative support for quality improvement.
Revision: yes

Referee: [Abstract / Validation] Validation description (preliminary validation paragraph): only 'preliminary validation' and 'internal consistency' are stated, with no reported quantitative metrics (Dice, PQ, AJI, or instance-level F1), ablation of curriculum stages, or external test sets; this leaves the generalisability claim across 0.5–1.0 µm/pixel scales and tissue types unsupported by evidence.

Authors: The validation is presented as preliminary and focused on feasibility. We will revise the abstract and validation section to report specific metrics (Dice, PQ, AJI, F1) from held-out PanNuke/TCGA-UT partitions and include curriculum-stage ablations. External test sets beyond the held-out TCGA-UT partitions are not available in the current study; we will clarify the scope of generalisability to the tested distributions and resolutions while noting this limitation.
Revision: partial

Referee: [TME pipeline] TME feature pipeline section: the 20+ compositional/morphological/spatial metrics and their JSON encoding are asserted to feed clinically interpretable narratives, but no concrete output examples, human evaluation of narrative fidelity, or correlation with known TME biology are provided to substantiate the end-to-end utility.

Authors: We acknowledge the absence of concrete examples and evaluations for the TME narrative component. In the revised manuscript we will include sample JSON outputs paired with generated narratives, plus qualitative discussion of alignment with known TME biology to better demonstrate end-to-end utility.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper's derivation consists of a three-stage pseudo-label curriculum (human-annotated PanNuke to entropy-filtered TCGA-UT stages) feeding segmentation outputs into a TME metric pipeline and then an LLM for narratives. No equations, self-citations, or definitions reduce any claimed prediction or improvement to quantities defined by the inputs themselves; the curriculum improvement is asserted via the training process and internal consistency on held-out partitions rather than by construction. The foundation model and BioNeMo components are external, and no uniqueness theorems or ansatzes are invoked in a load-bearing way. The chain is therefore self-contained against external benchmarks and does not match any enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the assumption that successive pseudo-label sets improve in quality without external validation or weight transfer; no free parameters are explicitly named, no new physical entities are introduced, and the only background assumptions are standard supervised segmentation training and watershed post-processing.

axioms (1)

Domain assumption: entropy-filtered pseudo-labels from each stage are of higher quality than those of the previous stage.
Invoked to justify the progressive curriculum in stages 2 and 3.

[1] [1]

Semi- supervised learning for network-based cardiac mr image segmentation, in: Medical Image Computing and Computer Assisted Intervention (MICCAI), pp

Bai, W., Oktay, O., Sinclair, M., Suzuki, H., Rajchl, M., Tarroni, G., Glocker, B., King, A., Matthews, P.M., Rueckert, D., 2017. Semi- supervised learning for network-based cardiac mr image segmentation, in: Medical Image Computing and Computer Assisted Intervention (MICCAI), pp. 253–260. doi:10.1007/978-3-319-66185-8_29

work page doi:10.1007/978-3-319-66185-8_29 2017

[2] [2]

Digitalquantificationofstromapercentageenhancesprognosticstratification in pancreatic cancer

Bengtsson,A.,Andersson,R.,Andersson,B.,Ansari,D.,2026. Digitalquantificationofstromapercentageenhancesprognosticstratification in pancreatic cancer. Surgery in Practice and Science doi:10.1016/j.sopen.2026.01.002

work page doi:10.1016/j.sopen.2026.01.002 2026

[3] [3]

Towards a general-purpose foundation model for computational pathology

Chen, R.J., Ding, T., Lu, M.Y., Williamson, D.F.K., Jaume, G., Song, A.H., Chen, B., Zhang, A., Shao, D., Schuffler, P.J., Mahmood, F., 2024a. Towards a general-purpose foundation model for computational pathology. Nature Medicine 30, 850–862. doi:10.1038/ s41591-024-02857-3

[4] [4]

Chen, R.J., Lu, M.Y., Ding, T., Williamson, D.F.K., Jaume, G., Chen, B., Mahmood, F., 2024b. Uni2: Towards a universal whole-slide foundation model for pathology. arXiv preprint arXiv:2406.01647. doi:10.48550/arXiv.2406.01647

[5] [5]

Galon, J., Bruni, D., 2019. Approaches to treat immune hot, altered and cold tumours with combination immunotherapies. Nature Reviews Drug Discovery 18, 197–218. doi:10.1038/s41573-018-0007-y

[6] [6]

Galon, J., Costes, A., Sanchez-Cabo, F., Kirilovsky, A., Mlecnik, B., Lagorce-Pagès, C., Tosolini, M., Camus, M., Berger, A., Wind, P., Zinzindohoué, F., Bruneval, P., Cugnenc, P.H., Trajanoski, Z., Fridman, W.H., Pagès, F., 2006. Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science 313, 1960–1964. ... doi:10.1126/science.1129139

[7] [7]

Gamper, J., Koohbanani, N.A., Benes, K., Graham, S., Jahanifar, M., Khurram, S.A., Azam, A., Hewitt, K., Rajpoot, N., 2020. PanNuke dataset extension, insights and baselines. arXiv preprint arXiv:2003.10778. doi:10.48550/arXiv.2003.10778

[8] [8]

Graham, S., Vu, Q.D., Raza, S.E.A., Azam, A., Tsang, Y.W., Kwak, J.T., Rajpoot, N., 2019. Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images. Medical Image Analysis 58, 101563. doi:10.1016/j.media.2019.101563

[9] [9]

Grandvalet, Y., Bengio, Y., 2004. Semi-supervised learning by entropy minimization, in: Advances in Neural Information Processing Systems (NeurIPS)

[10] [10]

Hörst, F., Rempe, M., Heine, L., Seibold, C., Keyl, J., Baldini, G., Ugurel, S., Siveke, J., Bockmayr, M., Samek, W., Fuchs, T.J., Kleesiek, J., 2024. CellViT: Vision transformers for precise cell segmentation and classification. Medical Image Analysis 94, 103143. doi:10.1016/j.media.2024.103143

[12] [12]

Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T.J., Zou, J., 2023. A visual-language foundation model for pathology image analysis using medical Twitter. Nature Medicine 29, 2307–2316. doi:10.1038/s41591-023-02504-3

[13] [13]

Kather, J.N., Pearson, A.T., Halama, N., Jäger, D., Krause, J., Loosen, S.H., Marx, A., Boor, P., Tacke, F., Neumann, U.P., Grabsch, H.I., Yoshikawa, T., Brenner, H., Chang-Claude, J., Hoffmeister, M., Trautwein, C., Luedde, T., 2019. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nature Medicine 2... doi:10.1038/s41591-019-0462-y

[14] [14]

Keren, L., Bosse, M., Marquez, D., Angoshtari, R., Jain, S., Varma, S., Yang, S.R., Kurian, A., Van Valen, D., West, R., Bendall, S.C., Angelo, M., 2018. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell 174, 1373–1387. doi:10.1016/j.cell.2018.08.039

[15] [15]

Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P., 2019. Panoptic segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9404–9413. doi:10.1109/CVPR.2019.00963

[16] [17]

Kronqvist, P., Kuopio, T., Collan, Y., 1998. Morphometric grading of invasive ductal breast cancer. I. Thresholds for nuclear grade. British Journal of Cancer 78, 800–805. doi:10.1038/bjc.1998.581

[17] [18]

Lu, M.Y., Chen, B., Williamson, D.F.K., Chen, R.J., Liang, I., Ding, T., Jaume, G., Odia, I., Zhang, A., Le, L.P., Gerber, G.K., Mahmood, F., 2024a. A visual-language foundation model for computational pathology. Nature Medicine 30, 863–874. doi:10.1038/s41591-024-02856-4

[18] [19]

Lu, M.Y., Chen, B., Williamson, D.F.K., Chen, R.J., Liang, I., Ding, T., Jaume, G., Odintsov, I., Le, L.P., Gerber, G., Parwani, A.V., Zhang, A., Mahmood, F., 2024b. A multimodal generative AI copilot for human pathology. Nature 634, 466–473. doi:10.1038/s41586-024-07618-3

[19] [20]

Lucassen, R.T., Ciompi, F., Veta, M., Ciompi, F., Bulten, W., Balkenhol, M., Geessink, O., Smit, J., Litjens, G., Bejnordi, B.E., Pluim, J.P.W., van der Laak, J., Geijs, D.J., 2025. A tissue and cell-level annotated H&E and PD-L1 histopathology image dataset in non-small cell lung cancer. arXiv preprint arXiv:2507.16855. doi:10.5281/zenodo.17735903

[20] [21]

NVIDIA Corporation, 2023. Bionemo: Large language model framework for life sciences. https://www.nvidia.com/en-us/clara/bionemo/

[21] [22]

Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234–241. doi:10.1007/978-3-319-24574-4_28

[22] [23]

Saltz, J., Gupta, R., Hou, L., Kurc, T., Singh, P., Nguyen, V., Samaras, D., Shroyer, K.R., Zhao, T., Batiste, R., Van Arnam, J., The Cancer Genome Atlas Research Network, Shmulevich, I., Rao, A.U.K., Lazar, A.J., Sharma, A., Thorsson, V., 2018. Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Reports 23, 181–193.

[23] [24]

Schmidt, U., Weigert, M., Broaddus, C., Myers, G., 2018. Cell detection with star-convex polygons, in: Medical Image Computing and Computer Assisted Intervention (MICCAI), pp. 265–273. doi:10.1007/978-3-030-00934-2_30

[24] [25]

Stringer, C., Wang, T., Michaelos, M., Pachitariu, M., 2021. Cellpose: a generalist algorithm for cellular segmentation. Nature Methods 18, 100–106. doi:10.1038/s41592-020-01018-x

[25] [26]

Tarvainen, A., Valpola, H., 2017. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, in: Advances in Neural Information Processing Systems (NeurIPS).

[26] [27]

Wan Ahmad, W.S.H.M., 2024. TCGA-UT cell instance and semantic pseudo-label dataset. HuggingFace Datasets. https://huggingface.co/datasets/mizjaggy18/tcga-ut-cell-instance-semantic

[27] [28]

Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q., Rush, A., 2020. Transformers: State-of-the-art natural language processing, in: Proceedings of the 2020 Confere... doi:10.18653/v1/2020.emnlp-demos.6

[28] [29]

Xiao, T., Liu, Y., Zhou, B., Jiang, Y., Sun, J., 2018. Unified perceptual parsing for scene understanding, in: Proceedings of the European Conference on Computer Vision (ECCV), pp. 418–434. doi:10.1007/978-3-030-01246-5_26

[29] [30]

Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., Luo, P., 2021. Segformer: Simple and efficient design for semantic segmentation with transformers, in: Advances in Neural Information Processing Systems (NeurIPS), pp. 12077–12090

[30] [31]

Xu, H., Usuyama, N., Bagga, J., Zhang, S., Rao, R., Tristan, N., Wong, C., Gero, Z., Javier, G., Poon, H., 2024. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188. doi:10.1038/s41586-024-07441-w