Cluster-First Labelling: An Automated Pipeline for Segmentation and Morphological Clustering in Histology Whole Slide Images
Pith reviewed 2026-05-10 16:28 UTC · model grok-4.3
The pith
A cluster-first pipeline segments histology images and groups similar tissue components so humans label clusters instead of individual objects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The system tiles whole slide images, filters uninformative areas, segments tissue components, extracts neural embeddings, reduces dimensionality, and applies density-based clustering to produce groups of morphologically similar objects. Human labeling then occurs at the cluster level rather than for each individual component, producing a weighted cluster-label alignment accuracy of 96.8 percent across 3,696 evaluated structures from 13 tissue types in human, rat, and rabbit samples, with perfect agreement in seven of those types.
What carries the argument
The cluster-first paradigm, in which unsupervised morphological clustering of segmented objects occurs before any human labeling, shifting effort from individuals to representative groups.
If this is right
- Annotation effort drops by orders of magnitude for slides containing tens of thousands of structures.
- The pipeline handles diverse tissue types from multiple species with high measured alignment to human judgments.
- Seven of the 13 tested tissue types reach perfect cluster-label agreement.
- The full pipeline, companion web application, and evaluation code are released as open-source software.
Where Pith is reading between the lines
- The method could enable routine morphological analysis of slide repositories that are currently too large for manual labeling.
- Similar cluster-first strategies might transfer to other high-volume biomedical imaging tasks beyond histology.
- The open-source components could support community experiments on additional tissue datasets to test cluster stability.
Load-bearing premise
The unsupervised clusters formed from image embeddings correspond to categories that human annotators would consistently recognize and label in the same way.
What would settle it
A new collection of whole slide images in which many clusters contain structures receiving inconsistent human labels, resulting in alignment accuracy substantially below 96.8 percent.
Original abstract
Labelling tissue components in histology whole slide images (WSIs) is prohibitively labour-intensive: a single slide may contain tens of thousands of structures (cells, nuclei, and other morphologically distinct objects), each requiring manual boundary delineation and classification. We present a cloud-native, end-to-end pipeline that automates this process through a cluster-first paradigm. Our system tiles WSIs, filters out tiles deemed unlikely to contain valuable information, segments tissue components with Cellpose-SAM (including cells, nuclei, and other morphologically similar structures), extracts neural embeddings via a pretrained ResNet-50, reduces dimensionality with UMAP, and groups morphologically similar objects using DBSCAN clustering. Under this paradigm, a human annotator labels representative clusters rather than individual objects, reducing annotation effort by orders of magnitude. We evaluate the pipeline on 3,696 tissue components across 13 diverse tissue types from three species (human, rat, rabbit), measuring how well unsupervised clusters align with independent human labels via per-tile Hungarian-algorithm matching. Our system achieves a weighted cluster-label alignment accuracy of 96.8%, with 7 of 13 tissue types reaching perfect agreement. The pipeline, a companion labelling web application, and all evaluation code are released as open-source software.
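The final grouping step of the abstract can be illustrated with a toy, pure-Python DBSCAN. This is a minimal sketch, not the authors' implementation: the 2-D points stand in for UMAP-reduced embeddings, and the eps and min_samples values are illustrative assumptions rather than the paper's settings.

```python
import math

NOISE = -1  # conventional DBSCAN label for points that join no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Toy O(n^2) DBSCAN: returns one cluster id per point, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may later be reclaimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Hypothetical 2-D "embeddings": two tight groups and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
cluster_ids = dbscan(points, eps=1.5, min_samples=3)
print(cluster_ids)
```

Under the cluster-first paradigm, an annotator would then label the two resulting groups once each instead of labelling six objects, and the outlier would be handled separately as noise.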
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a cloud-native, end-to-end pipeline for automating segmentation and morphological clustering in histology whole slide images (WSIs) under a cluster-first labeling paradigm. WSIs are tiled and filtered, tissue components (cells, nuclei, and similar structures) are segmented via Cellpose-SAM, embeddings are extracted with a pretrained ResNet-50, dimensionality is reduced with UMAP, and objects are grouped with DBSCAN. A human then labels representative clusters rather than individual objects. The pipeline is evaluated on 3,696 tissue components across 13 tissue types from three species (human, rat, rabbit), reporting a weighted cluster-label alignment accuracy of 96.8% (with perfect agreement in 7 tissue types) obtained via per-tile Hungarian-algorithm matching. The pipeline, a companion labeling web application, and all evaluation code are released as open source.
Significance. If the reported alignment holds under a global cluster-labeling regime, the work could reduce annotation effort in digital pathology by orders of magnitude, shifting the burden from labeling tens of thousands of individual objects to labeling a much smaller number of clusters. The open-source release of the full pipeline, web app, and reproducible evaluation code is a clear strength that supports adoption and extension. The significance is nevertheless conditional on whether the unsupervised clusters are morphologically consistent enough to receive a single, stable human label across their full extent.
major comments (1)
- [Evaluation procedure] Evaluation procedure (abstract and results): The 96.8% weighted cluster-label alignment accuracy is obtained via per-tile Hungarian matching after DBSCAN on the pooled set of 3,696 components. Because a single cluster ID can contain morphologically similar objects drawn from different tissue types or species that carry distinct human labels, the per-tile optimal assignment permits the same cluster to be matched to different labels in different tiles. This metric therefore does not test the central cluster-first claim that a single human-assigned label to an entire cluster would be correct across the cluster's full extent.
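The referee's concern can be made concrete with a small constructed example. The sketch below uses hypothetical tiles and labels, and it replaces the Hungarian algorithm with a brute-force search over one-to-one assignments, which finds the same optimum but is only practical for a handful of clusters. It compares matching performed separately in each tile with a single matching over the pooled objects.

```python
from itertools import permutations


def best_matching_correct(cluster_ids, human_labels):
    """Largest number of objects whose cluster maps to their human label
    under any one-to-one cluster-to-label assignment (brute force)."""
    clusters = sorted(set(cluster_ids))
    labels = sorted(set(human_labels))
    # Pad the shorter side with None so every assignment is one-to-one.
    size = max(len(clusters), len(labels))
    clusters += [None] * (size - len(clusters))
    labels += [None] * (size - len(labels))
    pairs = list(zip(cluster_ids, human_labels))
    best = 0
    for perm in permutations(labels):
        mapping = dict(zip(clusters, perm))
        correct = sum(1 for c, h in pairs if mapping[c] == h)
        best = max(best, correct)
    return best


# Hypothetical data: cluster 0 holds nuclei in tile_A but lymphocytes in tile_B.
tiles = {
    "tile_A": ([0, 0, 1, 1], ["nucleus", "nucleus", "adipocyte", "adipocyte"]),
    "tile_B": ([0, 0, 1, 1], ["lymphocyte", "lymphocyte", "adipocyte", "adipocyte"]),
}
total = sum(len(cs) for cs, _ in tiles.values())

# Matching recomputed inside every tile: cluster 0 may change meaning per tile.
per_tile_accuracy = sum(best_matching_correct(cs, hs) for cs, hs in tiles.values()) / total

# One matching over all objects: each cluster gets a single label everywhere.
pooled_clusters = [c for cs, _ in tiles.values() for c in cs]
pooled_labels = [h for _, hs in tiles.values() for h in hs]
pooled_accuracy = best_matching_correct(pooled_clusters, pooled_labels) / total

print(per_tile_accuracy, pooled_accuracy)
```

In this constructed case the per-tile score is perfect while the pooled score is 0.75, because cluster 0 cannot carry both labels at once. Whether the paper's data behaves this way is exactly what the referee asks the authors to measure.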
minor comments (2)
- [Methods] The exact tile-filtering criteria, the procedure used to select DBSCAN eps and min_samples (and UMAP n_neighbors, min_dist), and whether the 96.8% figure is a single run or an average are not stated, limiting reproducibility.
- [Results] The abstract and results would benefit from a brief statement of the number of clusters produced and the distribution of cluster sizes, which directly affects the claimed reduction in annotation effort.
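The statistics requested in the second minor comment could be computed along the following lines. This is a sketch under stated assumptions: the cluster ids are made up, noise points are assumed to carry the label -1, and the function and field names are hypothetical rather than taken from the released code.

```python
from collections import Counter
from statistics import median


def cluster_summary(cluster_ids, noise_label=-1):
    """Cluster count, size distribution, and objects covered per label decision."""
    sizes = Counter(c for c in cluster_ids if c != noise_label)
    if not sizes:
        raise ValueError("no clusters found (all points are noise)")
    n_clustered = sum(sizes.values())
    return {
        "n_objects": len(cluster_ids),
        "n_clusters": len(sizes),
        "n_noise": len(cluster_ids) - n_clustered,
        "min_size": min(sizes.values()),
        "median_size": median(sizes.values()),
        "max_size": max(sizes.values()),
        # Average number of objects covered by one cluster-level label.
        "objects_per_label": n_clustered / len(sizes),
    }


# Hypothetical cluster assignment for 12 segmented objects.
example_ids = [0] * 6 + [1] * 3 + [2] * 1 + [-1] * 2
summary = cluster_summary(example_ids)
print(summary)
```

The objects_per_label figure is the quantity that bounds the claimed reduction in annotation effort: if most clusters are small, the saving over per-object labelling shrinks accordingly.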
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the strengths and limitations of our evaluation. We address the major concern on the evaluation procedure below and will update the manuscript to strengthen the validation of the cluster-first paradigm.
Point-by-point responses
- Referee: [Evaluation procedure] Evaluation procedure (abstract and results): The 96.8% weighted cluster-label alignment accuracy is obtained via per-tile Hungarian matching after DBSCAN on the pooled set of 3,696 components. Because a single cluster ID can contain morphologically similar objects drawn from different tissue types or species that carry distinct human labels, the per-tile optimal assignment permits the same cluster to be matched to different labels in different tiles. This metric therefore does not test the central cluster-first claim that a single human-assigned label to an entire cluster would be correct across the cluster's full extent.
Authors: We agree that the per-tile Hungarian matching does not directly test global label consistency within each cluster, which is central to the cluster-first claim. Although DBSCAN is performed on the pooled embeddings and the high accuracy indicates effective morphological grouping, the per-tile optimal assignment can mask cases where a single cluster spans objects with differing human labels across tiles or tissue types. To address this, we will revise the Results and Evaluation sections to add two complementary global metrics: (1) cluster purity, defined as the fraction of each cluster's members sharing the majority human label, averaged across clusters weighted by size; and (2) a simulated cluster-first accuracy obtained by assigning the majority label to every member of a cluster and computing agreement with the full set of individual human labels. These additions will provide a direct assessment of whether a single label per cluster is reliable across its full extent. We will report both the original per-tile metric and the new global metrics for transparency.
Revision: yes
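The two global metrics proposed in the response can be sketched directly from their verbal definitions. The code below is an illustration on made-up labels, not the authors' evaluation code. The function names are hypothetical, and the handling of DBSCAN noise points is left open (here every cluster id is treated alike).

```python
from collections import Counter, defaultdict


def label_counts_by_cluster(cluster_ids, human_labels):
    """Map each cluster id to a Counter of the human labels of its members."""
    by_cluster = defaultdict(Counter)
    for c, h in zip(cluster_ids, human_labels):
        by_cluster[c][h] += 1
    return by_cluster


def size_weighted_purity(cluster_ids, human_labels):
    """Per-cluster majority fraction, averaged with weights n_c / N."""
    by_cluster = label_counts_by_cluster(cluster_ids, human_labels)
    n = len(cluster_ids)
    purity = 0.0
    for counts in by_cluster.values():
        size = sum(counts.values())
        purity += (size / n) * (max(counts.values()) / size)
    return purity


def simulated_cluster_first_accuracy(cluster_ids, human_labels):
    """Assign each cluster its majority label, then score every object."""
    by_cluster = label_counts_by_cluster(cluster_ids, human_labels)
    majority = {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}
    hits = sum(1 for c, h in zip(cluster_ids, human_labels) if majority[c] == h)
    return hits / len(cluster_ids)


# Hypothetical pooled data: cluster 0 is mostly nuclei with one lymphocyte.
ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["nucleus", "nucleus", "nucleus", "lymphocyte",
          "adipocyte", "adipocyte", "adipocyte", "adipocyte"]
purity = size_weighted_purity(ids, labels)
accuracy = simulated_cluster_first_accuracy(ids, labels)
print(purity, accuracy)
```

As defined here the two numbers coincide: weighting each cluster's majority fraction by its share of objects reduces to the total count of majority-label members divided by N, which is the majority-label accuracy. Both are also computed with knowledge of all human labels, so they describe an upper bound on what an annotator labelling from a few representatives would achieve.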
Circularity Check
No significant circularity detected; evaluation is post-hoc and independent.
Full rationale
The paper describes an unsupervised pipeline (Cellpose-SAM segmentation, ResNet-50 embeddings, UMAP dimensionality reduction, DBSCAN clustering) followed by a separate evaluation step that computes per-tile Hungarian matching between cluster IDs and independent human labels on 3,696 components. The reported 96.8% weighted alignment accuracy is a post-hoc measurement and is not used to fit, select, or optimize any pipeline parameters. No self-definitional equations, fitted-input predictions, load-bearing self-citations, uniqueness theorems, or ansatzes are present that would reduce the central claim to its own inputs by construction. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- DBSCAN eps and min_samples
- UMAP n_neighbors and min_dist
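The source does not say how these parameters were chosen. One commonly used heuristic for eps is to sort each point's distance to its k-th nearest neighbour and look for a sharp rise ("knee") in the curve. The sketch below shows that heuristic on hypothetical 2-D points, and it is offered as an assumption about a reasonable procedure, not as the authors' method.

```python
import math


def sorted_k_distances(points, k):
    """Distance from each point to its k-th nearest other point, ascending."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Hypothetical reduced embeddings: a tight square plus one distant outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
curve = sorted_k_distances(points, k=2)
print(curve)
```

Here the first four values are 1.0 and the last jumps above 13, so any eps between those levels keeps the square together and leaves the outlier as noise. On real embeddings the knee is rarely this clean, which is one reason the referee asks for the selection procedure to be documented.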
axioms (2)
- domain assumption: Cellpose-SAM produces accurate instance segmentations of cells and nuclei in the tested histology images
- domain assumption: ResNet-50 embeddings capture morphological similarity relevant to human labeling decisions