Needle in a Haystack: One-Class Representation Learning for Detecting Rare Malignant Cells in Computational Cytology
Pith reviewed 2026-05-10 17:01 UTC · model grok-4.3
The pith
One-class learning from normal cells alone detects rare malignant cells better than weakly supervised MIL methods, and in some cases better than fully supervised ones, in ultra-low-prevalence cytology.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DSVDD learns a compact hypersphere around normal cell representations from slide-negative patches alone and ranks test patches by distance from the center, achieving state-of-the-art abnormality detection at witness rates of 1 percent or lower and sometimes exceeding fully supervised baselines that require exhaustive instance labels. DROC offers competitive results by using distribution-augmented contrastive learning to handle rarity.
What carries the argument
Deep Support Vector Data Description (DSVDD), which embeds normal patches into a feature space and measures abnormality as Euclidean distance from a learned center point.
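The scoring rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes embeddings have already been produced by some encoder, takes the center to be the mean of the normal embeddings, and uses squared Euclidean distance, which yields the same ranking as Euclidean distance. The names `fit_center`, `anomaly_score`, and `rank_patches` are hypothetical, and the toy vectors are made up.

```python
from typing import List, Sequence

Vector = Sequence[float]


def fit_center(normal_embeddings: List[Vector]) -> List[float]:
    """Center of the hypersphere: here simply the mean normal embedding (an assumption)."""
    n = len(normal_embeddings)
    dim = len(normal_embeddings[0])
    return [sum(e[d] for e in normal_embeddings) / n for d in range(dim)]


def anomaly_score(embedding: Vector, center: Vector) -> float:
    """Squared Euclidean distance to the center; larger means more abnormal."""
    return sum((x - c) ** 2 for x, c in zip(embedding, center))


def rank_patches(test_embeddings: List[Vector], center: Vector) -> List[int]:
    """Indices of test patches, most abnormal first."""
    scores = [anomaly_score(e, center) for e in test_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy data: three "normal" embeddings clustered around the origin.
normal = [[0.0, 0.0], [0.2, -0.2], [-0.2, 0.2]]
center = fit_center(normal)

# Three test patches; the second lies far from the normal cluster.
test_patches = [[0.1, 0.0], [3.0, 4.0], [0.0, -0.05]]
ranking = rank_patches(test_patches, center)
print(ranking)
```

In a real pipeline the encoder would be trained so that normal patches map close to the center; only the scoring and ranking step is shown here.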
If this is right
- Instance-level detection becomes possible without any positive examples or instance-level annotations.
- Ranking accuracy remains high in ultra-low witness-rate regimes where MIL methods lose generalization.
- DROC gains robustness through explicit distribution augmentation in contrastive training.
- The distance-to-center score provides direct interpretability for flagged cells.
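To make "ranking accuracy at a given witness rate" concrete, the sketch below computes the witness rate of a set of patches and an AUROC over their anomaly scores. The source does not state which ranking metric the paper reports, so the choice of AUROC is an assumption, as are the names `witness_rate` and `auroc`; the scores and labels are invented toy values, not results.

```python
from typing import List


def witness_rate(labels: List[int]) -> float:
    """Fraction of patches that are malignant (label 1)."""
    return sum(labels) / len(labels)


def auroc(scores: List[float], labels: List[int]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy set: 198 normal patches and 2 malignant ones, i.e. a 1% witness rate.
neg_scores = [float(i) for i in range(198)]  # made-up anomaly scores
pos_scores = [500.0, 49.5]                   # one clear outlier, one subtle case
scores = neg_scores + pos_scores
labels = [0] * 198 + [1] * 2

rate = witness_rate(labels)
area = auroc(scores, labels)
print(rate, area)
```

With only two positives, a single poorly ranked malignant patch moves the metric substantially, which is one reason evaluation at very low witness rates tends to be noisy.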
Where Pith is reading between the lines
- The same normality-modeling strategy could transfer to rare-event detection in other high-resolution medical imaging tasks such as radiology.
- Pairing the one-class model with cheap slide-level labels might further boost performance without requiring instance annotations.
- Evaluating transfer across additional cancer types would test whether normal representations remain stable under varying cell morphologies.
Load-bearing premise
Representations learned exclusively from slide-negative patches will generalize to detect morphologically diverse malignant cells never seen during training.
What would settle it
A new test set with malignant morphologies not represented in the current evaluation data, on which DSVDD ranks positives no better than random or below MIL baselines, would show that the premise fails.
Original abstract
In computational cytology, detecting malignancy on whole-slide images is difficult because malignant cells are morphologically diverse yet vanishingly rare amid a vast background of normal cells. Accurate detection of these extremely rare malignant cells remains challenging due to large class imbalance and limited annotations. Conventional weakly supervised approaches, such as multiple instance learning (MIL), often fail to generalize at the instance level, especially when the fraction of malignant cells (witness rate) is exceedingly low. In this study, we explore the use of one-class representation learning techniques for detecting malignant cells in low-witness-rate scenarios. These methods are trained exclusively on slide-negative patches, without requiring any instance-level supervision. Specifically, we evaluate two OCC approaches, DSVDD and DROC, and compare them with FS-SIL, WS-SIL, and the recent ItS2CLR method. The one-class methods learn compact representations of normality and detect deviations at test time. Experiments on a publicly available bone marrow cytomorphology dataset (TCIA) and an in-house oral cancer cytology dataset show that DSVDD achieves state-of-the-art performance in instance-level abnormality ranking, particularly in ultra-low witness-rate regimes ($\leq 1\%$) and, in some cases, even outperforming fully supervised learning, which is typically not a practical option in whole-slide cytology due to the infeasibility of exhaustive instance-level annotations. DROC is also competitive under extreme rarity, benefiting from distribution-augmented contrastive learning. These findings highlight one-class representation learning as a robust and interpretable superior choice to MIL for malignant cell detection under extreme rarity.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that one-class representation learning methods (DSVDD and DROC), trained exclusively on slide-negative patches without instance-level labels, learn compact normality representations that enable superior instance-level abnormality ranking for rare malignant cells in computational cytology. On the TCIA bone marrow cytomorphology dataset and an in-house oral cancer dataset, DSVDD achieves state-of-the-art performance particularly in ultra-low witness-rate regimes (≤1%), sometimes outperforming fully supervised learning, while DROC is competitive; both outperform MIL baselines like ItS2CLR under extreme rarity.
Significance. If the empirical results hold after proper validation, the work would be significant for computational pathology by demonstrating a practical, annotation-free alternative to MIL and supervised methods for detecting vanishingly rare malignant cells amid morphologically diverse backgrounds in whole-slide images, where exhaustive labeling is infeasible.
major comments (2)
- The manuscript provides no details on data splits, hyperparameter selection, ablation studies, or statistical testing for the reported performance gains (abstract and experiments sections). This prevents assessment of whether DSVDD's SOTA claims at ≤1% witness rates are robust or reproducible.
- The central generalization assumption—that representations learned solely from slide-negative patches produce a compact normal manifold whose complement reliably ranks all morphologically diverse malignant cells—is not validated with subtype-specific analysis, manifold visualizations, or failure-case examination on the TCIA and oral-cancer test sets (results and discussion sections). This is load-bearing for the claim of outperformance over supervised baselines.
minor comments (1)
- Acronyms FS-SIL, WS-SIL, and ItS2CLR are used in the abstract without prior definition.
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments. We agree that the manuscript requires additional details and validation to strengthen the claims. We address each major comment below and outline the revisions we will make.
- Referee: The manuscript provides no details on data splits, hyperparameter selection, ablation studies, or statistical testing for the reported performance gains (abstract and experiments sections). This prevents assessment of whether DSVDD's SOTA claims at ≤1% witness rates are robust or reproducible.
  Authors: We agree that the original manuscript omitted critical experimental details, which limits reproducibility and assessment of robustness. In the revised version, we will add a comprehensive 'Experimental Setup' section that specifies: patient-level data splits (e.g., 70/15/15 train/validation/test with no slide overlap to prevent leakage), the hyperparameter selection process (grid search over validation sets for DSVDD radius/center and DROC augmentation parameters), ablation studies on backbone networks, loss terms, and witness-rate sampling strategies, and statistical testing (mean ± std over 5 random seeds with paired Wilcoxon signed-rank tests and p-values for all comparisons against MIL baselines at ≤1% witness rates). These changes will directly support the SOTA claims.
  revision: yes
- Referee: The central generalization assumption—that representations learned solely from slide-negative patches produce a compact normal manifold whose complement reliably ranks all morphologically diverse malignant cells—is not validated with subtype-specific analysis, manifold visualizations, or failure-case examination on the TCIA and oral-cancer test sets (results and discussion sections). This is load-bearing for the claim of outperformance over supervised baselines.
  Authors: We acknowledge this as a substantive point: the assumption underpins the outperformance claims, and indirect evidence from two datasets is insufficient without direct validation. While the consistent gains of DSVDD/DROC over ItS2CLR and supervised baselines at ultra-low witness rates empirically support a compact normality representation that generalizes across morphological diversity, we will strengthen this in revision. We will add UMAP visualizations of the learned embeddings (showing tight normal clusters and outlier malignant points), subtype-specific breakdowns (using TCIA cell-type annotations and qualitative morphological diversity analysis for the oral dataset), and a dedicated failure-case discussion (e.g., cases of atypical normals or rare malignant variants that receive lower ranks). This will make the generalization claim more robust without changing the core results.
  revision: yes
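The rebuttal's proposed statistical test (a paired Wilcoxon signed-rank test over 5 random seeds) can be sketched with an exact enumeration of the null distribution. This is an illustrative sketch, not the authors' analysis: the function name `wilcoxon_exact` is hypothetical, the per-seed scores are invented, and the code assumes no zero differences and no tied absolute differences. It also exposes a limit of the design: with only 5 pairs, the smallest achievable two-sided p-value is 2/32 = 0.0625, so a 0.05 threshold cannot be reached even when one method wins on every seed.

```python
from itertools import product
from typing import List, Tuple


def wilcoxon_exact(x: List[float], y: List[float]) -> Tuple[int, float]:
    """Exact two-sided paired Wilcoxon signed-rank test.

    Returns (W+, p-value). Assumes no zero differences and no ties in |difference|.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    abs_diffs = [abs(d) for d in diffs]
    if any(d == 0 for d in diffs) or len(set(abs_diffs)) != n:
        raise ValueError("zero or tied differences are not handled in this sketch")

    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs_diffs[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    total = n * (n + 1) // 2
    w_obs = min(w_plus, total - w_plus)

    # Under the null, each rank carries a + or - sign with equal probability.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= w_obs:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Invented per-seed scores for two methods (5 seeds); not results from the paper.
method_a = [0.90, 0.88, 0.93, 0.91, 0.95]
method_b = [0.80, 0.81, 0.79, 0.85, 0.80]

w_plus, p_value = wilcoxon_exact(method_a, method_b)
print(w_plus, p_value)
```

If significance at the 0.05 level is the goal, more seeds (or more paired units, such as per-slide scores) would be needed; with 6 pairs the minimum two-sided p-value drops to 2/64.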
Circularity Check
No circularity: empirical benchmark on held-out test data
The paper reports experimental results from training DSVDD and DROC exclusively on slide-negative patches and evaluating instance-level ranking on held-out test patches from TCIA and oral-cancer datasets. No equations, derivations, or first-principles claims are present that reduce any reported metric (e.g., ranking performance at ≤1% witness rate) to a fitted parameter or self-citation by construction. All performance numbers are obtained via standard train/test splits on external data, with comparisons to FS-SIL, WS-SIL, and ItS2CLR baselines. This is a self-contained empirical study whose central claims rest on observable test-set outcomes rather than any definitional or self-referential reduction.