CLEAR-HPV: Interpretable concept discovery for human-papillomavirus-associated morphology in whole-slide histology
Pith reviewed 2026-05-25 07:16 UTC · model grok-4.3
The pith
CLEAR-HPV restructures attention-based MIL latent space to discover keratinizing, basaloid and stromal concepts for HPV prediction without concept labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CLEAR-HPV restructures the MIL latent space using attention to enable concept discovery without requiring concept labels during training. Operating in an attention-weighted latent space, CLEAR-HPV automatically discovers keratinizing, basaloid, and stromal morphologic concepts, generates spatial concept maps, and represents each slide using a compact concept-fraction vector. CLEAR-HPV's concept-fraction vectors preserve the predictive information of the original MIL embeddings while reducing the high-dimensional feature space to only 10 interpretable concepts. CLEAR-HPV generalizes consistently across TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC.
What carries the argument
Attention-weighted latent space restructuring that isolates morphologic concepts and produces concept-fraction vectors from attention-based MIL embeddings.
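To make the mechanism concrete, here is a minimal sketch of standard attention-based MIL pooling. The attention form follows the usual ABMIL formulation, a_k = softmax(w^T tanh(V h_k)). Scaling each patch embedding by its attention weight is only one possible reading of "attention-weighted latent space"; this page does not give the paper's exact restructuring step. All array names, sizes, and parameter values below are hypothetical.

```python
import numpy as np

def attention_weights(H, V, w):
    """ABMIL-style attention: a_k = softmax_k(w^T tanh(V h_k))."""
    scores = np.tanh(H @ V.T) @ w        # one score per patch, shape (n_patches,)
    scores = scores - scores.max()       # subtract max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()

def attention_weighted_space(H, a):
    """Assumed restructuring step: scale each patch embedding by its attention."""
    return H * a[:, None]

rng = np.random.default_rng(0)
n_patches, d, hidden = 200, 1536, 64     # 1536 matches the abstract's example
H = rng.normal(size=(n_patches, d))      # hypothetical patch embeddings
V = rng.normal(size=(hidden, d)) * 0.01  # hypothetical attention parameters
w = rng.normal(size=hidden)

a = attention_weights(H, V, w)
Z = attention_weighted_space(H, a)       # attention-weighted patch embeddings
slide_embedding = Z.sum(axis=0)          # usual attention-pooled slide vector
```

In this reading, patches with near-zero attention collapse toward the origin, so any later grouping step is dominated by the patches the classifier actually relied on.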
If this is right
- Each slide is represented by a 10-dimensional concept-fraction vector that matches the HPV prediction accuracy of the original high-dimensional embedding.
- Spatial concept maps highlight the locations of keratinizing, basaloid, and stromal patterns within each whole-slide image.
- The framework applies across different attention-based MIL backbones and three independent datasets without requiring concept labels during training.
- High-dimensional feature spaces can be replaced by these compact interpretable vectors for downstream analysis while keeping predictive power.
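The two outputs named in these bullets can be sketched as follows, assuming each patch has already been assigned one of 10 concept indices. Whether the paper counts patches equally or weights them by attention is not stated here; this sketch uses plain counts. The helper names, the grid layout, and the toy labels are illustrative only.

```python
import numpy as np

def concept_fractions(labels, n_concepts=10):
    """Fraction of a slide's patches assigned to each concept (sums to 1)."""
    counts = np.bincount(labels, minlength=n_concepts)
    return counts / counts.sum()

def concept_map(coords, labels, grid_shape):
    """Place each patch's concept index at its grid position; -1 marks background."""
    grid = np.full(grid_shape, -1, dtype=int)
    grid[coords[:, 0], coords[:, 1]] = labels
    return grid

# Toy slide: 8 patches on a 3x3 grid, hypothetical concept assignments.
labels = np.array([0, 0, 3, 3, 3, 9, 0, 3])
coords = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [1, 2], [2, 2], [0, 2], [2, 1]])

frac = concept_fractions(labels)          # 10-dimensional slide representation
cmap = concept_map(coords, labels, (3, 3))
```

The fraction vector is what would be passed to a downstream classifier in place of the high-dimensional embedding, while the grid is what would be rendered as an overlay on the slide.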
Where Pith is reading between the lines
- The same attention-restructuring step could be tested on other slide-level biomarkers such as PD-L1 or microsatellite instability to check whether similar concept discovery occurs.
- Concept-fraction vectors might serve as input features for survival models to test whether the discovered morphology concepts carry prognostic value beyond HPV status alone.
- If the concepts align with known histologic subtypes, the method could support prospective studies that measure whether providing these maps to pathologists changes diagnostic consistency.
Load-bearing premise
That attention weights in the latent space isolate distinct morphologic patterns that correspond to actual tissue biology without any concept supervision.
What would settle it
Pathologist manual annotation of keratinizing, basaloid, and stromal regions on a held-out set of slides; if the automatically generated spatial concept maps show low overlap with those annotations, the discovery claim fails.
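The overlap in this test could be measured with a Dice coefficient per concept. The sketch below assumes binary masks on a shared patch grid; the choice of Dice rather than, say, intersection-over-union is ours, and the masks are toy values rather than data from the paper.

```python
import numpy as np

def dice(pred_mask, true_mask):
    """Dice overlap between two binary masks; 1.0 when both are empty."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / denom

# Toy example: model-predicted "keratinizing" patches vs. a pathologist mask.
pred = np.array([[1, 1, 0],
                 [0, 1, 0],
                 [0, 0, 0]])
true = np.array([[1, 0, 0],
                 [0, 1, 1],
                 [0, 0, 0]])

score = dice(pred, true)   # 2 shared patches, 3 + 3 marked patches in total
```

Reporting this score separately for the keratinizing, basaloid, and stromal maps would show which concepts, if any, track the annotations.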
Original abstract
Human papillomavirus (HPV) status is a critical determinant of prognosis and treatment response in head and neck and cervical cancers. Although attention-based multiple instance learning (MIL) achieves strong slide-level prediction for HPV-related whole-slide histopathology, it provides limited morphologic interpretability. To address this limitation, we introduce Concept-Level Explainable Attention-guided Representation for HPV (CLEAR-HPV), a framework that restructures the MIL latent space using attention to enable concept discovery without requiring concept labels during training. Operating in an attention-weighted latent space, CLEAR-HPV automatically discovers keratinizing, basaloid, and stromal morphologic concepts, generates spatial concept maps, and represents each slide using a compact concept-fraction vector. CLEAR-HPV's concept-fraction vectors preserve the predictive information of the original MIL embeddings while reducing the high-dimensional feature space (e.g., 1536 dimensions) to only 10 interpretable concepts. CLEAR-HPV generalizes consistently across TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC, providing compact, concept-level interpretability through a general, backbone-agnostic framework for attention-based MIL models of whole-slide histopathology.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces CLEAR-HPV, a framework that restructures the latent space of attention-based multiple instance learning (MIL) models to enable unsupervised discovery of morphologic concepts (keratinizing, basaloid, stromal) in HPV-associated whole-slide histopathology. It generates spatial concept maps and represents slides with 10-dimensional concept-fraction vectors that preserve the predictive power of the original high-dimensional (e.g., 1536-dimensional) MIL embeddings while generalizing across the TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC cohorts.
Significance. If the quantitative results support the claims, this provides a general, backbone-agnostic method for adding concept-level interpretability to MIL models in computational pathology without requiring labeled concepts during training. The reduction to a compact, interpretable representation while maintaining predictive performance and cross-dataset consistency could facilitate clinical translation by linking model predictions to known histologic features. The unsupervised concept discovery and reported generalization across independent cohorts are notable strengths.
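One way to put a number on "maintaining predictive performance" is the AUC difference between a classifier on the full embedding and one on the concept-fraction vector. The sketch below computes AUC from its rank definition, the probability that a positive slide scores above a negative one. The labels and scores are invented toy values for illustration only and are not results from the paper.

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()   # ties count as half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical HPV labels and model scores for six slides.
y = np.array([1, 0, 1, 1, 0, 0])
full_scores = np.array([0.9, 0.2, 0.8, 0.4, 0.5, 0.1])     # full-embedding model
concept_scores = np.array([0.8, 0.3, 0.7, 0.6, 0.5, 0.2])  # 10-concept model

auc_full = roc_auc(y, full_scores)
auc_concept = roc_auc(y, concept_scores)
delta = auc_concept - auc_full   # the "AUC delta" a quantitative summary would report
```

A delta near zero on each held-out cohort, with confidence intervals, would be the direct evidence for the preservation claim.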
minor comments (2)
- [Abstract] The abstract asserts that concept-fraction vectors 'preserve the predictive information' and 'generalize consistently' but supplies no numerical metrics (e.g., AUC deltas, correlation coefficients, or ablation results); a brief quantitative summary should be added.
- [Methods] The description of how the attention-weighted latent space is restructured for concept discovery (e.g., clustering or factorization procedure) is referenced but not detailed in the provided abstract; ensure the methods section supplies the exact algorithm and hyperparameters.
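As an illustration of what such a procedure might look like, the sketch below clusters attention-weighted patch embeddings with plain k-means (Lloyd's algorithm) into 10 groups. That the paper uses k-means at all is an assumption; the abstract does not name the algorithm, and the initialization, iteration count, and toy dimensions here are placeholders rather than the authors' hyperparameters.

```python
import numpy as np

def nearest(X, centers):
    """Index of the closest center for every row of X (squared Euclidean)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm with random initial centers drawn from the data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return nearest(X, centers), centers

rng = np.random.default_rng(1)
H = rng.normal(size=(300, 16))            # hypothetical patch embeddings (toy size)
a = rng.dirichlet(np.ones(300))           # hypothetical attention weights, sum to 1
Z = H * a[:, None]                        # attention-weighted embeddings

labels, centers = kmeans(Z, k=10)         # 10 candidate "concepts"
```

Any real implementation would also need a rule for naming clusters (for example, pathologist review of top patches per cluster), which this sketch does not attempt.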
Simulated Author's Rebuttal
We thank the referee for the positive assessment of CLEAR-HPV, the recognition of its backbone-agnostic interpretability contribution, and the recommendation for minor revision. No specific major comments appear in the provided report, so we have no point-by-point replies to offer. We remain ready to incorporate any additional feedback or perform minor edits if requested by the editor.
Circularity Check
No significant circularity detected
The paper presents CLEAR-HPV as a restructuring of attention-based MIL latent space to enable unsupervised concept discovery, with claims that concept-fraction vectors preserve predictive information while reducing dimensionality from 1536 to 10 concepts. No equations, derivations, or self-citations are provided in the abstract or description that would reduce any prediction or result to fitted inputs by construction. The framework is described as backbone-agnostic and generalizing across independent cohorts (TCGA-HNSCC, TCGA-CESC, CPTAC-HNSCC), with no load-bearing steps that equate outputs to inputs via definition or self-referential fitting. The derivation chain appears self-contained as an extension of existing MIL methods without circular reductions.