Learning Label-Efficient Interpretable Medical Image Diagnosis via Semi-supervised Hypergraph Concept Bottleneck Model
Pith reviewed 2026-06-28 15:07 UTC · model grok-4.3
The pith
A semi-supervised concept bottleneck model with dual-level hypergraphs improves interpretability and accuracy in medical image diagnosis using fewer expert labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By combining a concept-level hypergraph for modeling inter-concept dependencies with an image-level hypergraph for domain-adaptive pseudo-label generation inside a semi-supervised concept bottleneck architecture, the model achieves higher accuracy and interpretability than prior CBMs while requiring substantially fewer manual concept annotations.
What carries the argument
Dual-level hypergraph learning, in which the concept-level hypergraph reasons over high-order concept relations and the image-level hypergraph generates robust pseudo-labels for unlabeled images.
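To make "reasoning over high-order concept relations" concrete, here is a minimal sketch of one hypergraph propagation step. It uses the widely known HGNN-style normalization (Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta). This is an assumption for illustration, since the summary above does not specify the paper's operator. The function name `hypergraph_conv`, the toy incidence matrix, and the identity weights are all hypothetical.

```python
import numpy as np

def hypergraph_conv(X, H, Theta, w=None):
    """One HGNN-style step: ReLU(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta).

    X:     (n_nodes, d_in) node features, e.g. concept embeddings (assumed)
    H:     (n_nodes, n_edges) incidence matrix, H[i, e] = 1 if node i is in hyperedge e
    Theta: (d_in, d_out) learnable projection (fixed here for illustration)
    w:     optional (n_edges,) hyperedge weights, defaults to all ones
    """
    H = np.asarray(H, dtype=float)
    n_edges = H.shape[1]
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)

    dv = H @ w            # node degree: weighted number of incident hyperedges
    de = H.sum(axis=0)    # hyperedge degree: number of member nodes

    dv_inv_sqrt = np.zeros_like(dv)
    dv_inv_sqrt[dv > 0] = dv[dv > 0] ** -0.5
    de_inv = np.zeros_like(de)
    de_inv[de > 0] = 1.0 / de[de > 0]

    left = dv_inv_sqrt[:, None] * H               # Dv^-1/2 H
    A = (left * (w * de_inv)[None, :]) @ left.T   # normalized node-to-node operator
    return np.maximum(A @ X @ Theta, 0.0)         # ReLU

# Toy example: 4 concepts, 2 hyperedges; concept 2 belongs to both.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)
out = hypergraph_conv(np.eye(4), H, np.eye(4))
print(np.round(out, 3))
```

Unlike an ordinary graph edge, one hyperedge here ties three concepts together at once. Concepts 0 and 3 share no hyperedge, so they exchange no information in a single step.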
If this is right
- Clinicians gain the ability to intervene on individual concepts while the model still accounts for their mutual dependencies.
- New medical imaging tasks can be trained with far less expert time spent annotating intermediate concepts.
- The same dual-hypergraph structure transfers across ultrasound and dermoscopic modalities without task-specific redesign.
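The first consequence above, clinician intervention on individual concepts, can be sketched for a plain concept bottleneck. This is a minimal illustration assuming a linear label head over sigmoid concept activations. It does not include the hypergraph reasoning described in the paper, and the names `predict_with_intervention`, `W`, and `b` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_with_intervention(concept_logits, W, b, interventions=None):
    """Predict a class from concepts, optionally overriding some concepts.

    concept_logits: (k,) raw concept scores from the image encoder (assumed)
    W, b:           linear label head of shape (n_classes, k) and (n_classes,)
    interventions:  dict {concept_index: 0 or 1} supplied by a clinician
    """
    c = sigmoid(np.asarray(concept_logits, dtype=float))
    for idx, value in (interventions or {}).items():
        c[idx] = float(value)       # replace the prediction with the expert's value
    class_logits = W @ c + b
    return c, int(np.argmax(class_logits))

concept_logits = [2.0, -1.0, 0.5]
W = np.array([[1.0, -2.0, 0.0],
              [-1.0, 2.0, 0.5]])
b = np.zeros(2)

_, before = predict_with_intervention(concept_logits, W, b)
c_after, after = predict_with_intervention(concept_logits, W, b, {1: 1})
print(before, after)
```

In this toy setup, correcting concept 1 from "absent" to "present" flips the predicted class. In a model that also propagates concept dependencies, the corrected value would additionally influence related concepts before the label head is applied.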
Where Pith is reading between the lines
- If the pseudo-label mechanism proves stable across hospitals, the framework could lower the barrier to deploying interpretable models in additional high-stakes imaging domains.
- The approach suggests a route to test whether explicit modeling of concept co-occurrence graphs improves calibration of uncertainty estimates in safety-critical settings.
Load-bearing premise
The hypergraph structures accurately reflect genuine clinical concept relationships and the pseudo-labels they generate remain reliable enough that they do not require extensive additional expert correction.
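The reliability concern can be made tangible with a simple stand-in for pseudo-label generation. The sketch below treats each unlabeled image and its k nearest labeled neighbors as one hyperedge, averages the neighbors' concept labels, and flags only confident concepts. This kNN voting scheme is swapped in for illustration and is not claimed to be the paper's method. The function name, `k`, and the threshold `tau` are assumptions.

```python
import numpy as np

def knn_hyperedge_pseudo_labels(feat_labeled, concepts_labeled, feat_unlabeled, k=3, tau=0.8):
    """Return hard pseudo concept labels and a per-concept confidence mask."""
    feat_labeled = np.asarray(feat_labeled, dtype=float)
    feat_unlabeled = np.asarray(feat_unlabeled, dtype=float)
    concepts_labeled = np.asarray(concepts_labeled, dtype=float)

    # Pairwise Euclidean distances, shape (n_unlabeled, n_labeled).
    d = np.linalg.norm(feat_unlabeled[:, None, :] - feat_labeled[None, :, :], axis=2)
    members = np.argsort(d, axis=1)[:, :k]          # labeled members of each hyperedge
    soft = concepts_labeled[members].mean(axis=1)   # vote share per concept
    hard = (soft >= 0.5).astype(int)
    confident = np.maximum(soft, 1.0 - soft) >= tau # keep only near-unanimous votes
    return hard, confident

feat_labeled = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
concepts_labeled = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 1]]
feat_unlabeled = [[0.2, 0.2], [5.2, 5.2]]

hard, confident = knn_hyperedge_pseudo_labels(feat_labeled, concepts_labeled, feat_unlabeled)
print(hard.tolist(), confident.tolist())
```

Concepts that fail the confidence mask are exactly the ones that would need expert review, so the fraction of masked entries is one rough proxy for how much correction the premise above would still require.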
What would settle it
An ablation study in which removing either hypergraph component produces no measurable drop in accuracy or concept-level intervention quality on the PAS or breast ultrasound test sets.
Original abstract
Deep learning has revolutionized medical image analysis, delivering exceptional diagnostic accuracy across diverse applications. Yet, the lack of interpretability in its decision-making hinders clinical adoption, particularly in high-stakes medical contexts where transparency is paramount for trustworthiness. For example, in Placenta Accreta Spectrum (PAS), subtle cues in ultrasound imaging challenge reliable diagnosis, rendering black-box models untrustworthy for accurate scoring. To address this, Concept Bottleneck Models (CBMs) offer a promising avenue by embedding clinically meaningful intermediate concepts into the diagnosis pipeline, enabling clinicians to scrutinize and refine model outputs. However, conventional CBMs falter in capturing complex inter-concept dependencies and demand costly, expert-driven concept annotations, limiting their scalability. This study introduces a novel semi-supervised CBM framework designed for medical imaging, which leverages dual-level hypergraph learning to model high-order concept dependencies and generate domain-adaptive pseudo-labels. Our approach achieves superior interpretability and performance by integrating a concept-level hypergraph for enhanced reasoning and an image-level hypergraph for robust pseudo-label generation. Experiments on a newly annotated PAS ultrasound dataset and a breast ultrasound public dataset demonstrate the effectiveness of the proposed concept label-efficient interpretable framework. Its universality is further validated on the dermoscopic image dataset SkinCon. The code is available at https://github.com/scott-yjyang/HyperCBM.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a semi-supervised Concept Bottleneck Model (CBM) framework called HyperCBM that integrates dual-level hypergraph learning: a concept-level hypergraph to capture high-order inter-concept dependencies for enhanced reasoning, and an image-level hypergraph to generate domain-adaptive pseudo-labels. This aims to improve label efficiency, interpretability, and diagnostic performance in medical imaging tasks. Experiments are reported on a newly annotated Placenta Accreta Spectrum (PAS) ultrasound dataset, a public breast ultrasound dataset, and the SkinCon dermoscopic dataset, with code released at a GitHub repository.
Significance. If the dual-hypergraph components demonstrably improve both accuracy and concept-level interpretability without introducing unvalidated biases in pseudo-labels, the approach could meaningfully extend CBMs to label-scarce medical domains by reducing reliance on expert concept annotations while preserving clinical scrutability. The public code release supports reproducibility.
Minor comments (2)
- The abstract claims 'superior interpretability and performance' but provides no quantitative metrics, baselines, or ablation results; these should be summarized with effect sizes in the abstract for immediate assessment.
- The description of the 'newly annotated PAS ultrasound dataset' lacks any mention of annotation protocol, inter-rater agreement, or dataset statistics; this information is needed to evaluate the label-efficiency claim.
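Inter-rater agreement, requested in the second comment, is commonly reported as Cohen's kappa for two annotators. A minimal sketch follows. The example ratings are invented for illustration and are not taken from the PAS dataset.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical binary annotations of one concept on eight images.
rater_a = [1, 1, 0, 0, 1, 0, 1, 1]
rater_b = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(kappa)  # 0.75
```

Reporting kappa per concept, rather than one pooled value, would show which concepts are hard to annotate consistently.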
Simulated Author's Rebuttal
We thank the referee for their thoughtful summary of our work and for recognizing the potential significance of the dual-hypergraph CBM framework in label-scarce medical imaging domains. We are encouraged by the positive note on reproducibility via the public code release. The referee recommendation is listed as uncertain, but no specific major comments were provided in the report. We therefore have no point-by-point responses to address at this stage and would welcome any additional detailed feedback to strengthen the manuscript.
Circularity Check
No significant circularity detected
The provided abstract and description contain no equations, derivations, or load-bearing steps that reduce by construction to their inputs. The framework is described at a high level as integrating concept-level and image-level hypergraphs for semi-supervised learning, with effectiveness shown through experiments on three datasets. No self-definitional patterns, fitted inputs presented as predictions, or self-citation chains are evident. The central claims rest on empirical comparison against external benchmarks rather than on tautological reductions, so nothing in the material reviewed here is circular.