Learning Label-Efficient Interpretable Medical Image Diagnosis via Semi-supervised Hypergraph Concept Bottleneck Model

Angelica I Aviles-Rivero; Jing Qin; Lei Zhu; Lijie Hu; Ruiqiang Xiao; Yijun Yang; Yunzhu Wu

arxiv: 2606.01698 · v1 · pith:AYKOSPIHnew · submitted 2026-06-01 · 💻 cs.CV

Yijun Yang, Ruiqiang Xiao, Lijie Hu, Angelica I Aviles-Rivero, Yunzhu Wu, Jing Qin, Lei Zhu

Pith reviewed 2026-06-28 15:07 UTC · model grok-4.3

classification 💻 cs.CV

keywords concept bottleneck models · hypergraph learning · semi-supervised learning · medical image diagnosis · interpretability · ultrasound imaging · placenta accreta spectrum

The pith

A semi-supervised concept bottleneck model with dual-level hypergraphs improves interpretability and accuracy in medical image diagnosis using fewer expert labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a framework that embeds clinically meaningful concepts into deep learning pipelines for medical images while addressing two limits of standard concept bottleneck models: they miss complex dependencies among concepts, and they depend on costly expert concept annotations. It adds a concept-level hypergraph to capture high-order dependencies among concepts and an image-level hypergraph to produce reliable pseudo-labels from unlabeled data. Experiments on a new placenta accreta spectrum ultrasound dataset, a public breast ultrasound set, and a dermoscopic set show gains in both diagnostic performance and how readily clinicians can inspect and edit the reasoning steps.

Core claim

By combining a concept-level hypergraph for modeling inter-concept dependencies with an image-level hypergraph for domain-adaptive pseudo-label generation inside a semi-supervised concept bottleneck architecture, the model achieves higher accuracy and interpretability than prior CBMs while requiring substantially fewer manual concept annotations.

What carries the argument

Dual-level hypergraph learning, in which the concept-level hypergraph reasons over high-order concept relations and the image-level hypergraph generates robust pseudo-labels for unlabeled images.
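
The abstract gives no equations, so the following is only a sketch of what one propagation step over a concept-level hypergraph could look like. It uses the standard HGNN smoothing operator from Feng et al., "Hypergraph neural networks" (2019), without the learnable projection or nonlinearity, and is not confirmed to match the authors' formulation. The incidence matrix `H`, the feature matrix `X`, and the function name `hypergraph_propagate` are all hypothetical.

```python
from math import sqrt


def hypergraph_propagate(H, X, edge_w=None):
    """One parameter-free HGNN-style step: D_v^-1/2 H W D_e^-1 H^T D_v^-1/2 X.

    H is an n_nodes x n_edges 0/1 incidence matrix and X holds one feature
    row per node. Assumes every node sits in at least one hyperedge and no
    hyperedge is empty, so all degrees are positive.
    """
    n, m, d = len(H), len(H[0]), len(X[0])
    w = edge_w if edge_w is not None else [1.0] * m
    d_v = [sum(w[e] * H[v][e] for e in range(m)) for v in range(n)]  # node degrees
    d_e = [sum(H[v][e] for v in range(n)) for e in range(m)]  # hyperedge sizes
    # Scale node features by D_v^-1/2.
    Xs = [[X[v][k] / sqrt(d_v[v]) for k in range(d)] for v in range(n)]
    # Gather: each hyperedge averages its scaled member features, times its weight.
    E = [[w[e] / d_e[e] * sum(H[v][e] * Xs[v][k] for v in range(n))
          for k in range(d)] for e in range(m)]
    # Scatter: each node sums the features of its hyperedges, then rescales.
    return [[sum(H[v][e] * E[e][k] for e in range(m)) / sqrt(d_v[v])
             for k in range(d)] for v in range(n)]


# Hypothetical example: 4 concepts, 2 hyperedges (concept 1 belongs to both).
H = [[1, 0], [1, 1], [0, 1], [0, 1]]
X = [[1.0], [0.0], [0.0], [0.0]]  # one scalar feature per concept
smoothed = hypergraph_propagate(H, X)
```

Because a hyperedge can join any number of concepts, a single step mixes information among all members of a group at once, which is the sense in which a hypergraph models relations beyond pairwise edges.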

If this is right

Clinicians gain the ability to intervene on individual concepts while the model still accounts for their mutual dependencies.
New medical imaging tasks can be trained with far less expert time spent annotating intermediate concepts.
The same dual-hypergraph structure transfers across ultrasound and dermoscopic modalities without task-specific redesign.
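
To make the intervention point concrete, here is a minimal sketch of test-time intervention in a generic concept bottleneck head. It assumes a toy linear label predictor; the concept names, weights, and bias are invented for illustration and do not come from the paper. It also leaves out any propagation of the correction through a concept hypergraph, which a dependency-aware model would presumably add.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def predict_label(concepts, weights, bias):
    """Label probability computed from concept scores only (the bottleneck)."""
    return sigmoid(sum(w * c for w, c in zip(weights, concepts)) + bias)


def intervene(concepts, corrections):
    """Return a copy of the concept scores with clinician-supplied values substituted."""
    edited = list(concepts)
    for index, value in corrections.items():
        edited[index] = value
    return edited


# Hypothetical concepts: [lacunae, loss of retroplacental space, low placental location]
predicted = [0.1, 0.8, 0.6]
weights, bias = [2.5, 1.5, 0.5], -2.5  # made-up parameters of the label head

before = predict_label(predicted, weights, bias)
# A clinician marks lacunae (index 0) as present; the label is recomputed.
after = predict_label(intervene(predicted, {0: 1.0}), weights, bias)
```

The label is a function of the concept vector alone, so editing one concept changes the diagnosis in a way that can be traced back to that edit.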

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the pseudo-label mechanism proves stable across hospitals, the framework could lower the barrier to deploying interpretable models in additional high-stakes imaging domains.
The approach suggests a route to test whether explicit modeling of concept co-occurrence graphs improves calibration of uncertainty estimates in safety-critical settings.

Load-bearing premise

The hypergraph structures accurately reflect genuine clinical concept relationships and the pseudo-labels they generate remain reliable enough that they do not require extensive additional expert correction.
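
One way to picture the reliability concern is a sketch in which pseudo-labels come from a k-nearest-neighbour hypergraph over image features and are kept only when the vote is confident. This is an assumed, simplified stand-in, not the paper's pseudo-labeling module: the function names, the `margin` threshold, and the toy one-dimensional features are all hypothetical.

```python
from math import dist


def knn_hyperedges(features, k):
    """Build one hyperedge per image: the image plus its k nearest neighbours."""
    edges = []
    for i, f in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        nearest = sorted(others, key=lambda j: dist(f, features[j]))[:k]
        edges.append({i, *nearest})
    return edges


def pseudo_label(u, edges, labels, margin=0.4):
    """Vote over labelled images that share a hyperedge with image u.

    Returns 0 or 1 when the vote is at least `margin` away from 0.5,
    otherwise None (the image stays unlabelled).
    """
    votes = [labels[j] for edge in edges if u in edge for j in edge if j in labels]
    if not votes:
        return None
    score = sum(votes) / len(votes)
    return int(score >= 0.5) if abs(score - 0.5) >= margin else None


# Hypothetical 1-D features; images 0-3 carry expert labels for one binary concept.
features = [(0.0,), (0.1,), (1.0,), (1.1,), (0.05,), (0.55,)]
labels = {0: 0, 1: 0, 2: 1, 3: 1}
edges = knn_hyperedges(features, k=2)
accepted = {u: pseudo_label(u, edges, labels) for u in (4, 5)}
```

In this toy setup image 4 sits inside the negative cluster and receives a pseudo-label, while image 5 lies between the clusters and is left unlabelled. The premise above amounts to assuming that such a filter rejects most wrong labels on real ultrasound data.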

What would settle it

An ablation study in which removing either hypergraph component produces no measurable drop in accuracy or concept-level intervention quality on the PAS or breast ultrasound test sets.

Figures

Figures reproduced from arXiv: 2606.01698 by Angelica I Aviles-Rivero, Jing Qin, Lei Zhu, Lijie Hu, Ruiqiang Xiao, Yijun Yang, Yunzhu Wu.

**Figure 1.** Traditional methods degenerate in a semi-supervised spirit. The conventional CEM (a) and our HyperCBM (b) try to infer the PAS severity level from the predicted concepts. CEM illustrates three error modes: ignoring lacunae, misinterpreting the retroplacental space, and focusing on a biased placental location. These concept errors yield the wrong severity. Instead, HyperCBM successfully predicts severity fr…

**Figure 2.** Overview of HyperCBM, a hypergraph-driven semi-supervised concept bottleneck model for ultrasound imaging. The framework integrates Hypergraph-Enhanced Concept Representation Learning (HECRL) for high-order inter-concept modeling via adaptive hypergraph propagation, and Hypergraph Image Dynamic Pseudo-labeling (HIDP) for reliable pseudo-label generation.

**Figure 3.** Interpretability Visualization: (a) Concept saliency maps on the PAS dataset, highlighting learned concepts (e.g., placental lacunae, retroplacental space). (b) Concept saliency maps on the BrEaST dataset, capturing key diagnostic features (e.g., irregular shape, posterior features).

**Figure 4.** Test-time intervention results on the PAS dataset: (a) Any concept whose score exceeds the intervention threshold is forced to zero. This intervention causes a nearly monotonic degradation in diagnosis. (b) An example demonstration of test-time intervention, where correcting "Skin Thickening" shifts the prediction from benign to malignant, improving diagnosis results and demonstrating model applicability. …

Original abstract

Deep learning has revolutionized medical image analysis, delivering exceptional diagnostic accuracy across diverse applications. Yet, the lack of interpretability in its decision-making hinders clinical adoption, particularly in high-stakes medical contexts where transparency is paramount for trustworthiness. For example, in Placenta Accreta Spectrum (PAS), subtle cues in ultrasound imaging challenge reliable diagnosis, rendering black-box models untrustworthy for accurate scoring. To address this, Concept Bottleneck Models (CBMs) offer a promising avenue by embedding clinically meaningful intermediate concepts into the diagnosis pipeline, enabling clinicians to scrutinize and refine model outputs. However, conventional CBMs falter in capturing complex inter-concept dependencies and demand costly, expert-driven concept annotations, limiting their scalability. This study introduces a novel semi-supervised CBM framework designed for medical imaging, which leverages dual-level hypergraph learning to model high-order concept dependencies and generate domain-adaptive pseudo-labels. Our approach achieves superior interpretability and performance by integrating a concept-level hypergraph for enhanced reasoning and an image-level hypergraph for robust pseudo-label generation. Experiments on a newly annotated PAS ultrasound dataset and a breast ultrasound public dataset demonstrate the effectiveness of the proposed concept label-efficient interpretable framework. Its universality is further validated on the dermoscopic image dataset SkinCon. The code is available at https://github.com/scott-yjyang/HyperCBM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds dual hypergraphs to semi-supervised CBMs for label-efficient medical ultrasound diagnosis, but the abstract gives no equations or numbers to check whether the hypergraphs actually drive the gains.

The main takeaway is a semi-supervised CBM that uses one hypergraph over concepts to model higher-order dependencies and another over images to produce pseudo-labels. They apply it to placenta accreta spectrum ultrasound, a breast ultrasound dataset, and SkinCon, with code released.

The work does a reasonable job extending standard CBMs to the semi-supervised medical setting where expert concept labels are expensive. The dual-level hypergraph idea is a concrete technical step beyond the usual linear concept predictors, and testing on both a new annotated dataset and public ones shows some effort to demonstrate generality.

The soft spots are clear from the abstract alone. No equations or ablation tables are visible, so there is no evidence that the hypergraph components outperform simpler graph or attention baselines, or that the pseudo-labels avoid domain-specific errors. The interpretability claim rests on the CBM structure but lacks any reported concept accuracy or clinician validation. If the full paper has those controls, the central argument could hold; right now it cannot be verified.

This is for people working on interpretable models in medical imaging who already know CBMs and want to see a hypergraph variant. A reader looking for new first-principles ideas will find less here.

It is worth sending to peer review so the methods and results can be checked properly.

Referee Report

0 major / 2 minor

Summary. The paper proposes a semi-supervised Concept Bottleneck Model (CBM) framework called HyperCBM that integrates dual-level hypergraph learning: a concept-level hypergraph to capture high-order inter-concept dependencies for enhanced reasoning, and an image-level hypergraph to generate domain-adaptive pseudo-labels. This aims to improve label efficiency, interpretability, and diagnostic performance in medical imaging tasks. Experiments are reported on a newly annotated Placenta Accreta Spectrum (PAS) ultrasound dataset, a public breast ultrasound dataset, and the SkinCon dermoscopic dataset, with code released at a GitHub repository.

Significance. If the dual-hypergraph components demonstrably improve both accuracy and concept-level interpretability without introducing unvalidated biases in pseudo-labels, the approach could meaningfully extend CBMs to label-scarce medical domains by reducing reliance on expert concept annotations while preserving clinical scrutability. The public code release supports reproducibility.

minor comments (2)

The abstract claims 'superior interpretability and performance' but provides no quantitative metrics, baselines, or ablation results; these should be summarized with effect sizes in the abstract for immediate assessment.
The description of the 'newly annotated PAS ultrasound dataset' lacks any mention of annotation protocol, inter-rater agreement, or dataset statistics; this information is needed to evaluate the label-efficiency claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their thoughtful summary of our work and for recognizing the potential significance of the dual-hypergraph CBM framework in label-scarce medical imaging domains. We are encouraged by the positive note on reproducibility via the public code release. The referee recommendation is listed as uncertain, but no specific major comments were provided in the report. We therefore have no point-by-point responses to address at this stage and would welcome any additional detailed feedback to strengthen the manuscript.

Circularity Check

0 steps flagged

No significant circularity detected

The provided abstract and description contain no equations, derivations, or load-bearing steps that reduce by construction to inputs. The framework is described at a high level as integrating concept-level and image-level hypergraphs for semi-supervised learning, with effectiveness shown via experiments on datasets. No self-definitional patterns, fitted inputs called predictions, or self-citation chains are evident. The central claims rest on empirical results rather than tautological reductions, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the hypergraph structures are presented as methodological innovations rather than new physical entities.

[1] [1]

Chen, X., X. Wang, K. Zhang, et al. Recent advances and clinical applications of deep learning in medical image analysis.Medical image analysis, 79:102444, 2022

2022

[2] [2]

Siegel, D

Liu, T., E. Siegel, D. Shen. Deep learning and medical image analysis for covid-19 diagnosis and prediction.Annual review of biomedical engineering, 24(1):179–201, 2022

2022

[3] [3]

Zhou, S. K., H. Greenspan, C. Davatzikos, et al. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proceedings of the IEEE, 109(5):820–838, 2021

2021

[4] [4]

Yang, Y ., H. Fu, A. I. Aviles-Rivero, et al. Diffmic: Dual-guidance diffusion network for medical image classification. InInternational conference on medical image computing and computer-assisted intervention, pages 95–105. Springer, 2023

2023

[5] [5]

Diffmic-v2: Medical image classification via improved diffusion network.IEEE Transactions on Medical Imaging, 44(5):2244–2255, 2025

Yang, Y . Diffmic-v2: Medical image classification via improved diffusion network.IEEE Transactions on Medical Imaging, 44(5):2244–2255, 2025

2025

[6] [6]

Yang, Y ., S. Wang, L. Liu, et al. Mammodg: Generalisable deep learning breaks the limits of cross-domain multi-center breast cancer screening.arXiv preprint arXiv:2308.01057, 2023

work page arXiv 2023

[7] [7]

Gong, Z., S. Gao, B. Zhao, et al. Cect-mamba: a hierarchical contrast-enhanced-aware model for pancreatic tumor subtyping from multi-phase cect. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 1161–1171. 2025

2025

[8] [8]

Tjoa, E., C. Guan. A survey on explainable artificial intelligence (xai): Toward medical xai. IEEE transactions on neural networks and learning systems, 32(11):4793–4813, 2020

2020

[9] [9]

Explainability and artificial intelligence in medicine.The Lancet Digital Health, 4(4):e214–e215, 2022

Reddy, S. Explainability and artificial intelligence in medicine.The Lancet Digital Health, 4(4):e214–e215, 2022

2022

[10] [10]

Meier, S

Reyes, M., R. Meier, S. Pereira, et al. On the interpretability of artificial intelligence in radiology: challenges and opportunities.Radiology: artificial intelligence, 2(3):e190043, 2020. 10

2020

[11] [11]

Alizadehsani, U

Nasarian, E., R. Alizadehsani, U. R. Acharya, et al. Designing interpretable ml system to en- hance trust in healthcare: A systematic review to proposed responsible clinician-ai-collaboration framework.Information Fusion, page 102412, 2024

2024

[12] [12]

Yang, Y ., Z.-Y . Wang, Q. Liu, et al. Medical world model. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 8319–8329. 2025

2025

[13] [13]

Collins, G

Jauniaux, E., S. Collins, G. J. Burton. Placenta accreta spectrum: pathophysiology and evidence- based anatomy for prenatal ultrasound imaging.American journal of obstetrics and gynecology, 218(1):75–87, 2018

2018

[14] [14]

Forlani, C

Cali, G., F. Forlani, C. Lees, et al. Prenatal ultrasound staging system for placenta accreta spectrum disorders.Ultrasound in Obstetrics & Gynecology, 53(6):752–760, 2019

2019

[15] [15]

Ioannou, P

Sarris, I., C. Ioannou, P. Chamberlain, et al. Intra-and interobserver variability in fetal ultrasound measurements.Ultrasound in obstetrics & gynecology, 39(3):266–273, 2012

2012

[16] [16]

Cinque, A

Avola, D., L. Cinque, A. Fagioli, et al. Ultrasound medical imaging techniques: a survey.ACM Computing Surveys (CSUR), 54(3):1–38, 2021

2021

[17] [17]

Yang, Y ., Z. Xing, L. Yu, et al. Vivim: a video vision mamba for ultrasound video segmentation. IEEE Transactions on Circuits and Systems for Video Technology, 2025

2025

[18] [18]

Xu, H., Y . Yang, A. I. Aviles-Rivero, et al. Lgrnet: Local-global reciprocal network for uterine fibroid segmentation in ultrasound videos. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 667–677. Springer, 2024

2024

[19] [19]

Koh, P. W., T. Nguyen, Y . S. Tang, et al. Concept bottleneck models. InInternational conference on machine learning, pages 5338–5348. PMLR, 2020

2020

[20] [20]

Yuksekgonul, M., M. Wang, J. Zou. Post-hoc concept bottleneck models.arXiv preprint arXiv:2205.15480, 2022

work page arXiv 2022

[21] [21]

Kim, I., J. Kim, J. Choi, et al. Concept bottleneck with visual concept filtering for explainable medical image classification. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 225–233. Springer, 2023

2023

[22] [22]

Pang, W., X. Ke, S. Tsutsui, et al. Integrating clinical knowledge into concept bottleneck models. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 243–253. Springer, 2024

2024

[23] [23]

Chowdhury, T. F., V . M. H. Phan, K. Liao, et al. Adacbm: An adaptive concept bottleneck model for explainable and accurate diagnosis. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 35–45. Springer, 2024

2024

[24] [24]

Semi-supervised concept bottleneck models.arXiv preprint, 2024

Hu, L., T. Huang, H. Xie, et al. Semi-supervised concept bottleneck models.CoRR, abs/2406.18992, 2024

work page arXiv 2024

[25] [25]

Liu, S., S. Yin, L. Qu, et al. Reducing domain gap in frequency and spatial domain for cross- modality domain adaptation on medical image segmentation. InProceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pages 1719–1727. 2023

2023

[26] [26]

Li, H., Y . Wang, R. Wan, et al. Domain generalization for medical imaging classification with linear-dependency regularization.Advances in neural information processing systems, 33:3118–3129, 2020

2020

[27] [27]

Barbiero, G

Espinosa Zarlenga, M., P. Barbiero, G. Ciravegna, et al. Concept embedding models: Beyond the accuracy-explainability trade-off.Advances in Neural Information Processing Systems, 35:21400–21413, 2022

2022

[28] [28]

Tiwari, J

Chauhan, K., R. Tiwari, J. Freyberg, et al. Interactive concept bottleneck models. InProceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pages 5948–5955. 2023

2023

[29] [29]

Oikarinen, T., S. Das, L. M. Nguyen, et al. Label-free concept bottleneck models.arXiv preprint arXiv:2304.06129, 2023. 11

[30]

Lai, S., L. Hu, J. Wang, et al. Faithful vision-language interpretation via concept bottleneck models. In The Twelfth International Conference on Learning Representations. 2023.

[31]

Magister, L. C., D. Kazhdan, V. Singh, et al. Gcexplainer: Human-in-the-loop concept-based explanations for graph neural networks. arXiv preprint arXiv:2107.11889, 2021.

[32]

Barbiero, P., F. Giannini, G. Ciravegna, et al. Relational concept bottleneck models. Advances in Neural Information Processing Systems, 37:77663–77685, 2024.

[33]

Havasi, M., S. Parbhoo, F. Doshi-Velez. Addressing leakage in concept bottleneck models. Advances in Neural Information Processing Systems, 35:23386–23397, 2022.

[34]

Kim, E., D. Jung, S. Park, et al. Probabilistic concept bottleneck models. arXiv preprint arXiv:2306.01574, 2023.

[35]

Sheth, I., S. Ebrahimi Kahou. Auxiliary losses for learning generalizable concept-based models. Advances in Neural Information Processing Systems, 36:26966–26990, 2023.

[36]

Kamraoui, R. A., V.-T. Ta, N. Papadakis, et al. Popcorn: Progressive pseudo-labeling with consistency regularization and neighboring. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part II 24, pages 373–382. Springer, 2021.

[37]

Li, Y., J. Chen, X. Xie, et al. Self-loop uncertainty: A novel pseudo-label for semi-supervised medical image segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part I 23, pages 614–623. Springer, 2020.

[38]

Wu, H., Y. Yang, A. I. Aviles-Rivero, et al. Semi-supervised video desnowing network via temporal decoupling experts and distribution-driven contrastive regularization. In European Conference on Computer Vision, pages 70–89. Springer, 2024.

[39]

Liu, X., Y. Yang, Y. Xu, et al. Autoregressive-conditioned diffusion for semi-supervised thyroid ultrasound segmentation with optical flow-based pseudo labels. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1340–1350. 2025.

[40]

Gu, Y., T. Zhou, Y. Zhang, et al. Dual-scale enhanced and cross-generative consistency learning for semi-supervised medical image segmentation. Pattern Recognition, 158:110962, 2025.

[41]

Xiao, H., Y. Wang, S. Xiong, et al. Cuamt: A mri semi-supervised medical image segmentation framework based on contextual information and mixed uncertainty. Computer Methods and Programs in Biomedicine, page 108755, 2025.

[42]

Berthelot, D., N. Carlini, I. Goodfellow, et al. Mixmatch: A holistic approach to semi-supervised learning. Advances in Neural Information Processing Systems, 32, 2019.

[43]

Sohn, K., D. Berthelot, N. Carlini, et al. Fixmatch: Simplifying semi-supervised learning with consistency and confidence. Advances in Neural Information Processing Systems, 33:596–608, 2020.

[44]

Deng, X., H. Wu, R. Zeng, et al. Memsam: Taming segment anything model for echocardiography video segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9622–9631. 2024.

[45]

Aviles-Rivero, A. I., N. Papadakis, R. Li, et al. Graphx net-chest x-ray classification under extreme minimal supervision. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 504–512. Springer, 2019.

[46]

Unnikrishnan, B., C. Nguyen, S. Balaram, et al. Semi-supervised classification of radiology images with noteacher: A teacher that is not mean. Medical Image Analysis, 73:102148, 2021.

[47]

Zhu, X. Semi-supervised learning with graphs. Carnegie Mellon University, 2005.

[48]

Chong, Y., Y. Ding, Q. Yan, et al. Graph-based semi-supervised learning: A review. Neurocomputing, 408:216–230, 2020.

[49]

Song, Z., X. Yang, Z. Xu, et al. Graph-based semi-supervised learning: A comprehensive review. IEEE Transactions on Neural Networks and Learning Systems, 34(11):8174–8194, 2022.

[50]

Gao, Y., M. Wang, D. Tao, et al. 3-d object retrieval and recognition with hypergraph analysis. IEEE Transactions on Image Processing, 21(9):4290–4303, 2012.

[51]

Huang, Y., Q. Liu, D. Metaxas. Video object segmentation by hypergraph cut. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 1738–1745. IEEE, 2009.

[52]

Han, Y., P. Wang, S. Kundu, et al. Vision hgnn: An image is more than a graph of nodes. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 19878–19888. 2023.

[53]

Srinivas, S. S., R. K. Sarkar, S. Gangasani, et al. Vision hgnn: An electron-micrograph is worth hypergraph of hypernodes. arXiv preprint arXiv:2408.11351, 2024.

[54]

Fixelle, J. Hypergraph vision transformers: Images are more than nodes, more than edges. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 9751–9761. 2025.

[55]

Gao, Y., Y. Feng, S. Ji, et al. Hgnn+: General hypergraph neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3):3181–3199, 2022.

[56]

Feng, Y., H. You, Z. Zhang, et al. Hypergraph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pages 3558–3565. 2019.

[57]

Feng, Y., J. Huang, S. Du, et al. Hyper-yolo: When visual object detection meets hypergraph computation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.

[58]

Pawłowska, A., A. Ćwierz-Pieńkowska, A. Domalik, et al. Curated benchmark dataset for ultrasound based breast lesion analysis. Scientific Data, 11(1):148, 2024.

[59]

Daneshjou, R., M. Yuksekgonul, Z. R. Cai, et al. Skincon: A skin disease dataset densely annotated by domain experts for fine-grained debugging and analysis. Advances in Neural Information Processing Systems, 35:18157–18167, 2022.

[60]

Wang, H., J. Hou, H. Chen. Concept complement bottleneck model for interpretable medical image diagnosis. arXiv preprint arXiv:2410.15446, 2024.

[61]

Groh, M., C. Harris, L. Soenksen, et al. Evaluating deep neural networks trained on clinical images in dermatology with the fitzpatrick 17k dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1820–1828. 2021.

[62]

He, K., X. Zhang, S. Ren, et al. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778. 2016.

[63]

Selvaraju, R. R., M. Cogswell, A. Das, et al. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626. 2017.
