T-DuMpRa: Teacher-guided Dual-path Multi-prototype Retrieval Augmented framework for fine-grained medical image classification
Pith reviewed 2026-05-10 06:22 UTC · model grok-4.3
The pith
A teacher-guided dual-path framework with multi-prototype retrieval and confidence-gated fusion improves accuracy on visually ambiguous medical images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The T-DuMpRa framework jointly optimizes discriminative classification and multi-prototype retrieval during training by using an EMA teacher to build a clustered memory bank in embedding space, then at inference fuses the classifier distribution with cosine similarity to the prototypes through a conservative confidence gate that activates retrieval solely when the base prediction is uncertain and the retrieval evidence is decisive and conflicting.
What carries the argument
The confidence-gated fusion mechanism, which selectively combines the base classifier output with cosine-similarity scores against a multi-prototype memory bank built from EMA teacher embeddings, and which activates only when the classifier is uncertain and the retrieval evidence is both decisive and conflicting.
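To make the three gate conditions concrete, here is a minimal sketch of gated fusion in plain Python. It is not the authors' implementation: the thresholds `tau_cls` and `tau_ret`, the mixing weight `alpha`, the softmax temperature, the per-class max-over-prototypes scoring, and the convex-combination fusion rule are all illustrative assumptions, since the abstract does not specify them.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieval_distribution(embedding, prototypes, temperature=0.1):
    """prototypes: dict {class_index: [prototype vectors]}, keys 0..C-1.

    Assumption: a class is scored by its best-matching prototype.
    """
    scores = [max(cosine(embedding, p) for p in prototypes[c])
              for c in sorted(prototypes)]
    return softmax(scores, temperature)

def gated_fusion(p_cls, p_ret, tau_cls=0.6, tau_ret=0.8, alpha=0.5):
    """Return (distribution, gate_fired). All thresholds are hypothetical."""
    c_cls = max(range(len(p_cls)), key=p_cls.__getitem__)
    c_ret = max(range(len(p_ret)), key=p_ret.__getitem__)
    uncertain = p_cls[c_cls] < tau_cls      # base prediction is not confident
    decisive = p_ret[c_ret] >= tau_ret      # retrieval strongly favors one class
    conflicting = c_cls != c_ret            # the two paths disagree
    if uncertain and decisive and conflicting:
        fused = [(1 - alpha) * a + alpha * b for a, b in zip(p_cls, p_ret)]
        return fused, True
    return list(p_cls), False               # confident predictions pass through
```

With these placeholder settings, a classifier output of `[0.45, 0.40, 0.15]` paired with retrieval evidence `[0.05, 0.90, 0.05]` fires the gate and shifts the prediction to class 1, whereas a confident output such as `[0.90, 0.05, 0.05]` is returned unchanged.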
If this is right
- The framework can be attached to any existing backbone by adding a compact prototype bank without retraining the original model from scratch.
- Joint cross-entropy and contrastive training produces embeddings that support both classification and reliable prototype matching.
- The EMA teacher supplies smoother representations that enable stable clustering into multiple prototypes per class.
- The conservative gate leaves confident correct predictions untouched while targeting only the ambiguous subset.
- Visualization of activation patterns confirms the method focuses retrieval on visually similar inter-class examples.
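The EMA teacher and the clustered prototype bank mentioned in the list above can be sketched as follows. This is a hedged illustration only: the abstract does not name the clustering algorithm, so spherical k-means is a stand-in, and the decay value, the number of prototypes `k`, the deterministic initialization, and all function names are hypothetical.

```python
import math

def ema_update(teacher, student, decay=0.99):
    """teacher <- decay * teacher + (1 - decay) * student, per parameter."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def spherical_kmeans(vectors, k, iters=10):
    """Cluster unit vectors by cosine similarity; returns k unit centers."""
    points = [l2_normalize(v) for v in vectors]
    centers = points[:k]  # deterministic init, chosen only for this sketch
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            sims = [sum(a * b for a, b in zip(p, c)) for c in centers]
            groups[sims.index(max(sims))].append(p)
        new_centers = []
        for old, members in zip(centers, groups):
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                new_centers.append(l2_normalize(mean))
            else:
                new_centers.append(old)  # keep an empty cluster's old center
        centers = new_centers
    return centers

def build_prototype_bank(teacher_embeddings, labels, k=2):
    """Group teacher embeddings by class, then cluster each class separately."""
    by_class = {}
    for emb, y in zip(teacher_embeddings, labels):
        by_class.setdefault(y, []).append(emb)
    return {y: spherical_kmeans(embs, min(k, len(embs)))
            for y, embs in by_class.items()}
```

Keeping several unit-norm centers per class, rather than a single class mean, is what lets one class cover visually distinct sub-modes while still being matched by cosine similarity at inference.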
Where Pith is reading between the lines
- The selective activation logic could be tested on other fine-grained domains such as plant species or product variants where uncertainty also signals visual overlap.
- Replacing the fixed prototype bank with an online-updating version might allow the method to adapt to distribution shift without full retraining.
- Varying the uncertainty and conflict thresholds per dataset could reveal whether the reported gains are conservative or near-optimal.
- The dual-path training might be extended by adding a third path that learns to predict when retrieval will be helpful, turning the gate into a learned component.
Load-bearing premise
The gated fusion will activate retrieval exactly when it resolves ambiguity without introducing errors on predictions that are already correct but uncertain.
What would settle it
On the HAM10000 or ISIC2019 test sets, identify the subset of cases where the base classifier is uncertain yet correct, apply the fusion unconditionally, and check whether accuracy falls relative to the base classifier alone.
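The check described above could be scripted roughly as follows. This is a sketch under stated assumptions: `tau_cls` defines "uncertain", fusion is a convex combination with weight `alpha`, and the inputs are per-sample probability lists. None of these details come from the paper. Because the subset is correct by construction, the base accuracy on it is 1.0, so any fused accuracy below 1.0 measures the harm done by ungated fusion.

```python
def argmax(p):
    """Index of the largest entry in a list of probabilities."""
    return max(range(len(p)), key=p.__getitem__)

def harm_on_uncertain_correct(p_cls_list, p_ret_list, labels,
                              tau_cls=0.6, alpha=0.5):
    """Apply fusion unconditionally on the uncertain-yet-correct subset."""
    subset = [
        (pc, pr, y)
        for pc, pr, y in zip(p_cls_list, p_ret_list, labels)
        if max(pc) < tau_cls and argmax(pc) == y   # uncertain AND correct
    ]
    if not subset:
        return {"n": 0, "fused_accuracy": None}
    still_correct = 0
    for pc, pr, y in subset:
        fused = [(1 - alpha) * a + alpha * b for a, b in zip(pc, pr)]
        still_correct += argmax(fused) == y
    return {"n": len(subset), "fused_accuracy": still_correct / len(subset)}
```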
Original abstract
Fine-grained medical image classification is challenged by subtle inter-class variations and visually ambiguous cases, where confidence estimates often exhibit uncertainty rather than being overconfident. In such scenarios, purely discriminative classifiers may achieve high overall accuracy yet still fail to distinguish between highly similar categories, leading to miscalibrated predictions. We propose T-DuMpRa, a teacher-guided dual-path multi-prototype retrieval-augmented framework, where discriminative classification and multi-prototype retrieval jointly drive both training and prediction. During training, we jointly optimize cross-entropy and supervised contrastive objectives to learn a cosine-compatible embedding geometry for reliable prototype matching. We further employ an exponential moving average (EMA) teacher to obtain smoother representations and build a multi-prototype memory bank by clustering teacher embeddings in the teacher embedding space. Our framework is plug-and-play: it can be easily integrated into existing classification models by constructing a compact prototype bank, thereby improving performance on visually ambiguous cases. At inference, we combine the classifier's predicted distribution with a similarity-based distribution computed via cosine matching to prototypes, and apply a conservative confidence-gated fusion that activates retrieval only when the classifier's prediction is uncertain and the retrieval evidence is decisive and conflicting, otherwise keeping confident predictions unchanged. On HAM10000 and ISIC2019, our method yields 0.68%-0.21% and 0.44%-2.69% improvements on 5 different backbones. And visualization analysis proves our model can enhance the model's ability to handle visually ambiguous cases.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces T-DuMpRa, a teacher-guided dual-path multi-prototype retrieval-augmented framework for fine-grained medical image classification. It jointly trains a classifier with cross-entropy and supervised contrastive losses, uses an EMA teacher to build a multi-prototype memory bank from clustered embeddings, and at inference fuses the classifier's distribution with a prototype similarity distribution using a conservative confidence gate that only activates retrieval for uncertain and conflicting cases. The authors claim small but consistent improvements on HAM10000 (0.21-0.68%) and ISIC2019 (0.44-2.69%) across five backbones, with visualizations suggesting better handling of ambiguous cases.
Significance. If validated, this work could offer a lightweight, plug-and-play method to boost performance of standard backbones on medical datasets with high visual similarity between classes. The conservative gating strategy is a strength, since it guards against degradation on easy cases. The gains are modest, so the significance lies in providing a practical tool rather than a breakthrough in accuracy.
major comments (3)
- Abstract: The reported performance improvements are given as ranges without specifying per-backbone results, statistical significance, or number of runs, which is critical to evaluate if the gains are reliable and attributable to the proposed fusion mechanism rather than training variations.
- Inference mechanism (as described in abstract): The confidence-gated fusion is presented qualitatively without quantitative analysis of activation frequency, false positive rate on non-ambiguous cases, or ablation removing the gate; this directly impacts whether the central claim that retrieval augmentation enhances ambiguous case handling holds.
- Method description: No ablation studies are described to separate the contributions of the joint training objectives, EMA teacher, and the inference-time fusion, making it difficult to confirm that the dual-path aspect is responsible for the observed improvements on the two datasets.
minor comments (2)
- Abstract: The improvement range is written as '0.68%-0.21%', which is a non-standard ordering and is unclear; the authors should clarify whether this is the range across backbones or something else.
- The paper would benefit from including the exact values of free parameters such as EMA decay rate and number of prototypes per class in the main text or appendix for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help us improve the clarity and rigor of the manuscript. We address each major comment point by point below, indicating the revisions we will incorporate.
-
Referee: Abstract: The reported performance improvements are given as ranges without specifying per-backbone results, statistical significance, or number of runs, which is critical to evaluate if the gains are reliable and attributable to the proposed fusion mechanism rather than training variations.
Authors: We agree that the abstract summary could be more precise. The ranges (0.21-0.68% on HAM10000 and 0.44-2.69% on ISIC2019) are used for brevity to convey the consistent gains across backbones. Detailed per-backbone results are already provided in Tables 1 and 2 of the main text. In the revised manuscript, we will update the abstract to explicitly note that experiments were run with fixed random seeds for reproducibility and to reference the per-backbone values and any variance reported in the tables. This will allow readers to better assess reliability without lengthening the abstract excessively. revision: yes
-
Referee: Inference mechanism (as described in abstract): The confidence-gated fusion is presented qualitatively without quantitative analysis of activation frequency, false positive rate on non-ambiguous cases, or ablation removing the gate; this directly impacts whether the central claim that retrieval augmentation enhances ambiguous case handling holds.
Authors: The abstract necessarily presents the gating strategy at a high level. The full manuscript includes qualitative visualizations and case studies showing improved handling of ambiguous examples. We acknowledge that quantitative support would strengthen the central claim. In the revision, we will add: (1) the percentage of test samples where the gate activates, (2) an analysis of false-positive activations (cases where the gate triggers but the classifier prediction was correct), and (3) an ablation comparing performance with the gate disabled. These additions will be placed in the experimental or analysis section. revision: yes
-
Referee: Method description: No ablation studies are described to separate the contributions of the joint training objectives, EMA teacher, and the inference-time fusion, making it difficult to confirm that the dual-path aspect is responsible for the observed improvements on the two datasets.
Authors: The current manuscript emphasizes the integrated framework and its overall results. We agree that component-wise ablations would help isolate contributions and confirm the value of the dual-path design. In the revised version, we will add ablation experiments that separately evaluate: (i) cross-entropy only versus joint cross-entropy + supervised contrastive loss, (ii) prototype bank construction with versus without the EMA teacher, and (iii) inference with versus without the gated fusion. These will be reported on both datasets to directly address the concern. revision: yes
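The gate diagnostics promised in the second response (activation frequency, false-positive activations, and a with/without-gate comparison) might be computed along these lines. The function name, the input layout, and the choice to normalize false positives by the number of activations are assumptions made for illustration. The rebuttal does not define the exact metric.

```python
def gate_diagnostics(fired, base_pred, gated_pred, labels):
    """Summarize how often the gate fires and whether it helps.

    fired:      list of bools, True where the gate activated
    base_pred:  classifier-only predicted labels
    gated_pred: predicted labels after gated fusion
    labels:     ground-truth labels
    """
    n = len(labels)
    n_fired = sum(fired)
    # Activations on samples the base classifier already had right.
    false_pos = sum(f and b == y for f, b, y in zip(fired, base_pred, labels))
    return {
        "activation_rate": n_fired / n,
        "false_positive_share": false_pos / n_fired if n_fired else 0.0,
        "base_accuracy": sum(b == y for b, y in zip(base_pred, labels)) / n,
        "gated_accuracy": sum(g == y for g, y in zip(gated_pred, labels)) / n,
    }
```

Reporting `base_accuracy` next to `gated_accuracy` doubles as the gate ablation: a high `false_positive_share` combined with no accuracy gain would suggest the gate fires mostly where it is not needed.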
Circularity Check
No circularity: empirical framework with independent experimental validation
The paper describes a plug-and-play empirical architecture (joint CE + supervised contrastive training, EMA teacher for prototype bank construction, and conservative confidence-gated fusion at inference) whose performance claims are presented as measured improvements on HAM10000 and ISIC2019 across backbones, supported by visualization. No mathematical derivation chain exists that reduces a claimed prediction or result to its own inputs by construction; there are no equations shown that equate fitted parameters to outputs, no self-definitional loops, and no load-bearing self-citations or uniqueness theorems invoked to force the method. The reported gains and ambiguity-handling claims rest on external dataset evaluation rather than tautological re-expression of training objectives.
Axiom & Free-Parameter Ledger
free parameters (3)
- EMA decay rate
- Number of prototypes per class
- Gating thresholds
axioms (2)
- domain assumption: Joint optimization of cross-entropy and supervised contrastive losses yields cosine-compatible embeddings
- domain assumption: Clustering in teacher embedding space produces useful multi-prototypes for retrieval
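For reference, the first assumption concerns a joint objective that would plausibly take the form below. The supervised contrastive term is written in its standard form. The weight \lambda, the temperature \tau, and the simple additive combination are assumptions, since the abstract says only that the two objectives are optimized jointly.

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{CE}} \;+\; \lambda\,\mathcal{L}_{\mathrm{SupCon}},
\qquad
\mathcal{L}_{\mathrm{SupCon}} \;=\; \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)}
\log \frac{\exp\!\left(z_i \cdot z_p / \tau\right)}
          {\sum_{a \in A(i)} \exp\!\left(z_i \cdot z_a / \tau\right)}
```

Here z_i is the L2-normalized embedding of sample i, A(i) is the set of all other samples in the batch, and P(i) is the subset of A(i) sharing the label of i. Because the embeddings are unit-norm, z_i \cdot z_p is a cosine similarity, which is why this term is expected to produce a geometry compatible with cosine matching against prototypes.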