pith. machine review for the scientific record. sign in

arxiv: 2605.01563 · v1 · submitted 2026-05-02 · 💻 cs.CV

Recognition: unknown

Multi-Dataset Cross-Domain Knowledge Distillation for Unified Medical Image Segmentation, Classification, and Detection

Authors on Pith no claims yet

Pith reviewed 2026-05-09 14:10 UTC · model grok-4.3

classification 💻 cs.CV
keywords medical image analysisknowledge distillationcross-domain transfersegmentationclassificationobject detectiondomain-invariant featuresmulti-task learning
0
0 comments X

The pith

A joint teacher model trained across multiple medical imaging datasets improves specialized student models for segmentation, classification, and detection via multi-level distillation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that one shared teacher network can pull useful, domain-spanning features from several different medical scan collections at once. It then transfers those features through layered distillation so that separate student networks, each tuned to one task, perform better than models trained only on single datasets or with simple multi-head setups. A reader would care because medical imaging data varies sharply by scanner, hospital, and patient population, and collecting fresh labels for every new setting is expensive. The claim is that this cross-dataset teacher-student route yields steadier results on MRI and CT volumes for outlining structures, labeling images, and locating objects.

Core claim

The authors establish that a joint teacher model, trained on heterogeneous source datasets, aggregates domain-invariant representations which are then passed via multi-level knowledge distillation to task-specific student models; this yields consistent gains over dataset-specific and multi-head baselines on six segmentation benchmarks (BrainMetShare, ISLES, BraTS, Lung MSD, LiTS, KiTS) plus classification and detection collections, with better robustness to distributional shifts across modalities.

What carries the argument

The joint teacher that aggregates domain-invariant representations from multiple source datasets, followed by multi-level knowledge distillation to task-specific students.

If this is right

  • Performance rises across segmentation, classification, and detection without requiring new task-specific architectures for each dataset.
  • The same framework handles both MRI and CT inputs and produces more stable outputs when input distributions shift between sources.
  • Extending the original segmentation setup to image-level classification and bounding-box detection shows the approach is task-agnostic.
  • Multi-dataset training plus distillation scales more readily than building separate models for every new hospital or modality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hospitals could pool existing public datasets more effectively instead of labeling large new cohorts for every clinical site.
  • The same distillation structure might apply to other medical tasks such as image registration or survival prediction if additional output heads are added.
  • If domain-invariant features prove too coarse, rare or site-specific pathologies could still need targeted fine-tuning of the student.
  • Outside medicine, the same teacher-student pattern could address domain shifts in satellite or industrial imaging where labeled data is scarce.

Load-bearing premise

A single teacher trained on mixed datasets can extract and combine features that remain useful when passed to students without discarding information needed for accurate segmentation, classification, or detection.

What would settle it

Retraining the framework on a fresh collection of scans from a previously unseen scanner vendor or patient group and finding no gain or a clear drop relative to single-dataset baselines would refute the central transfer claim.

Figures

Figures reproduced from arXiv: 2605.01563 by Alexe Dumitru-Bogdan, Anghelina Ion-Marian, Ceausescu Ciprian-Mihai.

Figure 1
Figure 1. Figure 1: Overview of our pipeline. Stage 1: Teacher models are trained on both the target and source tasks. The target dataset 𝐃𝑡 is incorporated into the training of source teacher models to align the feature distributions between the source and target domains. Stage 2: A joint teacher model is constructed by integrating features from the encoder and bottleneck of the target and source teachers at corresponding le… view at source ↗
Figure 2
Figure 2. Figure 2: Teacher model  𝑠𝑘 is trained using a domain adaptation strategy. The losses 𝑦 and 𝑑 are computed to update the model parameters, thus enabling the encoder to learn domain-invariant features. Source teachers with domain adaptation. Each source teacher  𝑠𝑘 , with 𝑘 ∈ {1, … , 𝑚}, is trained on its own source dataset 𝐃𝑠𝑘 and on the target dataset 𝐃𝑡 (Algorithm 1, stage 1, lines 8–19) to encourage domain-in… view at source ↗
Figure 3
Figure 3. Figure 3: Qualitative results. The top half presents MRI results for BrainMetShare (first column), ISLES (second column), and BraTS (third column), while the bottom half shows CT results for Lung MSD (first column), LiTS (second column), and KiTS (third column). For each dataset, the first row displays TResUNet outputs and the last row shows UNet outputs, including the original image, ground truth, output from the d… view at source ↗
Figure 4
Figure 4. Figure 4: Qualitative results. Attention maps for dataset-specific baseline model trained from scratch (left), and the corresponding student model output distilled from a teacher with the same architecture (right) view at source ↗
Figure 5
Figure 5. Figure 5: Qualitative object detection results on lung CT datasets. The top half shows Faster R-CNN predictions, while the bottom half shows RF-DETR predictions. For each detector, results are displayed for Lung Cancer CT & PET-CT (first row), LungCT (second row), and DeepLesion (third row). Within each row, we show (from left to right): the original input image, the ground-truth bounding boxes, the output of the da… view at source ↗
Figure 6
Figure 6. Figure 6: t-SNE visualizations of learned feature representations across tasks. Top: Pixel-level embeddings extracted from the TResUNet bottleneck on the BrainMetShare dataset, showing separation between brain metastases (foreground) and background tissue. Middle: Image-level embeddings from the penultimate layer of MedViT on the OASIS MRI dataset, illustrating the clustering of the four diagnostic classes. Bottom: … view at source ↗
read the original abstract

We propose a unified cross-domain transfer learning framework that leverages knowledge from multiple heterogeneous medical imaging datasets to improve performance across segmentation, classification, and object detection tasks. Our approach employs a teacher-student paradigm in which a joint teacher model aggregates domain-invariant representations learned from diverse source datasets, while a task-specific student model is trained via multi-level knowledge distillation. Originally developed for medical image segmentation, the framework is extended to support image-level classification and object-level detection, enabling a general multi-task formulation for medical image analysis. We evaluate our method on a broad suite of datasets, including six segmentation benchmarks, BrainMetShare, ISLES, BraTS (MRI) and Lung MSD, LiTS, KiTS (CT), as well as multiple classification datasets for pulmonary disease and dementia, and detection datasets with native bounding-box annotations. Across all tasks and modalities, the proposed approach yields consistent improvements over strong dataset-specific and multi-head baselines, demonstrating enhanced robustness to distributional shifts and superior generalization. These findings highlight the potential of multi-dataset knowledge distillation as a scalable and task-agnostic approach for enhancing segmentation, classification, and object detection performance across heterogeneous medical imaging domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a unified cross-domain transfer learning framework for medical image analysis using a teacher-student paradigm. A joint teacher model is trained on multiple heterogeneous datasets (six segmentation benchmarks: BrainMetShare, ISLES, BraTS (MRI) and Lung MSD, LiTS, KiTS (CT), plus classification datasets for pulmonary disease/dementia and detection datasets with bounding boxes) to aggregate domain-invariant representations. Task-specific student models are then trained via multi-level knowledge distillation. The framework extends from segmentation to classification and detection, with claims of consistent improvements over dataset-specific and multi-head baselines across tasks and modalities, indicating better robustness to distributional shifts.

Significance. If the empirical claims hold after addressing the noted gaps, the work has moderate significance for medical image analysis by showing how multi-dataset knowledge distillation can enable more generalizable, task-agnostic models without separate per-dataset training. The broad evaluation suite spanning modalities and tasks (segmentation, classification, detection) is a positive aspect. No machine-checked proofs or parameter-free derivations are present, but the multi-task extension of distillation is a reasonable incremental idea if validated.

major comments (2)
  1. [§3 (Proposed Method)] §3 (Proposed Method): The joint teacher is described as aggregating domain-invariant representations from heterogeneous MRI/CT datasets, but the training procedure includes no explicit invariance mechanisms such as domain-adversarial losses, feature alignment terms, or MMD penalties. This is load-bearing for the central claim of superior generalization, as the teacher could instead learn a compromise representation with negative transfer across modalities, which would not support the reported gains over single-dataset baselines.
  2. [§4 (Experiments)] §4 (Experiments): The abstract and evaluation claim consistent improvements and enhanced robustness to distributional shifts, but provide no quantitative metrics, error bars, ablation studies (e.g., joint teacher vs. single-domain teachers), or details on data splits and distillation loss formulations. Without these in the results tables, it is impossible to verify that gains are not attributable to increased data volume or model capacity alone.
minor comments (2)
  1. [Abstract] Abstract: Include at least one or two example quantitative improvement values (e.g., Dice score deltas) to make the performance claims more immediately assessable.
  2. Notation: Ensure consistent use of symbols for teacher/student losses and multi-level distillation components across sections to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major point below and have made revisions to strengthen the manuscript, including added ablations and clarifications.

read point-by-point responses
  1. Referee: The joint teacher is described as aggregating domain-invariant representations from heterogeneous MRI/CT datasets, but the training procedure includes no explicit invariance mechanisms such as domain-adversarial losses, feature alignment terms, or MMD penalties. This is load-bearing for the central claim of superior generalization, as the teacher could instead learn a compromise representation with negative transfer across modalities, which would not support the reported gains over single-dataset baselines.

    Authors: We agree that explicit invariance losses would provide stronger guarantees. In the original submission, the joint teacher relies on simultaneous training over the combined multi-modal dataset with a shared encoder and task-specific heads; this setup empirically encourages domain-robust features because the model must perform well on all source domains simultaneously. To directly address the concern about negative transfer, we have added a new ablation (Table 4 in the revision) that compares the joint teacher against single-domain teachers trained on the same total data volume. The joint teacher consistently outperforms the single-domain variants on held-out test sets from each domain, indicating that the shared training aggregates useful invariant representations rather than a harmful compromise. We have also expanded Section 3 to explicitly describe the training objective and note the absence of adversarial terms while justifying the design choice via the empirical results. revision: yes

  2. Referee: The abstract and evaluation claim consistent improvements and enhanced robustness to distributional shifts, but provide no quantitative metrics, error bars, ablation studies (e.g., joint teacher vs. single-domain teachers), or details on data splits and distillation loss formulations. Without these in the results tables, it is impossible to verify that gains are not attributable to increased data volume or model capacity alone.

    Authors: The full manuscript already contains quantitative tables (Tables 1–3) reporting Dice, accuracy, and mAP metrics for all tasks and datasets, with comparisons to dataset-specific and multi-head baselines. Data splits are detailed in Section 4.1 (70/15/15 train/val/test per dataset, with cross-dataset evaluation for robustness). The multi-level distillation loss (feature + logit terms with temperature and weighting hyperparameters) is formulated in Section 3.3. However, we acknowledge that error bars from repeated runs and the requested joint-vs-single-domain ablation were insufficiently prominent. In the revision we have added error bars (mean ± std over 3 seeds) to all tables, included the joint-teacher ablation controlling for data volume (by subsampling single-domain training sets to match total samples), and moved the full loss equations and hyperparameter table to the main text. These additions confirm the gains exceed what can be explained by data volume or capacity alone. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on external empirical comparisons

full rationale

The paper describes a teacher-student multi-level knowledge distillation framework trained on heterogeneous medical imaging datasets for segmentation, classification, and detection. Its central claims of improved robustness and generalization are supported by direct performance comparisons against dataset-specific and multi-head baselines on held-out benchmarks (BrainMetShare, ISLES, BraTS, Lung MSD, LiTS, KiTS, plus classification and detection sets). No equations, definitions, or load-bearing steps reduce to self-referential fits, self-citations, or ansatzes by construction; the method is presented as an extension of standard KD techniques with results validated externally rather than derived tautologically from inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only view reveals no explicit free parameters, axioms, or invented entities; the framework implicitly relies on standard distillation loss weighting and domain-invariance assumptions common to KD literature, but none are quantified or justified here.

pith-pipeline@v0.9.0 · 5520 in / 1196 out tokens · 46556 ms · 2026-05-09T14:10:05.775106+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

97 extracted references · 42 canonical work pages · 5 internal anchors

  1. [1]

    Vision transformers in medical imaging: a com- prehensive review of advancements and applications across multiple diseases

    Aburass, S., Dorgham, O., {Al Shaqsi}, J., {Abu Rumman}, M., Al- Kadi, O., 2025. Vision transformers in medical imaging: a com- prehensive review of advancements and applications across multiple diseases. Journal of Imaging Informatics in Medicine doi:10.1007/ s10278-025-01481-y. publisher Copyright:©The Author(s) under exclusive licence to Society for Im...

  2. [2]

    Themedicalsegmentationdecathlon

    Antonelli, M., Reinke, A., Bakas, S., Farahani, K., Kopp-Schneider, A., Landman, B.A., Litjens, G., Menze, B., Ronneberger, O., Sum- mers,R.M.,etal.,2022. Themedicalsegmentationdecathlon. Nature communications 13, 4128

  3. [3]

    Advances in medical image analysis with vision transformers: A comprehensive review

    Azad,R.,Kazerouni,A.,Heidari,M.,Aghdam,E.K.,Molaei,A.,Jia, Y., Jose, A., Roy, R., Merhof, D., 2024. Advances in medical image analysis with vision transformers: A comprehensive review. Medi- cal Image Analysis 91, 103000. URL:https://www.sciencedirect. com/science/article/pii/S1361841523002608,doi:https://doi.org/10. 1016/j.media.2023.103000

  4. [4]

    Advancing the cancer genome atlas glioma mri collections with expert segmen- tation labels and radiomic features

    Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J.S., Freymann, J.B., Farahani, K., Davatzikos, C., 2017. Advancing the cancer genome atlas glioma mri collections with expert segmen- tation labels and radiomic features. Scientific data 4, 1–13

  5. [5]

    Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

    Bakas, S., Reyes, M., Jakab, A., Bauer, S., Rempfler, M., Crimi, A., Shinohara, R.T., Berger, C., Ha, S.M., Rozycki, M., et al., 2018. Identifying the best machine learning algorithms for brain tumor segmentation,progressionassessment,andoverallsurvivalprediction in the brats challenge. arXiv preprint arXiv:1811.02629

  6. [6]

    Bilic, P., Christ, P., Li, H.B., Vorontsov, E., Ben-Cohen, A., Kaissis, G., Szeskin, A., Jacobs, C., Mamani, G.E.H., Chartrand, G., et al.,

  7. [7]

    MedicalImage Analysis 84, 102680

    Thelivertumorsegmentationbenchmark(lits). MedicalImage Analysis 84, 102680

  8. [8]

    Multi-scale feature enhancementinmulti-tasklearningformedicalimageanalysis

    Bui, P.N., Le, D.T., Bum, J., Choo, H., 2024. Multi-scale feature enhancementinmulti-tasklearningformedicalimageanalysis. URL: https://arxiv.org/abs/2412.00351,arXiv:2412.00351

  9. [9]

    Swin-unet: Unet-like pure transformer for medical image segmentation, in: Proceedings ECCVW

    Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q., Wang, M., 2022. Swin-unet: Unet-like pure transformer for medical image segmentation, in: Proceedings ECCVW

  10. [10]

    Multi-dataset cross- domain knowledge distillation for medical image segmentation

    Ceausescu, C.M., Alexe, B., 2025. Multi-dataset cross- domain knowledge distillation for medical image segmentation. Procedia Computer Science 270, 3007–3016. URL:https: //www.sciencedirect.com/science/article/pii/S1877050925030984, doi:https://doi.org/10.1016/j.procs.2025.09.425. 29th International Conference on Knowledge-Based and Intelligent Informatio...

  11. [11]

    Ceaus,escu, C.M., Alexe, B., Volpi, R., 2024. Coreset based medical image anomaly detection and segmentation, in: Proceedings of the 19thInternationalJointConferenceonComputerVision,Imagingand Computer Graphics Theory and Applications - Volume 4: VISAPP, INSTICC

  12. [12]

    TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

    Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L.,Zhou,Y.,2021. Transunet:Transformersmakestrongencoders for medical image segmentation. arXiv preprint arXiv:2102.04306

  13. [13]

    Berdiff: Conditional bernoulli diffusion model for medical image segmentation, in: MICCAI, Springer

    Chen, T., Wang, C., Shan, H., 2023. Berdiff: Conditional bernoulli diffusion model for medical image segmentation, in: MICCAI, Springer

  14. [14]

    Explainingknowledge distillation by quantifying the knowledge, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp

    Cheng,X.,Rao,Z.,Chen,Y.,Zhang,Q.,2020. Explainingknowledge distillation by quantifying the knowledge, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 12925–12935

  15. [15]

    On the efficacy of knowledge distillation, in: Proceedings of the IEEE/CVF ICCV, pp

    Cho, J.H., Hariharan, B., 2019. On the efficacy of knowledge distillation, in: Proceedings of the IEEE/CVF ICCV, pp. 4794–4802

  16. [16]

    Learning a similarity metricdiscriminatively,withapplicationtofaceverification,in:2005 IEEEComputerSocietyConferenceonComputerVisionandPattern Recognition (CVPR’05), pp

    Chopra, S., Hadsell, R., LeCun, Y., 2005. Learning a similarity metricdiscriminatively,withapplicationtofaceverification,in:2005 IEEEComputerSocietyConferenceonComputerVisionandPattern Recognition (CVPR’05), pp. 539–546 vol. 1. doi:10.1109/CVPR.2005. 202

  17. [17]

    Can ai help in screening viral and covid-19 pneumonia? IEEE Access 8, 132665–132676

    Chowdhury, M.E.H., Rahman, T., Khandakar, A., Mazhar, R., Kadir, M.A., Mahbub, Z.B., Islam, K.R., Khan, M.S., Iqbal, A., Emadi, N.A., Reaz, M.B.I., Islam, M.T., 2020. Can ai help in screening viral and covid-19 pneumonia? IEEE Access 8, 132665–132676. doi:10.1109/ACCESS.2020.3010287

  18. [18]

    Çiçek,Ö.,Abdulkadir,A.,Lienkamp,S.S.,Brox,T.,Ronneberger,O.,

  19. [19]

    3d u-net: learning dense volumetric segmentation from sparse annotation,in:Internationalconferenceonmedicalimagecomputing and computer-assisted intervention, Springer. pp. 424–432

  20. [20]

    arXiv 2003.11597 , year=

    Cohen,J.P.,Morrison,P.,Dao,L.,2020. Covid-19imagedatacollec- tion. URL:https://arxiv.org/abs/2003.11597,arXiv:2003.11597

  21. [21]

    Covid-19 infection map generation and detection from chest x-ray images

    Degerli, A., Ahishali, M., Yamac, M., Kiranyaz, S., Chowdhury, M.E.H., Hameed, K., Hamid, T., Mazhar, R., Gabbouj, M., 2021. Covid-19 infection map generation and detection from chest x-ray images. Health Information Science and Systems 9, 15. doi:10.1007/ s13755-021-00146-8

  22. [22]

    Im- agenet: A large-scale hierarchical image database, in: CVPR, IEEE

    Deng,J.,Dong,W.,Socher,R.,Li,L.J.,Li,K.,Fei-Fei,L.,2009. Im- agenet: A large-scale hierarchical image database, in: CVPR, IEEE

  23. [23]

    Modelingtheprobabilisticdistribu- tion of unlabeled data for one-shot medical image segmentation, in: AAAI

    Ding,Y.,Yu,X.,Yang,Y.,2021. Modelingtheprobabilisticdistribu- tion of unlabeled data for one-shot medical image segmentation, in: AAAI

  24. [24]

    An image is worth 16x16 words: Transformers for image recognition at scale, in: International Conference on Learning Representations

    Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N., 2021. An image is worth 16x16 words: Transformers for image recognition at scale, in: International Conference on Learning Representations. URL:https://openreview. net/forum?id=YicbFdNTTy

  25. [25]

    A guide to deep learning in healthcare

    Esteva,A.,Robicquet,A.,Ramsundar,B.,Kuleshov,V.,DePristo,M., Chou, K., Cui, C., Corrado, G., Thrun, S., Dean, J., 2019. A guide to deep learning in healthcare. Nature medicine 25, 24–29

  26. [26]

    Unsupervised domain adaptation by backpropagation,in:Bach,F.,Blei,D.(Eds.),Proceedingsofthe32nd InternationalConferenceonMachineLearning,PMLR,Lille,France

    Ganin, Y., Lempitsky, V., 2015. Unsupervised domain adaptation by backpropagation,in:Bach,F.,Blei,D.(Eds.),Proceedingsofthe32nd InternationalConferenceonMachineLearning,PMLR,Lille,France. pp. 1180–1189

  27. [27]

    Domain-adversarial trainingofneuralnetworks

    Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., March, M., Lempitsky, V., 2016. Domain-adversarial trainingofneuralnetworks. Journalofmachinelearningresearch17, 1–35

  28. [28]

    Glocker, B., Robinson, R., Castro, D.C., Dou, Q., Konukoglu, E.,

  29. [29]

    Machine Learning with Multi-Site Imaging Data: An Empirical Study on the Impact of Scanner Effects,

    Machine learning with multi-site imaging data: An em- pirical study on the impact of scanner effects. arXiv preprint arXiv:1910.04597

  30. [30]

    Deeplearningenablesautomaticdetectionandsegmentationofbrain metastases on multisequence mri

    Grøvik, E., Yi, D., Iv, M., Tong, E., Rubin, D., Zaharchuk, G., 2020. Deeplearningenablesautomaticdetectionandsegmentationofbrain metastases on multisequence mri. Journal of Magnetic Resonance Imaging 51, 175–182

  31. [31]

    Domain adaptation for medical image analysis: a survey

    Guan, H., Liu, M., 2021. Domain adaptation for medical image analysis: a survey. IEEE Transactions on Biomedical Engineering 69, 1173–1185

  32. [32]

    Unetr: Transformers for 3d medicalimagesegmentation,in:ProceedingsoftheIEEE/CVFwinter conference on applications of computer vision, pp

    Hatamizadeh, A., Tang, Y., Nath, V., Yang, D., Myronenko, A., Landman, B., Roth, H.R., Xu, D., 2022. Unetr: Transformers for 3d medicalimagesegmentation,in:ProceedingsoftheIEEE/CVFwinter conference on applications of computer vision, pp. 574–584

  33. [33]

    Heller, N

    Heller, N., Sathianathen, N., Kalapara, A., Walczak, E., Moore, K., Kaluzniak, H., Rosenberg, J., Blake, P., Rengel, Z., Oestreich, M., et al., 2019. The kits19 challenge data: 300 kidney tumor cases with clinical context, ct semantic segmentations, and surgical outcomes. arXiv preprint arXiv:1904.00445

  34. [34]

    Scientific data 9, 762

    Hernandez Petzsche, M.R., de la Rosa, E., Hanning, U., Wiest, R., Valenzuela, W., Reyes, M., Meyer, M., Liew, S.L., Kofler, F., Ezhov, I.,etal.,2022.Isles2022:Amulti-centermagneticresonanceimaging stroke lesion segmentation dataset. Scientific data 9, 762

  35. [35]

    Distilling the Knowledge in a Neural Network

    Hinton,G.,2015. Distillingtheknowledgeinaneuralnetwork. arXiv preprint arXiv:1503.02531

  36. [36]

    Hosseinzadeh Taher, M.R., Haghighi, F., Feng, R., Gotway, M.B., Liang, J., 2021. A systematic benchmarking analysis of transfer learningformedicalimageanalysis,in:DomainAdaptationandRep- resentation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health: Third MICCAI Workshop, DART 2021, and First MICCAI Workshop, FAIR 2021, Springe...

  37. [37]

    Explainable artificial intelligence for medical imaging systems using deep learning: a comprehensive review

    Houssein, E.H., Gamal, A.M., Younis, E.M.G., Mohamed, E., 2025. Explainable artificial intelligence for medical imaging systems using deep learning: a comprehensive review. Cluster Computing 28,

  38. [38]

    1007/s10586-025-05281-5

    URL:https://doi.org/10.1007/s10586-025-05281-5, doi:10. 1007/s10586-025-05281-5

  39. [39]

    Self-supervised learning for medical image classification: a systematic review and implementation guidelines

    Huang, S.C., Pareek, A., Jensen, M.E.K., Lungren, M.P., Yeung, S., Chaudhari, A.S., 2023. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. NPJ Digital Medicine 6. URL:https://api.semanticscholar.org/ CorpusID:258355151

  40. [40]

    Multiresunet:Rethinkingtheu-net architecture for multimodal biomedical image segmentation

    Ibtehaz,N.,Rahman,M.S.,2020. Multiresunet:Rethinkingtheu-net architecture for multimodal biomedical image segmentation. Neural networks 121, 74–87

  41. [41]

    Multi-level feature distillation of joint teachers trained on distinct image datasets

    Iordache, A., Alexe, B., Ionescu, R.T., 2024. Multi-level feature distillation of joint teachers trained on distinct image datasets. URL: https://arxiv.org/abs/2410.22184,arXiv:2410.22184

  42. [42]

    Unpaired cross-modalityeduceddistillation(cmedl)formedicalimagesegmen- tation

    Jiang,J.,Rimner,A.,Deasy,J.O.,Veeraraghavan,H.,2021. Unpaired cross-modalityeduceddistillation(cmedl)formedicalimagesegmen- tation. IEEE transactions on medical imaging 41, 1057–1068

  43. [43]

    Ai in diagnostic imaging: Revolu- tionising accuracy and efficiency

    Khalifa, M., Albadawy, M., 2024. Ai in diagnostic imaging: Revolu- tionising accuracy and efficiency. Computer Methods and Programs inBiomedicineUpdate5,100146. URL:https://www.sciencedirect. com/science/article/pii/S2666990024000132,doi:https://doi.org/10. 1016/j.cmpbup.2024.100146

  44. [44]

    Exploring the potential of generative artifi- cial intelligence in medical image synthesis: opportunities, chal- lenges, and future directions

    Khosravi, B., Purkayastha, S., Erickson, B.J., Trivedi, H.M., Gi- choya, J.W., 2025. Exploring the potential of generative artifi- cial intelligence in medical image synthesis: opportunities, chal- lenges, and future directions. The Lancet Digital Health 7, 100890. URL:https://www.sciencedirect.com/science/article/ pii/S258975002500072X, doi:https://doi.o...

  45. [45]

    Transfer learning for medical image classification: a literature review

    Kim, H.E., Cosa-Linan, A., Santhanam, N., Jannesari, M., Maros, M.E., Ganslandt, T., 2022. Transfer learning for medical image classification: a literature review. BMC medical imaging 22, 69

  46. [46]

    1998 , month = nov, journal =

    LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 2278–2324. doi:10.1109/5.726791

  47. [47]

    Attention unet++: A nested attention-aware u-net for liver ct image segmentation, in: ICIP 2020, pp

    Li, C., Tan, Y., Chen, W., Luo, X., Gao, Y., Jia, X., Wang, Z., 2020. Attention unet++: A nested attention-aware u-net for liver ct image segmentation, in: ICIP 2020, pp. 345–349. doi:10.1109/ICIP40778. 2020.9190761

  48. [48]

    Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian,M.,VanDerLaak,J.A.,VanGinneken,B.,Sánchez,C.I.,

  49. [49]

    Medical image analysis 42, 60–88

    A survey on deep learning in medical image analysis. Medical image analysis 42, 60–88

  50. [50]

    Adaptive multi-teacher multi-level knowledge distillation

    Liu, Y., Zhang, W., Wang, J., 2020. Adaptive multi-teacher multi-level knowledge distillation. Neurocomputing 415, 106–113. URL:http://dx.doi.org/10.1016/j.neucom.2020.07.048,doi:10.1016/ j.neucom.2020.07.048

  51. [51]

    Medvit:Arobustvisiontransformerforgeneralized medical image classification

    Manzari, O.N., Ahmadabadi, H., Kashiani, H., Shokouhi, S.B., Aya- tollahi,A.,2023. Medvit:Arobustvisiontransformerforgeneralized medical image classification. Computers in Biology and Medicine 157, 106791. doi:10.1016/j.compbiomed.2023.106791

  52. [52]

    Open access series of imaging studies (oasis): Cross-sectional mri data in nondemented and demented older adults

    Marcus,D.S.,Wang,T.H.,Parker,J.T.,Csernansky,J.G.,Morris,J.C., Buckner, R.L., 2007. Open access series of imaging studies (oasis): Cross-sectional mri data in nondemented and demented older adults. JournalofCognitiveNeuroscience19,1498–1507. doi:10.1162/jocn. 2007.19.9.1498

  53. [53]

    The multimodal brain tumor image segmentation benchmark (brats)

    Menze,B.H.,Jakab,A.,Bauer,S.,Kalpathy-Cramer,J.,Farahani,K., Kirby, J., Burren, Y., Porz, N., Slotboom, J., Wiest, R., et al., 2014. The multimodal brain tumor image segmentation benchmark (brats). IEEE transactions on medical imaging 34, 1993–2024

  54. [54]

    Deep convolutional neural networks in medical image analysis: A review

    Mienye, I.D., Swart, T.G., Obaido, G., Jordan, M., Ilono, P., 2025. Deep convolutional neural networks in medical image analysis: A review. Information 16. URL:https://www.mdpi.com/2078-2489/16/ 3/195, doi:10.3390/info16030195

  55. [55]

    Mok, T.C.W., Chung, A.C.S., 2019. Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks, in: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer International Publishing, Cham

  56. [56]

    The alzheimer’s disease neuroimaging initiative

    Mueller, S.G., Weiner, M.W., Thal, L.J., Petersen, R.C., Jack, C.R., Jagust, W., Trojanowski, J.Q., Toga, A.W., Beckett, L., 2005. The alzheimer’s disease neuroimaging initiative. Neuroimaging Clinics of North America 15, 869–877. doi:10.1016/j.nic.2005.09.008

  57. [57]

    Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Mis- awa, K., Mori, K., McDonagh, S., Hammerla, N.Y., Kainz, B., et al.,

  58. [58]

    Attention U-Net: Learning Where to Look for the Pancreas

    Attentionu-net:Learningwheretolookforthepancreas. arXiv preprint arXiv:1804.03999

  59. [59]

    Unsupervised domain adaptation of mri skull-stripping trained on adult data to newborns, in: Proceedings IEEE/CVF WACV, pp

    Omidi, A., Mohammadshahi, A., Gianchandani, N., King, R., Lei- jser, L., Souza, R., 2024. Unsupervised domain adaptation of mri skull-stripping trained on adult data to newborns, in: Proceedings IEEE/CVF WACV, pp. 7718–7727

  60. [60]

    (Eds.), Structural, Syntactic, and Statistical Pattern Recognition, Springer Nature Switzerland, Cham

    Pătraşcu,A.V.,Ceauşescu,C.M.,Alexe,B.,2025.Fromsemanticseg- mentationofnaturalimagestomedicalimagesegmentationusingvit- based architectures, in: Torsello, A., Rossi, L., Cosmo, L., Minello, G. (Eds.), Structural, Syntactic, and Statistical Pattern Recognition, Springer Nature Switzerland, Cham. pp. 112–121

  61. [61]

    URL: https://arxiv.org/abs/2206.03671,arXiv:2206.03671

    Pavlova,M.,Tuinstra,T.,Aboutalebi,H.,Zhao,A.,Gunraj,H.,Wong, A.,2022.Covidxcxr-3:Alarge-scale,open-sourcebenchmarkdataset ofchestx-rayimagesforcomputer-aidedcovid-19diagnostics. URL: https://arxiv.org/abs/2206.03671,arXiv:2206.03671

  62. [63]

    Girshick, and Jian Sun

    Ren, S., He, K., Girshick, R., Sun, J., 2016. Faster r-cnn: Towards real-time object detection with region proposal networks. URL: https://arxiv.org/abs/1506.01497,arXiv:1506.01497

  63. [64]

    Generalizedintersectionoverunion:Ametricandalossfor bounding box regression

    Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.,2019. Generalizedintersectionoverunion:Ametricandalossfor bounding box regression. URL:https://arxiv.org/abs/1902.09630, arXiv:1902.09630

  64. [65]

    Rf-detr: neural architecture search for real-time detection transformers.arXiv preprint arXiv:2511.09554, 2025

    Robinson,I.,Robicheaux,P.,Popov,M.,Ramanan,D.,Peri,N.,2025. Rf-detr: Neural architecture search for real-time detection transform- ers. URL:https://arxiv.org/abs/2511.09554,arXiv:2511.09554

  65. [66]

    Lungct: Lung ct images with expert-annotated nodules

    Roboflow, 2021. Lungct: Lung ct images with expert-annotated nodules. Roboflow Public Dataset. 2,757 CT images with expert lung nodule annotations

  66. [67]

    FitNets: Hints for Thin Deep Nets

    Romero,A.,Ballas,N.,Kahou,S.E.,Chassang,A.,Gatta,C.,Bengio, Y.,2015.Fitnets:Hintsforthindeepnets,in:InternationalConference on Learning Representations (ICLR). URL:https://arxiv.org/abs/ 1412.6550

  67. [68]

    Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation, in: MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, Springer. pp. 234–241

  68. [69]

    de la Rosa, E., Reyes, M., Liew, S.L., Hutton, A., Wiest, R., Kaes- macher, J., Hanning, U., Hakim, A., Zubal, R., Valenzuela, W., et al.,

  69. [70]

    A robust ensemble algorithm for ischemic stroke lesion segmentation: Generalizability and clinical utility beyond the isles challenge

  70. [71]

    An Overview of Multi-Task Learning in Deep Neural Networks

    Ruder, S., 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098

  71. [72]

    Data augmentation using generative adversarial networks (cyclegan) to improve generalizability in ct segmentation tasks

    Sandfort, V., Yan, K., Pickhardt, P.J., Summers, R.M., 2019. Data augmentation using generative adversarial networks (cyclegan) to improve generalizability in ct segmentation tasks. Scientific reports 9, 16884

  72. [73]

    Deep learning in medical image analysis

    Shen, D., Wu, G., Suk, H.I., 2017. Deep learning in medical image analysis. Annual review of biomedical engineering 19, 221–248

  73. [74]

    Deep convolutional neural networks for computer-aided detection: Cnn architectures, dataset characteristics and transfer learning

    Shin, H.C., Roth, H.R., Gao, M., Lu, L., Xu, Z., Nogues, I., Yao, J., Mollura, D., Summers, R.M., 2016. Deep convolutional neural networks for computer-aided detection: Cnn architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging 35. C.M. Ceausescu et al.:Preprint submitted to ElsevierPage 27 of 28 Multi-Dataset C...

  74. [75]

    Sun, B., Saenko, K., 2016. Deep coral: Correlation alignment for deep domain adaptation, in: Computer Vision–ECCV 2016 Work- shops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14, Springer. pp. 443–450

  75. [76]

    Overview of deep learning in medical imaging

    Suzuki, K., 2017. Overview of deep learning in medical imaging. Radiological physics and technology 10, 257–273

  76. [77]

    Rodney Long, Mark Schiffman, and Sameer Antani

    Tahir, A.M., Chowdhury, M.E., Khandakar, A., Rahman, T., Qi- blawey, Y., Khurshid, U., Kiranyaz, S., Ibtehaz, N., Rahman, M.S., Al-Maadeed, S., Mahmud, S., Ezeddin, M., Hameed, K., Hamid, T., 2021a. Covid-19 infection localization and severity grading from chest x-ray images. Computers in Biology and Medicine 139, 105002. URL:https://www.sciencedirect.com...

  77. [78]

    Kaggle (2025)

    Tahir, A.M., Chowdhury, M.E.H., Qiblawey, Y., Khandakar, A., Rahman, T., Kiranyaz, S., Khurshid, U., Ibtehaz, N., Mahmud, S., Ezeddin, M., 2021b. Covid-qu-ex. Kaggle. doi:10.34740/kaggle/ dsv/3122958

  78. [79]

    Efficientnet: Rethinking model scaling for convolutional neural networks,

    Tan, M., Le, Q.V., 2019. Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946 URL:https://arxiv.org/abs/1905.11946

  79. [80]

    Transresu-net: Transformer based resu-net for real-time colonoscopy polyp segmentation

    Tomar, N.K., Shergill, A., Rieders, B., Bagci, U., Jha, D., 2022. Transresu-net: Transformer based resu-net for real-time colonoscopy polyp segmentation. URL:https://arxiv.org/abs/2206.08985, arXiv:2206.08985

  80. [81]

    Lung cancer ct & pet-ct dataset

    Unknown, 2020. Lung cancer ct & pet-ct dataset. Medical imaging dataset for lung cancer diagnosis and detection. 36,631 DICOM images with CT, PET, and fused PET/CT studies and lung nodule bounding-box annotations

Showing first 80 references.