
arxiv: 2512.01510 · v3 · submitted 2025-12-01 · 💻 cs.CV · cs.LG

Semantic-aware Random Convolution and Source Matching for Domain Generalization in Medical Image Segmentation

Pith reviewed 2026-05-17 03:16 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords domain generalization · medical image segmentation · random convolution · cross-modality generalization · single-source DG · semantic-aware augmentation · intensity mapping

The pith

Semantic-aware random convolution with intensity mapping enables single-source domain generalization for medical image segmentation across modalities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a method to train a segmentation network on a single source domain, such as CT scans, and apply it directly to a different target domain like MR scans without any adaptation or access to target data. It achieves this by diversifying the training data through semantic-aware random convolutions that augment different annotated regions differently during training. At test time, it maps the intensity of target images to resemble the source domain. This approach outperforms previous domain generalization techniques in most cross-modality and cross-center settings for abdominal, heart, and prostate segmentation, sometimes matching the performance of models trained directly on the target domain.

Core claim

The central claim has two parts. During training, semantic-aware random convolution diversifies the source domain by applying a different augmentation to each region according to its semantic label. At test time, intensity mapping makes target images resemble the source domain. Together, these allow the model to generalize effectively to unseen domains and modalities.

What carries the argument

Semantic-aware random convolution, which augments different regions of the image differently based on annotation labels to increase domain diversity.
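
To make the mechanism concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes one random kernel per label, Gaussian kernel weights, an odd kernel size, and hard compositing by the label map. The names `random_kernel`, `convolve2d`, and `semantic_random_conv` are hypothetical, and the paper's actual kernel distribution, kernel sizes, and blending may differ.

```python
import random

def random_kernel(size, rng):
    """Draw a size x size kernel with zero-mean Gaussian weights (an assumed choice)."""
    return [[rng.gauss(0.0, 1.0 / size) for _ in range(size)] for _ in range(size)]

def convolve2d(image, kernel):
    """Same-size 2D correlation with zero padding on nested lists (odd kernel size)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy][dx] * image[yy][xx]
            out[y][x] = acc
    return out

def semantic_random_conv(image, labels, kernel_size=3, seed=None):
    """Filter the image once per label with a fresh random kernel,
    then keep each filtered result only inside its own labelled region."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for label in sorted({v for row in labels for v in row}):
        filtered = convolve2d(image, random_kernel(kernel_size, rng))
        for y in range(h):
            for x in range(w):
                if labels[y][x] == label:
                    out[y][x] = filtered[y][x]
    return out

# Toy 4x4 "image" with two labelled regions (0 = background, 1 = organ).
image = [[0.1, 0.2, 0.8, 0.9] for _ in range(4)]
labels = [[0, 0, 1, 1] for _ in range(4)]
augmented = semantic_random_conv(image, labels, seed=0)
```

Calling `semantic_random_conv` with a new seed for every training sample would give each annotated region a different random appearance while leaving the label map untouched, which is the diversification effect described above.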

If this is right

  • The method achieves new state-of-the-art performance in domain generalization for medical image segmentation.
  • It matches the performance of an in-domain baseline in several experimental settings.
  • It works across various cross-modality tasks including abdominal, whole-heart, and prostate segmentation.
  • It also handles phase differences in cine MR data from different scanners.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could reduce the need for large multi-domain datasets in clinical AI development.
  • It could extend to medical imaging tasks beyond segmentation, such as classification or detection.
  • Simple intensity transformations may be sufficient to bridge certain modality gaps when combined with semantic guidance.

Load-bearing premise

That semantic annotations from the source domain can guide effective region-specific augmentations and that intensity mapping alone can bridge cross-modality gaps without losing critical semantic information.
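
The summary does not say how the test-time intensity mapping is computed. As an illustration only, the sketch below uses plain histogram (quantile) matching as a stand-in: each target intensity is replaced by the source intensity at the same empirical quantile. The function name `match_intensities` and the toy pixel values are hypothetical, and the paper's actual mapping may work differently.

```python
from bisect import bisect_right

def match_intensities(target, source):
    """Map each target intensity to the source intensity at the same
    empirical quantile (histogram matching, used here as an assumed stand-in)."""
    src = sorted(source)
    tgt = sorted(target)
    n_src, n_tgt = len(src), len(tgt)
    mapped = []
    for v in target:
        # Rank of v within the target image (1..n_tgt), ties share the top rank.
        rank = bisect_right(tgt, v)
        # Nearest-rank source index for the same quantile, in integer arithmetic.
        idx = (rank * n_src + n_tgt - 1) // n_tgt - 1
        mapped.append(src[idx])
    return mapped

# Flattened toy pixel values: the source lives on a 0..90 scale, the target on 100..900.
source_pixels = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
target_pixels = [900, 100, 500, 300, 700]
mapped = match_intensities(target_pixels, source_pixels)
```

A mapping of this kind is monotone: it preserves the ordering of intensities within the target image and only changes their scale and distribution. That is also why the premise is load-bearing, since a monotone remapping cannot fix cases where two modalities order tissue contrasts differently.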

What would settle it

An experiment showing that the method fails to outperform baselines or match in-domain performance when applied to a new modality pair where intensity mapping distorts important features.

Figures

Figures reproduced from arXiv: 2512.01510 by Darko Stern, Franz Thaler, Gernot Plank, Martin Urschler, Mateusz Kozinski, Matthias AF Gsell.

Figure 1: We propose SRCSM, a single-source cross-modality domain generalization approach that (1) during training, expands the source data
Figure 2: Contrast between anatomical structures strongly depends
Figure 3: Our domain generalization approach: Images from source domain
Figure 4: Exemplary abdominal (row 1) and cardiac (row 2) images are shown before (cols: 1, 3, 5, 7) and after (cols: 2, 4, 6, 8) applying the
Figure 5: Exemplary abdominal (row 1) and cardiac (row 2) images are shown before (cols: 1, 3, 5, 7) and after (cols: 2, 4, 6, 8) applying the
Figure 6: The outcome of statistical analysis of multiple dataset generalization derived from 14 experimental setups reported in Tables
Figure 7: Qualitative results of our SRCSM method compared to the ground truth (GT) label, SLAug and CSDG. Abdominal images are shown for

Abstract

We tackle the challenging problem of single-source domain generalization (DG) for medical image segmentation, where we train a network on one domain (e.g., CT) and directly apply it to a different domain (e.g., MR) without adapting the model and without requiring images or annotations from the new domain during training. Our method diversifies the source domain through semantic-aware random convolution, where different regions of a source image are augmented differently at training-time, based on their annotation labels. At test-time, we complement the randomization of the training domain via mapping the intensity of target domain images, making them similar to source domain data. We perform a comprehensive evaluation on a variety of cross-modality and cross-center generalization settings for abdominal, whole-heart and prostate segmentation, where we outperform previous DG techniques in a vast majority of experiments. Additionally, we also investigate our method when training on whole-heart CT or MR data and testing on the diastolic and systolic phase of cine MR data captured with different scanner hardware. Overall, our evaluation shows that our method achieves new state-of-the-art performance in DG for medical image segmentation, even matching the performance of the in-domain baseline in several settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to address single-source domain generalization for medical image segmentation by proposing semantic-aware random convolution for diversifying source domain training data based on semantic labels, combined with test-time intensity mapping to align target images to source statistics. Through evaluations on abdominal, whole-heart, and prostate segmentation tasks involving cross-modality and cross-center shifts, as well as phase differences in cine MR, the method is reported to achieve new state-of-the-art performance, outperforming prior DG techniques and matching in-domain baselines in several settings.

Significance. If the empirical results hold, this offers a practical, low-overhead approach to single-source DG in medical imaging where acquiring target data is often infeasible. The combination of label-guided training augmentation and simple test-time mapping could facilitate deployment across modalities and scanners. The comprehensive multi-task evaluation is a strength, but significance hinges on clarifying the relative contributions of each component.

major comments (2)
  1. [Section 3.2 and Results] Section 3.2 and experimental results: The central claim that semantic-aware random convolution drives the DG gains (including matching in-domain baselines) requires an ablation isolating its effect from the test-time intensity mapping alone. Without this, the results risk being explained primarily by the global intensity transform, which, according to the skeptic note, may not preserve higher-order structural cues across CT-MR gaps.
  2. [Results] Results tables: The claim of new SOTA and in-domain matching should report per-experiment standard deviations, number of runs, and statistical tests (e.g., paired t-tests) to substantiate superiority over baselines; current evidence strength is limited without these details.
minor comments (2)
  1. [Section 3.1] The notation for the random convolution parameters (e.g., kernel sizes, probability distributions per semantic class) could be formalized more clearly with equations in Section 3.1 for reproducibility.
  2. [Related Work and Experiments] Ensure all baseline methods are cited with exact implementation details or references to avoid ambiguity in the comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their valuable comments and suggestions. We have carefully addressed each point raised in the report. Our responses are detailed below, and we have made revisions to the manuscript accordingly to strengthen the empirical support for our claims.

Point-by-point responses
  1. Referee: [Section 3.2 and Results] Section 3.2 and experimental results: The central claim that semantic-aware random convolution drives the DG gains (including matching in-domain baselines) requires an ablation isolating its effect from the test-time intensity mapping alone. Without this, the results risk being explained primarily by the global intensity transform, which, according to the skeptic note, may not preserve higher-order structural cues across CT-MR gaps.

    Authors: We agree that isolating the contribution of the semantic-aware random convolution is crucial for validating our central claim. In the original manuscript, Section 3.2 describes the semantic-aware random convolution in detail, and the results demonstrate improvements over methods that rely solely on intensity-based adaptations. However, to provide a more direct comparison, we have added a new ablation study in the revised manuscript (Section 4.4) that evaluates the performance using only the test-time intensity mapping without the semantic-aware random convolution during training. The results indicate that while the intensity mapping provides some benefit, the semantic-aware random convolution is essential for achieving the reported gains, especially in challenging cross-modality scenarios where higher-order structural information needs to be preserved through label-guided augmentation. We have also clarified in the discussion that the test-time mapping is a simple alignment step and does not replace the training-time diversification. revision: yes

  2. Referee: [Results] Results tables: The claim of new SOTA and in-domain matching should report per-experiment standard deviations, number of runs, and statistical tests (e.g., paired t-tests) to substantiate superiority over baselines; current evidence strength is limited without these details.

    Authors: We acknowledge that reporting standard deviations, the number of runs, and statistical tests would enhance the robustness of our claims. In the revised manuscript, we have updated all results tables to include mean performance with standard deviations computed over 5 independent training runs with different random seeds. Furthermore, we have conducted paired t-tests between our method and the competing baselines for each experiment and reported the corresponding p-values. These additions substantiate the statistical significance of the improvements and the comparability to in-domain performance in several settings. revision: yes
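
The reporting the referee asks for (mean, standard deviation over runs, and a paired t-test) can be sketched with the Python standard library. The scores below are invented purely for illustration and are not results from the paper; the function name `paired_t_statistic` is hypothetical. The sketch returns the t statistic and degrees of freedom; a p-value would additionally need the t distribution, for example from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample std / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical scores from 5 training runs, paired by random seed (made-up numbers).
ours = [0.86, 0.88, 0.87, 0.85, 0.89]
baseline = [0.83, 0.84, 0.85, 0.82, 0.86]

print(f"ours:     {mean(ours):.3f} +/- {stdev(ours):.3f}")
print(f"baseline: {mean(baseline):.3f} +/- {stdev(baseline):.3f}")
t, dof = paired_t_statistic(ours, baseline)
print(f"paired t = {t:.2f} with {dof} degrees of freedom")
```

Pairing by seed matters here: the test works on per-run differences, so it is only valid when run i of one method and run i of the other share the same conditions.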

Circularity Check

0 steps flagged

No circularity: algorithmic method with independent empirical validation

full rationale

The paper proposes an explicit algorithmic pipeline—semantic-aware random convolution during training on source labels plus global intensity mapping at test time—for single-source domain generalization. All performance claims rest on direct experimental comparisons against baselines across abdominal, cardiac, and prostate datasets rather than any mathematical derivation, fitted parameter renamed as prediction, or self-citation chain. The method definitions and evaluation protocol are self-contained; no step reduces by construction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the contribution is an empirical method.



