arxiv: 2604.17451 · v1 · submitted 2026-04-19 · 💻 cs.CV

SegTTA: Training-Free Test-Time Augmentation for Zero-Shot Medical Imaging Segmentation

Pith reviewed 2026-05-10 05:33 UTC · model grok-4.3

classification 💻 cs.CV
keywords test-time augmentation · zero-shot segmentation · medical image segmentation · MedSAM2 · hepatic vessel segmentation · uterine segmentation · image augmentation

The pith

SegTTA applies four augmentations and weighted voting across MedSAM2 checkpoints to raise zero-shot medical segmentation accuracy without retraining.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SegTTA to address variations in medical image quality from different equipment and operators by improving segmentation at test time. It combines gamma correction, contrast enhancement, Gaussian blur, and Gaussian noise with weighted voting over multiple MedSAM2 model checkpoints. This produces consistent gains on three datasets covering healthy uterus, uterine myoma, and multiclass hepatic structures. Ablation results show intensity augmentations help large organs while noise helps small lesions, and a voting threshold lets users adjust coverage versus precision. The approach matters because retraining foundation models for each new scanner or task is expensive and data-intensive.

Core claim

SegTTA shows that four augmentations (Gamma correction, Contrast enhancement, Gaussian blur, Gaussian noise) plus weighted voting across MedSAM2 checkpoints improve zero-shot segmentation without any model retraining. On the multiclass hepatic vessel dataset the method raises mIoU by 1.6 and aIoU by 1.9 while lowering HD95 by roughly 2.0 relative to the MedSAM2 baseline. Ablation studies confirm that large organs gain from intensity-based augmentations and small lesions gain from noise-based ones, while the voting threshold directly controls the coverage-precision trade-off for clinical needs.

What carries the argument

The SegTTA framework that applies four fixed augmentations to each test image and aggregates predictions via weighted voting across multiple MedSAM2 checkpoints.
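
The four augmentations named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the parameter values (`gamma=0.8`, `factor=1.5`, `sigma=1.0`, `std=0.05`), the inclusion of an unaugmented "identity" view, and the assumption that images are 2-D float arrays scaled to [0, 1] are all placeholders, since the abstract does not report the actual settings.

```python
import numpy as np

def gamma_correction(img, gamma=0.8):
    """Power-law intensity mapping for an image scaled to [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** gamma

def contrast_enhancement(img, factor=1.5):
    """Stretch intensities around the image mean by `factor`."""
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur with reflect padding (NumPy only)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    # Convolve along rows, then columns; "valid" trims the padding again.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def gaussian_noise(img, std=0.05, rng=None):
    """Add zero-mean Gaussian noise and clip back to [0, 1]."""
    rng = np.random.default_rng(0) if rng is None else rng
    return np.clip(img + rng.normal(0.0, std, img.shape), 0.0, 1.0)

def augment_views(img):
    """Return the original image plus one view per augmentation (hypothetical helper)."""
    return {
        "identity": img,
        "gamma": gamma_correction(img),
        "contrast": contrast_enhancement(img),
        "blur": gaussian_blur(img),
        "noise": gaussian_noise(img),
    }

if __name__ == "__main__":
    image = np.linspace(0.0, 1.0, 256).reshape(16, 16)
    for name, view in augment_views(image).items():
        print(name, view.shape, float(view.min()), float(view.max()))
```

All four transforms change intensities only and leave pixel positions untouched, so masks predicted on the augmented views can be compared pixel by pixel without an inverse geometric transform.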

If this is right

  • Consistent accuracy gains appear across healthy uterus segmentation, uterine myoma detection, and multiclass hepatic structure segmentation.
  • Intensity augmentations improve large-organ results while noise augmentations improve small-lesion results.
  • Raising or lowering the voting threshold trades segmentation coverage for precision to match different clinical priorities.
  • The same procedure reduces the 95th-percentile Hausdorff distance (HD95) by about 2.0 on hepatic vessel data.
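
The threshold trade-off in the list above can be illustrated with a small weighted-voting sketch. It assumes binary masks of equal shape, one per (checkpoint, augmentation) prediction, and a hypothetical `weighted_vote` helper; the weights and threshold values shown are arbitrary, because the abstract does not state the ones the authors use.

```python
import numpy as np

def weighted_vote(masks, weights, threshold=0.5):
    """Fuse binary masks: keep pixels whose normalized weighted vote >= threshold."""
    masks = np.asarray(masks, dtype=float)         # (n_predictions, H, W)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()              # votes now sum to 1 per pixel
    score = np.tensordot(weights, masks, axes=1)   # (H, W), values in [0, 1]
    return score >= threshold

if __name__ == "__main__":
    # Three toy predictions; the first is given twice the weight of the others.
    preds = [
        [[1, 1, 0], [1, 0, 0]],
        [[1, 0, 0], [1, 1, 0]],
        [[1, 1, 1], [0, 0, 0]],
    ]
    for t in (0.25, 0.5, 1.0):
        fused = weighted_vote(preds, weights=[2, 1, 1], threshold=t)
        print(f"threshold={t}: {int(fused.sum())} foreground pixels")
```

A low threshold keeps any pixel with modest support (higher coverage), while a threshold of 1.0 keeps only pixels on which every prediction agrees (higher precision).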

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may transfer to other promptable segmentation models if those models also provide multiple checkpoints with complementary errors.
  • Fixed augmentation weights may underperform on datasets with different noise or contrast profiles, suggesting future work on lightweight per-task calibration.
  • Because no retraining occurs, the framework could serve as a quick post-processing step when deploying foundation models in new hospitals.

Load-bearing premise

The specific four augmentations combined with weighted voting across MedSAM2 checkpoints will produce consistent gains on unseen medical images without introducing new errors or requiring task-specific tuning of the weights and threshold.

What would settle it

A held-out medical imaging dataset on which the four-augmentation plus weighted-voting procedure yields no gain or a loss in mIoU, aIoU, or HD95 compared with the plain MedSAM2 baseline.
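
Running such a comparison requires the metrics themselves. The sketch below shows one common way to compute IoU, mean IoU over classes, and HD95 with NumPy. It is an assumption-laden illustration: the helper names are hypothetical, HD95 is computed on 4-connected boundary pixels in pixel units (other variants exist, and the paper's units are not stated in the abstract), and aIoU is omitted because the abstract does not define it.

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    union = np.logical_or(pred, target).sum()
    return float(np.logical_and(pred, target).sum() / union) if union else 1.0

def mean_iou(pred_labels, target_labels, classes):
    """Average per-class IoU over the given class ids."""
    return float(np.mean([iou(pred_labels == c, target_labels == c)
                          for c in classes]))

def boundary_points(mask):
    """Coordinates of foreground pixels with at least one 4-neighbor in the background."""
    mask = np.asarray(mask, bool)
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return np.argwhere(mask & ~interior)

def hd95(a, b):
    """Symmetric 95th-percentile Hausdorff distance, in pixels."""
    pa, pb = boundary_points(a), boundary_points(b)
    if len(pa) == 0 or len(pb) == 0:
        return float("nan")  # undefined when either mask is empty
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))

if __name__ == "__main__":
    baseline = np.zeros((10, 10), bool); baseline[2:6, 2:6] = True
    shifted = np.zeros((10, 10), bool); shifted[2:6, 4:8] = True
    print("IoU:", iou(baseline, shifted), "HD95:", hd95(baseline, shifted))
```

The brute-force distance matrix is fine for small 2-D masks; large 3-D volumes would need a distance-transform-based implementation instead.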

Figures

Figures reproduced from arXiv: 2604.17451 by Canxuan Gang, Chunlei Li, Hao Zhang, Wenzhi Hu, Xiaoyan Li, Yihong Yao, Zeyu Zhang.

Figure 1: Framework of SegTTA. Baseline outputs from multiple MedSAM2 checkpoints and augmented predictions…

Figure 2: UterUS dataset [39] with five categories (1.27%-44%). UMD dataset [40] with two categories (39.28%-60.71%). HepaticVessel dataset [41] with five categories (2.49%-38.26%).

Figure 3: Qualitative segmentation results on the UterUS dataset. SegTTA demonstrates improved boundary delineation…

Figure 4: Visual comparison on the UMD dataset for uterine myoma detection. The proposed method effectively…

Figure 5: Segmentation results on the Task08 HepaticVessel dataset. SegTTA enhances the structural continuity of…

Original abstract

Increasingly advanced data augmentation techniques have greatly aided clinical medical research, increasing data diversity and improving model generalization capabilities. Although most current basic models exhibit strong generalization abilities, image quality varies due to differences in equipment and operators. To address these challenges, we present SegTTA, a framework that improves medical image segmentation without model retraining by combining four augmentations (Gamma correction, Contrast enhancement, Gaussian blur, Gaussian noise) with weighted voting across multiple MedSAM2 checkpoints. Experiments demonstrate consistent improvements across three diverse datasets: healthy uterus segmentation, uterine myoma detection, and multi class hepatic structure segmentation. Ablation studies reveal that large organs benefit from intensity augmentations while small lesions require noise augmentations. The voting threshold controls the coverage precision trade off, enabling task specific optimization for different clinical requirements. Ultimately, on a multiclass hepatic vessel dataset, compared to MedSAM2 baselines, our method achieves an increase of 1.6 in mIoU and 1.9 in aIoU, along with a reduction of approximately 2.0 in HD95. Code will be available at https://github.com/AIGeeksGroup/SegTTA.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces SegTTA, a training-free test-time augmentation framework for zero-shot medical imaging segmentation. It applies four augmentations—Gamma correction, Contrast enhancement, Gaussian blur, and Gaussian noise—followed by weighted voting across multiple MedSAM2 checkpoints. The method is evaluated on three datasets: healthy uterus segmentation, uterine myoma detection, and multiclass hepatic structure segmentation, reporting consistent improvements such as +1.6 mIoU, +1.9 aIoU, and approximately -2.0 HD95 on the hepatic vessel dataset compared to MedSAM2 baselines. Ablation studies show that intensity augmentations benefit large organs while noise augmentations help small lesions, and the voting threshold allows for task-specific coverage-precision trade-offs.

Significance. If the reported gains can be achieved with a fixed, non-tuned configuration of augmentations and voting parameters, SegTTA would offer a practical, training-free way to enhance the performance of foundation models like MedSAM2 in clinical settings without additional data or retraining. The planned code release and the experiments across diverse datasets strengthen the potential impact. However, the emphasis on task-specific optimization of the threshold suggests that the improvements may depend on per-dataset adjustments, which could reduce the method's generalizability in truly zero-shot scenarios.

major comments (3)
  1. [Abstract] The central claim of consistent improvements in a training-free regime is qualified by the statement that 'the voting threshold controls the coverage precision trade off, enabling task specific optimization.' This raises a concern that the headline metrics (e.g., +1.6 mIoU on hepatic vessels) may result from dataset-specific tuning of weights and threshold rather than a single fixed setup, which would undermine the zero-shot test-time augmentation interpretation.
  2. [Abstract / Experiments] No details are provided on the exact values of augmentation parameters (e.g., gamma values, noise levels), the weighting scheme for voting, or how the threshold is selected. Without this information or evidence that a single set of parameters works across datasets, the reproducibility and generality of the gains cannot be assessed.
  3. [Ablation studies] The differentiation between intensity and noise augmentations based on organ/lesion size is interesting, but without quantitative results on how these choices affect the final metrics or controls for the number of augmentations, it is unclear if the four-augmentation combination is optimal or if simpler subsets would suffice.
minor comments (1)
  1. [Abstract] The abstract mentions 'approximately 2.0 in HD95' but does not specify the baseline value or units, which would aid interpretation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major point below with clarifications and commitments to revisions that strengthen the manuscript without altering its core claims.

  1. Referee: [Abstract] The central claim of consistent improvements in a training-free regime is qualified by the statement that 'the voting threshold controls the coverage precision trade off, enabling task specific optimization.' This raises a concern that the headline metrics (e.g., +1.6 mIoU on hepatic vessels) may result from dataset-specific tuning of weights and threshold rather than a single fixed setup, which would undermine the zero-shot test-time augmentation interpretation.

    Authors: We appreciate the referee's concern about potential ambiguity. The augmentation parameters and voting weights are fixed and identical for all experiments across the three datasets; only the optional threshold provides flexibility for coverage-precision trade-offs in different clinical contexts. The reported gains, including +1.6 mIoU and -2.0 HD95 on hepatic vessels, were obtained with a single fixed threshold value. We will revise the abstract to explicitly state that the primary results use a fixed, non-tuned configuration while noting the threshold as an optional control, thereby reinforcing the training-free and zero-shot nature of SegTTA. revision: yes

  2. Referee: [Abstract / Experiments] No details are provided on the exact values of augmentation parameters (e.g., gamma values, noise levels), the weighting scheme for voting, or how the threshold is selected. Without this information or evidence that a single set of parameters works across datasets, the reproducibility and generality of the gains cannot be assessed.

    Authors: This is a fair observation on reproducibility. In the revised manuscript we will add a new subsection (or table) specifying the exact augmentation parameters (gamma value, contrast factor, blur sigma, noise variance), the uniform weighting scheme used for voting, and the default threshold selection (with sensitivity analysis). We will also explicitly confirm and demonstrate that this identical fixed parameter set was applied to the healthy uterus, uterine myoma, and multiclass hepatic vessel datasets, producing the reported consistent improvements. revision: yes

  3. Referee: [Ablation studies] The differentiation between intensity and noise augmentations based on organ/lesion size is interesting, but without quantitative results on how these choices affect the final metrics or controls for the number of augmentations, it is unclear if the four-augmentation combination is optimal or if simpler subsets would suffice.

    Authors: We agree that the ablation section would benefit from additional quantitative detail. We will expand the ablation studies in the revision to report per-augmentation and combinatorial metric changes (mIoU, aIoU, HD95) on each dataset, while controlling for the number of augmentations by comparing the full four-augmentation ensemble against intensity-only, noise-only, and other subsets. This will provide concrete evidence supporting the observed differential benefits for large organs versus small lesions and the overall utility of the chosen combination. revision: yes
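
The subset comparison promised here can be organized by enumerating every non-empty combination of the four augmentations and grouping them by size, so that ensembles with the same number of views are compared with each other. The sketch below is illustrative only: the short names in `AUGMENTATIONS` and the `ablation_configs` helper are hypothetical, and no evaluation results are implied.

```python
from itertools import combinations

# Short labels for the four augmentations named in the paper (labels are assumptions).
AUGMENTATIONS = ("gamma", "contrast", "blur", "noise")

def ablation_configs(names=AUGMENTATIONS):
    """Return every non-empty subset of augmentations, keyed by subset size."""
    return {k: list(combinations(names, k)) for k in range(1, len(names) + 1)}

if __name__ == "__main__":
    for size, subsets in ablation_configs().items():
        print(f"{size} augmentation(s): {len(subsets)} configurations")
        for subset in subsets:
            print("   ", " + ".join(subset))
```

Four augmentations give 15 non-empty subsets in total, which is small enough to evaluate exhaustively on each dataset.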

Circularity Check

0 steps flagged

No significant circularity in empirical evaluation

The paper presents an empirical test-time augmentation framework (four fixed augmentations plus weighted voting on MedSAM2 checkpoints) and reports metric gains on three held-out medical imaging datasets against independent baselines. No equations, derivations, or self-referential steps appear in the provided text that reduce any claimed result to a fitted input or self-citation by construction. The note on task-specific threshold optimization is presented as a practical feature rather than evidence that headline numbers were obtained via circular fitting on test data. The evaluation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on empirical selection of four augmentations and a voting procedure whose parameters are not derived from first principles.

free parameters (2)
  • Augmentation parameters
    Specific values for gamma, contrast strength, blur sigma, and noise variance are required but not reported in the abstract.
  • Voting weights and threshold
    Weights assigned to each augmented prediction and the decision threshold are central to the method and must be chosen or tuned.
axioms (1)
  • domain assumption: Multiple MedSAM2 checkpoints produce sufficiently diverse and complementary predictions that weighted voting improves accuracy.
    Invoked implicitly when claiming gains from ensemble voting without retraining.
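
One simple way to probe this assumption is to measure how often checkpoint predictions disagree before voting. The sketch below computes the mean pairwise pixel disagreement across a set of binary masks; the `pairwise_disagreement` helper is hypothetical and is not a diagnostic reported in the paper. A value near zero would suggest the checkpoints are nearly redundant, in which case voting has little to combine.

```python
from itertools import combinations

import numpy as np

def pairwise_disagreement(masks):
    """Mean fraction of pixels on which two predictions differ, over all pairs.

    Expects at least two binary masks of identical shape.
    """
    masks = [np.asarray(m, dtype=bool) for m in masks]
    rates = [np.mean(a != b) for a, b in combinations(masks, 2)]
    return float(np.mean(rates))

if __name__ == "__main__":
    ckpt_a = np.array([[1, 1, 0], [0, 0, 0]])
    ckpt_b = np.array([[1, 0, 0], [0, 1, 0]])
    ckpt_c = np.array([[1, 1, 0], [0, 1, 0]])
    print("disagreement:", pairwise_disagreement([ckpt_a, ckpt_b, ckpt_c]))
```

Disagreement alone does not show that the errors are complementary; it only indicates whether there is any diversity for the vote to exploit.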

pith-pipeline@v0.9.0 · 5526 in / 1272 out tokens · 65658 ms · 2026-05-10T05:33:17.418161+00:00 · methodology


Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages

  [1] Evgin Goceri. Medical image data augmentation: techniques, comparisons and interpretations. Artificial intelligence review, 56(11):12561–12605, 2023.
  [2] Raphael Gontijo Lopes, Dong Yin, Ben Poole, Justin Gilmer, and Ekin D Cubuk. Improving robustness without sacrificing accuracy with patch gaussian augmentation. arXiv preprint arXiv:1906.02611, 2019.
  [3] Xuyin Qi, Zeyu Zhang, Canxuan Gang, Hao Zhang, Lei Zhang, Zhiwei Zhang, and Yang Zhao. Mediaug: Exploring visual augmentation in medical imaging. In Annual Conference on Medical Image Understanding and Analysis, pages 218–232. Springer, 2025.
  [4] Jun Ma, Zongxin Yang, Sumin Kim, Bihui Chen, Mohammed Baharoon, Adibvafa Fallahpour, Reza Asakereh, Hongwei Lyu, and Bo Wang. Medsam2: Segment anything in 3d medical images and videos. arXiv preprint arXiv:2504.03600, 2025.
  [5] Xiao Ma, Yuhui Tao, Yuhan Zhang, Zexuan Ji, Yizhe Zhang, and Qiang Chen. Test-time generative augmentation for medical image segmentation. arXiv preprint arXiv:2406.17608, 2024.
  [6] Wasfieh Nazzal, Karl Thurnhofer-Hemsi, and Ezequiel López-Rubio. Improving medical image segmentation using test-time augmentation with medsam. Mathematics, 12(24):4003, 2024.
  [7] Fathi Kallel, Mouna Sahnoun, Ahmed Ben Hamida, and Khalil Chtourou. Ct scan contrast enhancement using singular value decomposition and adaptive gamma correction. Signal, Image and Video Processing, 12(5):905–913, 2018.
  [8] Erdal Tasci, Caner Uluturk, and Aybars Ugur. A voting-based ensemble deep learning method focusing on image augmentation and preprocessing variations for tuberculosis detection. Neural Computing and Applications, 33(22):15541–15555, 2021.
  [9] Biao Wu, Yutong Xie, Zeyu Zhang, Jinchao Ge, Kaspar Yaxley, Suzan Bahadir, Qi Wu, Yifan Liu, and Minh-Son To. Bhsd: A 3d multi-class brain hemorrhage segmentation dataset. In International workshop on machine learning in medical imaging, pages 147–156. Springer, 2023.
  [10] Shengbo Tan, Zeyu Zhang, Ying Cai, Daji Ergu, Lin Wu, Binbin Hu, Pengzhang Yu, and Yang Zhao. Segstitch: Multidimensional transformer for robust and efficient medical imaging segmentation. arXiv preprint arXiv:2408.00496, 2024.
  [11] Zeyu Zhang, Bowen Zhang, Abhiram Hiwase, Christen Barras, Feng Chen, Biao Wu, Adam James Wells, Daniel Y Ellis, Benjamin Reddi, Andrew William Burgan, et al. Thin-thick adapter: Segmenting thin scans using thick annotations. 2023.
  [12] Jinchao Ge, Zeyu Zhang, Vu Minh Hieu Phan, Bowen Zhang, Akide Liu, Yang Zhao, and Shuwen Zhao. Esa: Annotation-efficient active learning for semantic segmentation. In International Conference on Intelligent Computing, pages 141–152. Springer, 2025.
  [13] Hongjie Zhu, Zeyu Zhang, Guansong Pang, Xu Wang, Shimin Wen, Yu Bai, Daji Ergu, Ying Cai, and Yang Zhao. Doei: Dual optimization of embedding information for attention-enhanced class activation maps. arXiv preprint arXiv:2502.15885, 2025.
  [14] Ruicheng Zhang, Haowei Guo, Zeyu Zhang, Puxin Yan, and Shen Zhao. Gamed-snake: Gradient-aware adaptive momentum evolution deep snake model for multi-organ segmentation. arXiv preprint arXiv:2501.12844, 2025.
  [15] Shengbo Tan, Rundong Xue, Shipeng Luo, Zeyu Zhang, Xinran Wang, Lei Zhang, Daji Ergu, Zhang Yi, Yang Zhao, and Ying Cai. Segkan: High-resolution medical image segmentation with long-distance dependencies. arXiv preprint arXiv:2412.19990, 2024.
  [16] Ruicheng Zhang, Yu Sun, Zeyu Zhang, Jinai Li, Xiaofan Liu, Au Hoi Fan, Haowei Guo, and Puxin Yan. Marl-mambacontour: Unleashing multi-agent deep reinforcement learning for active contour optimization in medical image segmentation. arXiv preprint arXiv:2506.18679, 2025.
  [17] Ruicheng Zhang, Haowei Guo, Kanghui Tian, Jun Zhou, Mingliang Yan, Zeyu Zhang, and Shen Zhao. Unified medical image segmentation with state space modeling snake. arXiv preprint arXiv:2507.12760, 2025.
  [18] Hongjie Zhu, Xiwei Liu, Rundong Xue, Zeyu Zhang, Yong Xu, Daji Ergu, Ying Cai, and Yang Zhao. Sss: Semi-supervised sam-2 with efficient prompting for medical imaging segmentation. arXiv preprint arXiv:2506.08949, 2025.
  [19] Yanwu Yang, Guinan Su, Jiesi Hu, Francesco Sammarco, Jonas Geiping, and Thomas Wolfers. Medsamix: A training-free model merging approach for medical image segmentation. arXiv preprint arXiv:2508.11032, 2025.
  [20] Guohui Cai, Ruicheng Zhang, Hongyang He, Zeyu Zhang, Daji Ergu, Yuanzhouhan Cao, Jinman Zhao, Binbin Hu, Zhinbin Liao, Yang Zhao, et al. Msdet: Receptive field enhanced multiscale detection for tiny pulmonary nodule. arXiv preprint arXiv:2409.14028, 2024.
  [21] Zeyu Zhang, Nengmin Yi, Shengbo Tan, Ying Cai, Yi Yang, Lei Xu, Qingtai Li, Zhang Yi, Daji Ergu, and Yang Zhao. Meddet: Generative adversarial distillation for efficient cervical disc herniation detection. In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 4024–4027. IEEE, 2024.
  [22] Rui Zhao, Zeyu Zhang, Yi Xu, Yi Yao, Yan Huang, Wenxin Zhang, Zirui Song, Xiuying Chen, and Yang Zhao. Peddet: Adaptive spectral optimization for multimodal pedestrian detection. arXiv preprint arXiv:2502.14063, 2025.
  [23] Shipeng Luo, Yuxin Zhang, Zeyu Zhang, Binhua Guo, Junbo Jacob Lian, Hui Jiang, Shun Zou, and Wei Wang. Epdd-yolo: An efficient benchmark for pavement damage detection based on mamba-yolo. Measurement, page 117638, 2025.
  [24] Guohui Cai, Ying Cai, Zeyu Zhang, Yuanzhouhan Cao, Lin Wu, Daji Ergu, Zhibin Liao, and Yang Zhao. Medical artificial intelligence for early detection of lung cancer: A survey. Engineering Applications of Artificial Intelligence, 159:111577, 2025.
  [25] Biao Wu, Yutong Xie, Zeyu Zhang, Minh Hieu Phan, Qi Chen, Ling Chen, and Qi Wu. Mmclip: Cross-modal attention masked modelling for medical language-image pre-training. arXiv preprint arXiv:2407.19546, 2024.
  [26] Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, et al. Jointvit: Modeling oxygen saturation levels with joint supervision on long-tailed octa. In Annual Conference on Medical Image Understanding and Analysis, pages 158–172. Springer, 2024.
  [27] Yiping Ji, Hemanth Saratchandran, Cameron Gordon, Zeyu Zhang, and Simon Lucey. Efficient learning with sine-activated low-rank matrices. arXiv preprint arXiv:2403.19243, 2024.
  [28] Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, et al. Medconv: Convolutions beat transformers on long-tailed bone density prediction. arXiv preprint arXiv:2502.00631, 2025.
  [29] Yang Luo, Shiru Wang, Jun Liu, Jiaxuan Xiao, Rundong Xue, Zeyu Zhang, Hao Zhang, Yu Lu, Yang Zhao, and Yutong Xie. Pathohr: Breast cancer survival prediction on high-resolution pathological images. arXiv preprint arXiv:2503.17970, 2025.
  [30] Zeyu Zhang, Khandaker Asif Ahmed, Md Rakibul Hasan, Tom Gedeon, and Md Zakir Hossain. A deep learning approach to diabetes diagnosis. In Asian Conference on Intelligent Information and Database Systems, pages 87–99. Springer, 2024.
  [31] Yang Zhao, Zhibin Liao, Yunxiang Liu, Koen Oude Nijhuis, Britt Barvelink, Jasper Prijs, Joost Colaris, Mathieu Wijffels, Max Reijman, Zeyu Zhang, et al. A landmark-based approach for instability prediction in distal radius fractures. In 2024 IEEE International Symposium on Biomedical Imaging (ISBI), pages 1–5. IEEE, 2024.
  [32] Xuyin Qi, Zeyu Zhang, Aaron Berliano Handoko, Huazhan Zheng, Mingxi Chen, Ta Duc Huy, Vu Minh Hieu Phan, Lei Zhang, Linqi Cheng, Shiyu Jiang, et al. Projectedex: Enhancing generation in explainable ai for prostate cancer. In 2025 IEEE 38th International Symposium on Computer-Based Medical Systems (CBMS), pages 623–629. IEEE, 2025.
  [33] Abhiram D Hiwase, Christopher D Ovenden, Lola M Kaukas, Mark Finnis, Zeyu Zhang, Stephanie O’Connor, Ngee Foo, Benjamin Reddi, Adam J Wells, and Daniel Y Ellis. Can rotational thromboelastometry rapidly identify theragnostic targets in isolated traumatic brain injury? Emergency Medicine Australasia, 37(1):e14480, 2025.
  [34] Haiyue Zu, Jun Ge, Heting Xiao, Jile Xie, Zhangzhe Zhou, Yifan Meng, Jiayi Ni, Junjie Niu, Linlin Zhang, Li Ni, et al. Rethinking few-shot medical image segmentation by sam2: A training-free framework with augmentative prompting and dynamic matching. arXiv preprint arXiv:2503.04826, 2025.
  [35] Junjun Wu, Yunbo Rao, Shaoning Zeng, and Bob Zhang. Pre-trained sam as data augmentation for image segmentation. CAAI Transactions on Intelligence Technology, 10(1):268–282, 2025.
  [36] N Benjamin Erichson, Soon Hoe Lim, Francisco Utrera, Winnie Xu, Ziang Cao, and Michael W Mahoney. Noisymix: Boosting robustness by combining data augmentations, stability training, and noise injections. arXiv preprint arXiv:2202.01263, 1, 2022.
  [37] Louisa Lam and SY Suen. Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 27(5):553–568, 1997.
  [38] Norbert Toth and Bela Pataki. Classification confidence weighted majority voting using decision tree classifiers. International Journal of Intelligent Computing and Cybernetics, 1(2):169–192, 2008.
  [39] Eva Boneš, Marco Gergolet, Ciril Bohak, Žiga Lesar, and Matija Marolt. UterUS: Uterus ultrasound database. https://github.com/UL-FRI-LGM/UterUS, 2024. Dataset with 3D ultrasound uterine volumes and nnUNet segmentation models; License: CC BY-NC-SA 4.0.
  [40] Haoming Pan, Menghan Chen, Wenjie Bai, et al. Large-scale uterine myoma mri dataset covering all figo types with pixel-level annotations, 2024. UMD dataset: 300 cases of uterine myoma T2WI sagittal images with FIGO classification.
  [41] M. Jorge Cardoso et al. MSD Task08: Hepatic Vessel Segmentation Challenge Dataset. http://medicaldecathlon.com/, 2019. Part of the Medical Segmentation Decathlon (MSD). Available via Google Drive: Task08_HepaticVessel.tar.
  [42] Zeyu Zhang, Xuyin Qi, Bowen Zhang, Biao Wu, Hien Le, Bora Jeong, Zhibin Liao, Yunxiang Liu, Johan Verjans, Minh-Son To, et al. Segreg: Segmenting oars by registering mr images and ct annotations. In 2024 IEEE International Symposium on Biomedical Imaging (ISBI), pages 1–5. IEEE, 2024.