Frequency Adapter with SAM for Generalized Medical Image Segmentation
Pith reviewed 2026-05-12 04:03 UTC · model grok-4.3
The pith
A frequency adapter added to SAM improves generalization in medical image segmentation across domains.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FSAM is a framework that incorporates Low-Rank Adaptation (LoRA) and a frequency adapter into SAM to extract domain-invariant high-frequency features, thereby mitigating frequency-related domain shifts for improved single-source domain generalization in medical image segmentation.
What carries the argument
The frequency adapter, which feeds frequency-domain representations into SAM so that the model captures domain-invariant features.
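The abstract does not say which transform the frequency adapter uses (the referee report raises the same gap), so the following is only a minimal sketch of one common way to isolate high-frequency content: a Fourier-domain high-pass filter. The names `dft2` and `high_pass` and the `cutoff` parameter are hypothetical, and the naive transform is meant for tiny demo inputs rather than real scans.

```python
import cmath

def dft2(img, inverse=False):
    """Naive 2D discrete Fourier transform, O(N^4); only for tiny demo images."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * cmath.pi * (u * y / h + v * x / w)
                    total += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = total / (h * w) if inverse else total
    return out

def high_pass(img, cutoff):
    """Zero every frequency within `cutoff` (wrapped index distance) of DC.

    Assumption: a hard radial mask. The paper's adapter may work differently.
    """
    h, w = len(img), len(img[0])
    spec = dft2(img)
    for u in range(h):
        for v in range(w):
            du, dv = min(u, h - u), min(v, w - v)
            if (du * du + dv * dv) ** 0.5 <= cutoff:
                spec[u][v] = 0j
    # The mask is symmetric, so the inverse transform is real up to rounding.
    return [[val.real for val in row] for row in dft2(spec, inverse=True)]

# A vertical edge, and the same edge with a constant brightness offset.
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
brighter = [[p + 10.0 for p in row] for row in edge]
hp_edge = high_pass(edge, cutoff=0)
hp_brighter = high_pass(brighter, cutoff=0)
```

A constant brightness offset changes only the zero-frequency term, so both filtered images agree. That illustrates the kind of low-frequency shift a high-pass branch ignores; it is not evidence that high frequencies are domain-invariant in real medical images.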
If this is right
- FSAM outperforms traditional domain generalization and SAM-based methods on fundus and prostate segmentation tasks.
- It enables efficient fine-tuning of SAM while addressing frequency discrepancies.
- The approach focuses on high-frequency features overlooked by spatial-domain methods.
- It supports single-source domain generalization without needing multiple source domains.
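The efficiency point rests on LoRA, which keeps a pretrained weight matrix W frozen and learns a low-rank update, giving an effective weight W + (alpha / r) * B A. The sketch below shows that update in plain Python under stated assumptions: the layer sizes, `alpha`, and the helper names `matmul` and `lora_weight` are illustrative, and where FSAM attaches LoRA inside SAM is not stated in the abstract.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, alpha):
    """Effective weight W + (alpha / r) * B @ A; only A and B would be trained."""
    r = len(a)  # rank = number of rows of A
    delta = matmul(b, a)
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Hypothetical toy sizes; real SAM projections are far larger.
d_out, d_in, rank = 6, 8, 2
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]      # frozen
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]    # trainable, r x d_in
b = [[0.0] * rank for _ in range(d_out)]                                # trainable, d_out x r, zero init

full_params = d_out * d_in            # parameters touched by full fine-tuning
lora_params = rank * (d_in + d_out)   # parameters touched by LoRA
w_eff = lora_weight(w, a, b, alpha=4.0)
```

Because B starts at zero, the adapted layer initially reproduces the frozen one exactly, and training only moves the r * (d_in + d_out) low-rank parameters. For a hypothetical 768 x 768 projection with rank 4, that is 6,144 trainable values instead of 589,824.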
Where Pith is reading between the lines
- This technique could be adapted to other foundation models for segmentation tasks in different fields.
- Testing on more diverse medical modalities might reveal additional benefits or limitations of frequency adaptation.
- The emphasis on the frequency domain suggests that hybrid spatial-frequency models could improve robustness in computer vision more generally.
Load-bearing premise
Frequency-domain representations extracted by the adapter are reliably domain-invariant and adding them mitigates frequency-related domain shifts affecting SAM.
What would settle it
A held-out medical dataset with pronounced frequency variations on which FSAM performs no better, or worse, than standard SAM fine-tuning.
Original abstract
Medical image segmentation is a critical task in computer-aided diagnosis and treatment planning. However, deep learning models often struggle to generalize across datasets due to domain shifts arising from variations in imaging protocols, scanner types, and patient populations. Traditional domain generalization (DG) methods utilize causal feature learning, adversarial consistency, and style augmentation to improve segmentation robustness. While effective, these approaches rely on explicit feature alignment, adversarial objectives, or handcrafted augmentations, which may not fully exploit the capabilities of foundation models. Recently, the Segment Anything Model (SAM) has demonstrated strong generalization capabilities in segmentation tasks. SAM-based DG methods attempt to improve medical image segmentation. However, these approaches primarily operate in the spatial domain and overlook frequency-based discrepancies that significantly affect model robustness. In this work, we propose Frequency-based Domain Generalization with SAM (FSAM), a novel framework that integrates Low-Rank Adaptation (LoRA) for efficient fine-tuning and a frequency adapter to incorporate frequency-domain representations for single-source domain generalization. FSAM enhances SAM's segmentation robustness by extracting domain-invariant high-frequency features, mitigating frequency-related domain shifts. Experimental results on fundus and prostate datasets demonstrate that FSAM outperforms existing traditional DG and SAM-based DG approaches in domain generalization. Codes and pre-trained models will be made available on GitHub.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes FSAM, a framework that augments the Segment Anything Model (SAM) with Low-Rank Adaptation (LoRA) for efficient fine-tuning and a frequency adapter module to extract and incorporate domain-invariant high-frequency features. The central claim is that this mitigates frequency-related domain shifts in single-source domain generalization for medical image segmentation, yielding superior performance over traditional DG methods and prior SAM-based approaches on fundus and prostate datasets.
Significance. If the experimental claims hold with proper validation, the work could meaningfully extend foundation-model adaptation in medical imaging by targeting frequency-domain discrepancies that spatial-only methods overlook. The use of LoRA for parameter efficiency is a practical strength, and the promise of releasing code and pre-trained models supports reproducibility. However, the absence of quantitative metrics, ablation results, and implementation details in the current presentation substantially weakens the ability to judge whether the frequency adapter delivers genuine domain-invariant gains beyond what LoRA alone provides.
major comments (3)
- [Abstract] The claim that 'FSAM outperforms existing traditional DG and SAM-based DG approaches' is stated without any numerical results (e.g., Dice, IoU, or Hausdorff distances), confidence intervals, or statistical tests on the fundus and prostate datasets. This omission prevents evaluation of the magnitude and reliability of the reported gains.
- [Method] The frequency adapter is introduced as extracting 'domain-invariant high-frequency features', yet no concrete description is given of the transform used (FFT, wavelet, etc.), the precise fusion mechanism with SAM's image encoder, or any regularization that would enforce invariance. Without these, it is impossible to verify whether the module reduces frequency shifts or merely adds capacity.
- [Experiments] No ablation isolating the frequency adapter from LoRA fine-tuning alone is reported, nor are cross-dataset quantitative tables or visualizations of frequency spectra before/after adaptation provided. These omissions make the central generalization claim impossible to substantiate.
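For reference, the two overlap metrics named in the comments above can be computed as follows. This is a minimal sketch for flattened binary masks; the function name `dice_and_iou` and the convention of scoring two empty masks as a perfect match are assumptions, not something the paper specifies.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length sequences of 0/1 mask values."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2.0 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou

# Toy masks: 2 pixels overlap, each mask has 3 foreground pixels.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
```

The two scores are monotonically related (Dice = 2 * IoU / (1 + IoU)), so reporting both mostly helps comparison with prior work rather than adding independent evidence.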
minor comments (1)
- [Abstract] The acronym FSAM is defined only after its first use; spelling out 'Frequency-based Domain Generalization with SAM' on first mention would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which identifies key areas where additional details will strengthen the manuscript. We agree that the current presentation lacks sufficient quantitative support, methodological specifics, and experimental validation, and we will revise accordingly to address each point.
Point-by-point responses
- Referee: [Abstract] The claim that 'FSAM outperforms existing traditional DG and SAM-based DG approaches' is stated without any numerical results (e.g., Dice, IoU, or Hausdorff distances), confidence intervals, or statistical tests on the fundus and prostate datasets. This omission prevents evaluation of the magnitude and reliability of the reported gains.
  Authors: We agree that the abstract would benefit from quantitative support. In the revised version, we will include key performance metrics such as Dice and IoU scores on the fundus and prostate datasets, along with direct comparisons to the baselines mentioned. Space permitting, we will also report confidence intervals to better convey the reliability of the gains.
  Revision: yes
- Referee: [Method] The frequency adapter is introduced as extracting 'domain-invariant high-frequency features', yet no concrete description is given of the transform used (FFT, wavelet, etc.), the precise fusion mechanism with SAM's image encoder, or any regularization that would enforce invariance. Without these, it is impossible to verify whether the module reduces frequency shifts or merely adds capacity.
  Authors: We will expand the Method section with a precise description of the frequency adapter. This will specify the frequency transform, the fusion process with SAM's image encoder, and any regularization or invariance-promoting mechanisms. These additions will clarify how the module targets frequency-domain shifts rather than simply increasing model capacity.
  Revision: yes
- Referee: [Experiments] No ablation isolating the frequency adapter from LoRA fine-tuning alone is reported, nor are cross-dataset quantitative tables or visualizations of frequency spectra before/after adaptation provided. These omissions make the central generalization claim impossible to substantiate.
  Authors: We will augment the Experiments section with the requested elements. This includes ablation studies separating the frequency adapter's contribution from LoRA, comprehensive cross-dataset tables reporting Dice, IoU, and other metrics, and visualizations of frequency spectra to illustrate the domain-invariant effects. These revisions will provide direct evidence for the generalization improvements.
  Revision: yes
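The confidence intervals promised in the responses could be obtained in several ways. One common option is a percentile bootstrap over per-case scores, sketched below. The function name `bootstrap_ci`, its defaults, and the score list are all hypothetical; the numbers are placeholders for illustration and are not results from the paper.

```python
import random
import statistics

def bootstrap_ci(scores, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of per-case scores."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample cases with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lo = means[int(tail * n_resamples)]
    hi = means[int((1.0 - tail) * n_resamples) - 1]
    return statistics.fmean(scores), lo, hi

# Placeholder per-case Dice values, NOT taken from the paper.
placeholder_dice = [0.81, 0.78, 0.90, 0.85, 0.74, 0.88, 0.79, 0.83]
mean, lo, hi = bootstrap_ci(placeholder_dice)
```

Resampling whole cases keeps each image's score intact, which suits segmentation benchmarks where a few hard cases dominate the variance. With only a handful of target-domain cases the interval will be wide, and that width is itself useful information when comparing an ablated variant against the full model.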
Circularity Check
No significant circularity
The paper presents an empirical architecture proposal (FSAM) that combines LoRA fine-tuning with a frequency adapter module for SAM-based single-source domain generalization. All load-bearing claims rest on experimental results comparing segmentation performance on fundus and prostate datasets against baselines; there is no mathematical derivation, no fitted parameters renamed as predictions, no self-citation chain invoked for uniqueness, and no ansatz smuggled via prior work. The approach is self-contained as a practical extension of existing foundation-model techniques, with performance evaluated externally rather than by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: High-frequency components in medical images are domain-invariant across scanners and protocols.
invented entities (1)
- Frequency adapter: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: "We design a Frequency Adapter that aggregates high-frequency components, improving robustness against domain shifts."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: "FSAM enhances SAM's segmentation robustness by extracting domain-invariant high-frequency features"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.