On Efficient Variants of Segment Anything Model: A Survey
Pith reviewed 2026-05-23 19:40 UTC · model grok-4.3
The pith
This survey reviews acceleration strategies for the Segment Anything Model and benchmarks their efficiency-accuracy trade-offs on multiple hardware platforms.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The survey claims that categorizing SAM acceleration methods by approach, combined with a standardized cross-hardware evaluation, reveals clear performance differences among variants and identifies viable paths for deploying accurate segmentation on resource-limited devices.
What carries the argument
Categorization of acceleration strategies by approach, paired with unified benchmark evaluation across hardware.
If this is right
- Developers gain a direct comparison to select variants suited to edge or mobile hardware.
- Research can prioritize the future directions the survey identifies for further gains.
- Benchmark results establish baseline numbers for new efficiency proposals to beat.
- Hardware-specific performance data guides deployment choices in constrained environments.
Where Pith is reading between the lines
- The survey's structure could serve as a template for efficiency reviews of other large vision models beyond SAM.
- If acceleration categories prove stable, they may generalize to future foundation models with similar architectures.
- Unified evaluations reduce the need for each new paper to re-run all prior variants from scratch.
Load-bearing premise
The review assumes the authors captured all major efficient SAM variants without selection bias and that the chosen benchmarks and hardware are representative of real deployment.
What would settle it
A SAM variant published before the survey's cutoff that exceeds all reviewed methods in both accuracy and efficiency on the same benchmarks and hardware would indicate that the survey missed key approaches or used non-representative tests.
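The test above amounts to a dominance check: a candidate settles the question only if it is strictly better than every reviewed method on both axes. Below is a minimal Python sketch, assuming each method is summarized by one accuracy score (higher is better) and one latency figure (lower is better) on a fixed benchmark and hardware target. The `Result` type, the function names, and the numbers in the usage lines are illustrative placeholders, not values from the survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One method's score on a fixed benchmark and hardware target (hypothetical schema)."""
    name: str
    accuracy: float    # e.g. mIoU, higher is better
    latency_ms: float  # per-image latency, lower is better


def exceeds(candidate: Result, other: Result) -> bool:
    """True if candidate is strictly better than other on both axes."""
    return candidate.accuracy > other.accuracy and candidate.latency_ms < other.latency_ms


def exceeds_all(candidate: Result, reviewed: list[Result]) -> bool:
    """True if candidate strictly beats every reviewed method on both axes."""
    return all(exceeds(candidate, r) for r in reviewed)


# Placeholder numbers for illustration only; not taken from the survey.
reviewed = [Result("method_a", 70.0, 40.0), Result("method_b", 75.0, 120.0)]
print(exceeds_all(Result("new_fast_accurate", 76.0, 35.0), reviewed))  # True
print(exceeds_all(Result("new_fast_only", 72.0, 35.0), reviewed))      # False
```

A candidate that wins on only one axis merely adds a point to the trade-off curve; it does not by itself suggest that the survey's coverage or benchmarks were inadequate.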
The Segment Anything Model (SAM) is a foundational model for image segmentation tasks, known for its strong generalization across diverse applications. However, its impressive performance comes with significant computational and resource demands, making it challenging to deploy in resource-limited environments such as edge devices. To address this, a variety of SAM variants have been proposed to enhance efficiency while keeping accuracy. This survey provides the first comprehensive review of these efficient SAM variants. We begin by exploring the motivations driving this research. We then present core techniques used in SAM and model acceleration. This is followed by a detailed exploration of SAM acceleration strategies, categorized by approach, and a discussion of several future research directions. Finally, we offer a unified and extensive evaluation of these methods across various hardware, assessing their efficiency and accuracy on representative benchmarks, and providing a clear comparison of their overall performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys efficient variants of the Segment Anything Model (SAM), claiming to be the first comprehensive review. It covers motivations for efficiency research, core SAM and acceleration techniques, a categorization of acceleration strategies by approach, future research directions, and a unified evaluation of methods across hardware platforms assessing efficiency and accuracy on representative benchmarks.
Significance. If the coverage is systematic and the evaluation is truly standardized rather than aggregated from inconsistent reports, the survey would provide a useful reference for comparing efficiency-accuracy trade-offs in SAM variants and guiding deployment on edge devices.
major comments (1)
- [Abstract, §1] Abstract and §1 (Introduction): The central claims of providing the 'first comprehensive review' and a 'unified and extensive evaluation' across hardware are load-bearing but rest on undocumented processes. No explicit literature search criteria, databases, date ranges, or inclusion/exclusion rules are stated, nor is the protocol for re-implementation or metric standardization described. This leaves both the completeness of variant coverage and the fairness of cross-method comparisons unverifiable.
Simulated Author's Rebuttal
We thank the referee for highlighting the need for greater transparency in our methodology. We agree that explicitly documenting the literature search process and evaluation protocol will make the claims of comprehensive coverage and unified benchmarking more verifiable. We will revise the manuscript to include these details.
- Referee: [Abstract, §1] Abstract and §1 (Introduction): The central claims of providing the 'first comprehensive review' and a 'unified and extensive evaluation' across hardware are load-bearing but rest on undocumented processes. No explicit literature search criteria, databases, date ranges, or inclusion/exclusion rules are stated, nor is the protocol for re-implementation or metric standardization described. This leaves both the completeness of variant coverage and the fairness of cross-method comparisons unverifiable.
Authors: We acknowledge that the current manuscript does not describe the literature search protocol or re-implementation details. To address this, we will add a dedicated subsection 'Survey Methodology' in §1 that specifies: (1) databases searched (Google Scholar, arXiv, IEEE Xplore, ACM Digital Library); (2) search keywords and Boolean strings (e.g., 'Segment Anything Model' AND (efficient OR acceleration OR lightweight OR edge)); (3) date range (April 2023 to October 2024, aligned with SAM release); (4) inclusion criteria (papers proposing SAM variants with efficiency improvements, including preprints with code); (5) exclusion criteria (non-English works, surveys without new variants, works not focused on SAM).
For the unified evaluation, we will expand §4 and add an appendix describing: re-implementation protocol (use of official repositories where available, otherwise faithful re-coding per paper descriptions with author confirmation where possible), hardware configurations (e.g., NVIDIA A100, RTX 3090, Jetson Orin, CPU-only), input standardization (1024×1024 resolution, batch size 1), and metric reporting (consistent FPS, parameters, mIoU on COCO val, ADE20K). These additions will allow readers to assess completeness and fairness. We maintain that the survey is the first to provide both a categorized taxonomy and cross-hardware benchmarks, but agree the documentation strengthens this position.
revision: yes
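The metric reporting promised in the rebuttal (FPS at batch size 1, mIoU) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' evaluation code: masks are represented as nested lists of 0/1, mIoU is taken as the plain average of per-mask IoU, two empty masks are scored as a perfect match, and `run_once` is a hypothetical callable that runs one model forward pass on a single image. The survey's exact definitions may differ.

```python
import time
from typing import Callable, Sequence

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two same-sized binary masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average IoU over paired masks (one prediction per ground-truth mask)."""
    scores = [mask_iou(p, t) for p, t in zip(preds, targets)]
    if not scores:
        raise ValueError("need at least one mask pair")
    return sum(scores) / len(scores)


def measure_fps(run_once: Callable[[], object], warmup: int = 3, iters: int = 20) -> float:
    """Frames per second at batch size 1: time `iters` single-image calls after a warm-up."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    return iters / elapsed if elapsed > 0 else float("inf")


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5 (2 overlapping pixels out of 4 in the union)
```

On a GPU, the work launched by `run_once` is typically asynchronous, so a real harness would need to synchronize the device before reading the clock; the sketch omits that step because it is framework-specific.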
Circularity Check
No circularity: survey paper contains no derivations or predictions
This is a literature survey paper whose central claims concern coverage of prior work, categorization of acceleration strategies, and presentation of a unified evaluation. No equations, fitted parameters, predictions, or first-principles derivations appear in the provided abstract or description. The reader's assessment correctly identifies the absence of any derivational chain that could reduce to its own inputs. The skeptic concerns about selection bias and standardization of benchmarks are questions of methodological transparency and potential incompleteness, not circularity under the enumerated patterns (self-definitional, fitted-input-called-prediction, self-citation load-bearing, etc.). Because the paper makes no load-bearing mathematical claims that collapse by construction, the circularity score is 0 and the steps array is empty.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 1 Pith paper
- LMMs Meet Object-Centric Vision: Understanding, Segmentation, Editing and Generation
This review organizes literature on large multimodal models and object-centric vision into four themes—understanding, referring segmentation, editing, and generation—while summarizing paradigms, strategies, and challe...