3D Segment Anything Model with Visual Mamba for Diagnosing Placenta Accreta Spectrum

Dunjin Chen; Fang He; Lili Du; Lulu Peng; Pingping Zhang; Tianyu Yan; Ting Song; Yuliang Zhang

arxiv: 2606.00489 · v2 · pith:JWH44GTLnew · submitted 2026-05-30 · 💻 cs.CV

Yuliang Zhang, Fang He, Lulu Peng, Tianyu Yan, Pingping Zhang, Ting Song, Lili Du, Dunjin Chen

Pith reviewed 2026-06-28 19:06 UTC · model grok-4.3

classification 💻 cs.CV

keywords Placenta Accreta Spectrum · MRI segmentation · 3D Segment Anything Model · Visual Mamba · Medical image analysis · Lesion isolation · PAS diagnosis · 3D medical imaging

The pith

A 3D-adapted Segment Anything Model with Mamba modules segments uterine lesions in MRI and improves placenta accreta spectrum diagnosis when the masks are multiplied back into the original images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates the first public MRI dataset for placenta accreta spectrum that includes both pixel-level lesion masks and diagnostic labels. It then builds 3DSAMba by taking the Segment Anything Model into three dimensions, adding a lightweight adapter to bring in medical-domain knowledge, inserting Multi-Level Aggregation Mamba blocks to combine features from different encoder depths, and using a Fusion State Space Model to merge encoder and decoder scales. The resulting masks are multiplied element-wise with the input scan so that only the lesion region remains for the final classifier. A reader should care because the disease is life-threatening yet often diagnosed late in hospitals that lack specialist radiologists. If the approach works, it supplies an automated second reader that could be deployed where expertise is scarce.
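The masking step described above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `apply_lesion_mask`, the 0.5 binarization threshold, and the toy volume are assumptions made for illustration, and the paper may apply the predicted mask without thresholding.

```python
import numpy as np

def apply_lesion_mask(volume, mask_prob, threshold=0.5):
    """Zero out every voxel outside the predicted lesion.

    volume:    3D MRI array of shape (D, H, W).
    mask_prob: predicted lesion probabilities, same shape as volume.
    threshold: hypothetical cut-off used to binarize the mask.
    """
    if volume.shape != mask_prob.shape:
        raise ValueError("volume and mask must have the same shape")
    binary_mask = (mask_prob >= threshold).astype(volume.dtype)
    return volume * binary_mask  # element-wise multiplication

# Toy example: a 2x2x2 volume with a single "lesion" voxel.
volume = np.arange(1, 9, dtype=np.float32).reshape(2, 2, 2)
mask_prob = np.zeros((2, 2, 2), dtype=np.float32)
mask_prob[1, 0, 1] = 0.9
masked = apply_lesion_mask(volume, mask_prob)
```

Only the voxel whose predicted probability exceeds the threshold keeps its intensity; everything else is set to zero before the volume reaches the classifier.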

Core claim

We establish the first MRI-based PAS dataset with fine-grained segmentation and classification annotations. We propose 3DSAMba, a novel feature learning framework for effective lesion segmentation. We first design a 3D Segment Anything Model (SAM) and incorporate medical domain information into the model through an efficient adapter mechanism. In addition, we introduce a Multi-Level Aggregation Mamba (MLAM) to aggregate feature maps across different levels and a Fusion State Space Model (FSSM) to fuse multi-scale features from both the encoder and decoder. Finally, we apply segmentation masks to the original MRI images through element-wise multiplication, effectively isolating lesion areas for more accurate PAS diagnosis.
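As a rough illustration of what an adapter mechanism of this kind typically looks like, the sketch below implements a generic bottleneck adapter: down-projection, nonlinearity, up-projection, and a residual connection. The text here does not specify the authors' adapter design, so the class name, the GELU activation, the zero-initialized up-projection, and the dimensions 768 and 64 are all assumptions.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

class BottleneckAdapter:
    """Generic bottleneck adapter (hypothetical, not the paper's module)."""

    def __init__(self, dim, proj_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(0.0, 0.02, size=(dim, proj_dim))
        # Zero init: the adapter starts out as an identity mapping.
        self.w_up = np.zeros((proj_dim, dim))

    def __call__(self, tokens):
        # tokens: (T, dim) token features from a frozen Transformer block
        return tokens + gelu(tokens @ self.w_down) @ self.w_up

    def num_params(self):
        return self.w_down.size + self.w_up.size

tokens = np.random.default_rng(1).normal(size=(16, 768))
adapter = BottleneckAdapter(dim=768, proj_dim=64)
out = adapter(tokens)
```

With these hypothetical sizes the adapter holds 2 × 768 × 64 = 98,304 weights per block, which is the sense in which such modules are lightweight compared with fine-tuning the full encoder.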

What carries the argument

The 3DSAMba pipeline: a 3D SAM equipped with a medical adapter, followed by MLAM for cross-level aggregation and FSSM for encoder-decoder fusion, whose output masks are multiplied element-wise with the input MRI to isolate lesions before classification.

If this is right

The framework significantly improves PAS diagnostic performance on the new MRI dataset.
Automatic lesion segmentation followed by mask multiplication isolates the relevant areas and raises classification accuracy.
The released dataset supplies both segmentation and classification labels for future method development.
The same adapter-plus-Mamba design can be applied to other 3D medical volumes where the Segment Anything Model needs domain adaptation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The element-wise multiplication step treats segmentation quality as a direct proxy for classification gain, which could be tested by ablating the mask quality while holding the classifier fixed.
Because the method isolates lesions before classification, it may reduce the impact of surrounding anatomy that varies across patients or scanners.
The approach could be extended to longitudinal MRI studies to track lesion changes over pregnancy without retraining the entire model.
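The first extension, ablating mask quality while holding the classifier fixed, could be set up along the following lines. This is a sketch under stated assumptions: `degrade_mask` corrupts a reference mask by flipping a random fraction of voxels, which is only one of many possible degradation schemes, and the synthetic cuboid "lesion" stands in for real annotations.

```python
import numpy as np

def dice(a, b, eps=1e-8):
    """Dice overlap between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

def degrade_mask(mask, flip_fraction, rng):
    """Flip each voxel independently with probability flip_fraction."""
    flip = rng.random(mask.shape) < flip_fraction
    return np.logical_xor(mask.astype(bool), flip)

rng = np.random.default_rng(0)
reference = np.zeros((8, 32, 32), dtype=bool)
reference[2:6, 8:24, 8:24] = True  # synthetic lesion, for illustration only

for frac in (0.0, 0.05, 0.2):
    noisy = degrade_mask(reference, frac, rng)
    # In a real ablation, the masked volume built from `noisy` would be fed
    # to the fixed classifier and its accuracy recorded next to this Dice.
    print(f"flip={frac:.2f} dice={dice(reference, noisy):.3f}")
```

Plotting classifier accuracy against the resulting Dice values would show how strongly the diagnostic result depends on segmentation quality.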

Load-bearing premise

The assumption that the masks produced by the adapted 3D SAM and Mamba modules, when multiplied element-wise with the raw MRI, produce a measurably more accurate downstream PAS classification than the unmasked images or alternative segmentations.

What would settle it

A side-by-side comparison of PAS classification accuracy on a held-out test set when the classifier receives the raw MRI versus the element-wise masked MRI produced by 3DSAMba, with reported sensitivity, specificity, and statistical significance.
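A comparison of this kind could be scored as in the sketch below, which computes sensitivity and specificity for each input type and an exact two-sided McNemar test on the paired predictions. The binary labeling (1 = PAS, 0 = no PAS), the function names, and the toy label vectors are assumptions for illustration; the vectors are invented and are not results from the paper.

```python
from math import comb

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = PAS."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value for two classifiers on the same cases."""
    # Discordant pairs: exactly one of the two classifiers is correct.
    only_a = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    only_b = sum(a != t and b == t for t, a, b in zip(y_true, pred_a, pred_b))
    n = only_a + only_b
    if n == 0:
        return 1.0
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2.0 * tail)

# Invented toy predictions on eight held-out cases.
y_true      = [1, 1, 1, 1, 0, 0, 0, 0]
raw_pred    = [1, 0, 0, 1, 0, 1, 0, 0]  # classifier fed the raw MRI
masked_pred = [1, 1, 1, 1, 0, 0, 0, 1]  # classifier fed the masked MRI

raw_sens, raw_spec = sensitivity_specificity(y_true, raw_pred)
masked_sens, masked_spec = sensitivity_specificity(y_true, masked_pred)
p_value = mcnemar_exact(y_true, raw_pred, masked_pred)
```

McNemar's test is used here because both classifiers are evaluated on the same patients, so only the cases where they disagree carry information about which input is better.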

Figures

Figures reproduced from arXiv: 2606.00489 by Dunjin Chen, Fang He, Lili Du, Lulu Peng, Pingping Zhang, Tianyu Yan, Ting Song, Yuliang Zhang.

**Figure 2.** Visualization of the original MRI data and PAS lesion areas in both …

**Figure 3.** Representative MRI slices and lesion overlays for the three PAS subtypes stratified by quantitative severity. Green, orange, and red overlays denote …

**Figure 4.** Distribution of lesion volumes across three PAS severity groups.

**Figure 5.** Illustration of our proposed framework for PAS diagnosis. It includes a lesion segmentation model and a simple classification network to localize the …

**Figure 6.** Illustration of our proposed 3DSAMba. It includes three main components: 3D Segment Anything Model (SAM) Encoder, Multi-Level Aggregation …

**Figure 7.** Illustration of our proposed adapters in each Transformer block.

**Figure 8.** Illustration of our proposed MLAM.

$$
\begin{aligned}
I_1 &= \mathrm{SSM}([F_3, F_6, F_9, F_{12}]), \\
I_2 &= \mathrm{SSM}([F_6, F_9, F_{12}, F_3]), \\
I_3 &= \mathrm{SSM}([F_9, F_{12}, F_3, F_6]), \\
I_4 &= \mathrm{SSM}([F_{12}, F_3, F_6, F_9]),
\end{aligned}
\tag{7}
$$

where $[\cdot,\cdot]$ is the concatenation operation and SSM is the state space model [13], [47]. $F_k \in \mathbb{R}^{T \times D}$ is the output of the $k$-th Transformer layer, with $k \in \{3, 6, 9, 12\}$. $I_j \in \mathbb{R}^{4T \times D}$ represents the scanned results from different orders, with $j \in \{1, 2, 3, 4\}$. …

**Figure 10.** Visual comparison of predicted masks with different methods in the axial plane. The white areas indicate correctly predicted regions. The red areas …

**Figure 11.** Visual comparison of predicted masks with different methods in the coronal plane. The white areas indicate correctly predicted regions. The red …

**Figure 12.** Visual comparison of predicted masks with different methods in the sagittal plane. The white areas indicate correctly predicted regions. The red …

**Figure 13.** Representative failure cases on the PAS test set. The white areas indicate correctly predicted regions. The red areas represent redundant predictions, …

**Figure 15.** Visualization comparison between fully fine-tuning and adapters.

**Figure 16.** Performance with different projection dimensions in adapters.

**Figure 17.** Performance comparison with different compression rates in MLAM.

**Figure 18.** Performance comparison with different layers in MLAM.

**Figure 19.** Visual comparison with different layers used in MLAM.

**Figure 20.** Visual comparison with (a) Baseline model; (b) +Adapters; (c) +DSCM; (d) +FSSM; (e) Final model.
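The four scan orders in Equation (7), quoted in the Figure 8 caption, are cyclic shifts of the level list [F3, F6, F9, F12], each concatenated along the token axis before being passed to an SSM. The sketch below reproduces that ordering with NumPy. The function `toy_scan` is a stand-in: it is a plain causal linear recurrence, not Mamba's selective scan, and the sizes `T = 5`, `D = 8` and the decay value are arbitrary assumptions.

```python
import numpy as np

def toy_scan(seq, decay=0.9):
    """Placeholder for an SSM: h_t = decay * h_{t-1} + x_t along the token axis."""
    out = np.empty_like(seq)
    state = np.zeros(seq.shape[1], dtype=seq.dtype)
    for t in range(seq.shape[0]):
        state = decay * state + seq[t]
        out[t] = state
    return out

def cyclic_orders(items):
    """Return every cyclic shift of a list, starting with the original order."""
    return [items[s:] + items[:s] for s in range(len(items))]

T, D = 5, 8  # hypothetical token count and feature width
rng = np.random.default_rng(0)
levels = [3, 6, 9, 12]
feats = [rng.normal(size=(T, D)) for _ in levels]  # stand-ins for F_3 ... F_12

# I_1 ... I_4: scan each cyclically shifted concatenation of shape (4T, D).
scanned = [toy_scan(np.concatenate(order, axis=0)) for order in cyclic_orders(feats)]
```

Because the recurrence is causal, each ordering lets a different encoder level appear last in the sequence and therefore accumulate context from the other three, which is one plausible motivation for scanning all four shifts.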

Original abstract

Placenta Accreta Spectrum (PAS) is a rare but highly dangerous obstetric disease. Early and accurate PAS diagnosis is critical for maternal health. Traditional PAS diagnosis relies on experienced doctors by analyzing the cesarean history and Magnetic Resonance Imaging (MRI) data. However, district-level hospitals often lack the expertise and resources for accurate PAS diagnosis. To address these challenges, we establish the first MRI-based PAS dataset, which includes both fine-grained segmentation and classification annotations. Meanwhile, diagnosing PAS can be significantly enhanced by segmenting lesion areas from MRI images of the uterus. To achieve automatic PAS diagnosis, we propose 3DSAMba, a novel feature learning framework for effective lesion segmentation. More specifically, we first design a 3D Segment Anything Model (SAM) and incorporate medical domain information into the model through an efficient adapter mechanism. In addition, we introduce a Multi-Level Aggregation Mamba (MLAM) to aggregate feature maps across different levels and a Fusion State Space Model (FSSM) to fuse multi-scale features from both the encoder and decoder. Finally, we apply segmentation masks to the original MRI images through element-wise multiplication, effectively isolating lesion areas for more accurate PAS diagnosis. Extensive experiments validate that our framework significantly improves the PAS diagnostic performance. To facilitate further research in PAS diagnosis, we have released the dataset and source code at https://github.com/Drchip61/PASD.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Releases the first annotated MRI dataset for placenta accreta spectrum plus code, but claims big diagnostic gains from the 3DSAMba pipeline with no numbers, baselines, or validation details anywhere in the abstract.

The paper's real contribution is the new MRI dataset for PAS with both segmentation and classification labels, plus public code. That fills a gap for a rare but serious condition where smaller hospitals often lack specialists. They adapt SAM to 3D with a medical adapter, add MLAM to pull features across levels, and FSSM to combine encoder-decoder scales, then multiply the mask back onto the original scan to feed classification.

Releasing the data is the part that actually helps the field. The rest is mostly stitching together established pieces (3D SAM extensions and Mamba blocks) for this specific obstetric task.

The problem is the performance story. The abstract says extensive experiments show significant improvement in PAS diagnosis, yet supplies zero metrics, no baselines, no dataset size, no error bars, and no protocol. You cannot tell whether the mask-multiplication step adds anything measurable or whether the whole thing beats simpler approaches. That leaves the central claim unevaluable from the text provided.

The work is aimed at medical imaging groups working on segmentation or obstetric applications who might want the dataset. It is not a methods breakthrough, but the data release and clinical framing make it worth sending out for review so the experiments and dataset quality can be checked properly.

Referee Report

1 major / 0 minor

Summary. The paper introduces the first public MRI-based dataset for Placenta Accreta Spectrum (PAS) diagnosis, containing both fine-grained segmentation and classification annotations. It proposes the 3DSAMba framework, which adapts a 3D Segment Anything Model (SAM) via an efficient adapter to incorporate medical domain knowledge, adds a Multi-Level Aggregation Mamba (MLAM) module to aggregate features across levels, and a Fusion State Space Model (FSSM) to fuse multi-scale encoder-decoder features. Segmentation masks are then multiplied element-wise with the input MRI volumes to isolate lesion regions for improved downstream PAS classification. The authors state that extensive experiments demonstrate significant performance gains and release both the dataset and source code.

Significance. If the claimed performance gains are substantiated with quantitative results, this work would be significant for medical image analysis in obstetrics: it supplies the first public MRI dataset for a rare, high-stakes condition and demonstrates a practical way to combine an adapted SAM encoder with state-space models for 3D volumetric segmentation. The explicit release of data and code is a clear strength that supports reproducibility and follow-on research.

major comments (1)

[Abstract] The central claim that 'Extensive experiments validate that our framework significantly improves the PAS diagnostic performance' is presented without any numerical results, baseline comparisons, dataset size, validation protocol, or statistical measures. Because the abstract supplies no evidence for the performance improvement that underpins the entire contribution, the claim cannot be evaluated from the provided text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the abstract. We address it point by point below.

Referee: [Abstract] The central claim that 'Extensive experiments validate that our framework significantly improves the PAS diagnostic performance' is presented without any numerical results, baseline comparisons, dataset size, validation protocol, or statistical measures. Because the abstract supplies no evidence for the performance improvement that underpins the entire contribution, the claim cannot be evaluated from the provided text.

Authors: We agree that the abstract should include quantitative evidence to support the central claim. In the revised manuscript we will update the abstract to report the dataset size, key segmentation metrics (e.g., Dice score), classification performance (e.g., accuracy or AUC), comparisons against baselines, and the validation protocol used. This change will allow readers to directly evaluate the reported improvements.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper describes a 3DSAMba framework (3D SAM adapter + MLAM + FSSM) whose outputs are used via element-wise multiplication on MRI inputs to improve PAS classification, with the improvement asserted via experiments on a released dataset. No equations, fitted parameters renamed as predictions, self-citations, or uniqueness theorems appear in the provided text that reduce any claimed result to a definition or input by construction. The argument is therefore self-contained and relies on external empirical validation rather than internal reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Central claim rests on the unverified effectiveness of the newly introduced adapter, MLAM, and FSSM components for 3D medical segmentation; no independent evidence or derivation for these modules is supplied in the abstract.

Reference graph

Works this paper leans on

74 extracted references · 2 canonical work pages

[1] E. Jauniaux, S. Collins, and G. J. Burton, “Placenta accreta spectrum: pathophysiology and evidence-based anatomy for prenatal ultrasound imaging,” American journal of obstetrics and gynecology, vol. 218, no. 1, pp. 75–87, 2018.

[2] B. Poljak, D. Khairudin, N. W. Jones, and A. K. Agten, “Placenta accreta spectrum: diagnosis and management,” Obstetrics, Gynaecology & Reproductive Medicine, vol. 33, no. 8, pp. 232–238, 2023.

[3] V. Romeo, C. Ricciardi, R. Cuocolo, A. Stanzione, F. Verde, L. Sarno, G. Improta, P. P. Mainenti, M. D’Armiento, A. Brunetti et al., “Machine learning analysis of mri-derived texture features to predict placenta accreta spectrum in patients with placenta previa,” Magnetic resonance imaging, vol. 64, pp. 71–76, 2019.

[4] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in MICCAI. Springer, 2015, pp. 234–241.

[5] Q. Dou, H. Chen, Y. Jin, L. Yu, J. Qin, and P.-A. Heng, “3d deeply supervised network for automatic liver segmentation from ct volumes,” in MICCAI, 2016, pp. 149–157.

[6] E. Gibson, F. Giganti, Y. Hu, E. Bonmati, S. Bandula, K. Gurusamy, B. Davidson, S. P. Pereira, M. J. Clarkson, and D. C. Barratt, “Automatic multi-organ segmentation on abdominal ct with dense v-networks,” TIP, vol. 37, no. 8, pp. 1822–1834, 2018.

[7] X. Li, H. Chen, X. Qi, Q. Dou, C.-W. Fu, and P.-A. Heng, “H-denseunet: hybrid densely connected unet for liver and tumor segmentation from ct volumes,” TMI, vol. 37, no. 12, pp. 2663–2674, 2018.

[8] L. Yu, X. Yang, H. Chen, J. Qin, and P. A. Heng, “Volumetric convnets with mixed residual connections for automated prostate segmentation from 3d mr images,” in AAAI, vol. 31, no. 1, 2017, pp. 66–72.

[9] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” NeurIPS, vol. 30, 2017.

[10] S. Khan, M. Naseer, M. Hayat, S. W. Zamir, F. S. Khan, and M. Shah, “Transformers in vision: A survey,” ACM computing surveys, vol. 54, no. 10s, pp. 1–41, 2022.
[11] A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo et al., “Segment anything,” in ICCV, 2023, pp. 4015–4026.

[12] A. Gu, K. Goel, and C. Re, “Efficiently modeling long sequences with structured state spaces,” in ICLR, 2022, pp. 1–32.

[13] A. Gu and T. Dao, “Mamba: Linear-time sequence modeling with selective state spaces,” arXiv, 2023.

[14] Y. Liu, Y. Tian, Y. Zhao, H. Yu, L. Xie, Y. Wang, Q. Ye, J. Jiao, and Y. Liu, “Vmamba: Visual state space model,” NeurIPS, vol. 37, pp. 103031–103063, 2024.

[15] A. C. of Obstetricians, Gynecologists et al., “Placenta accreta spectrum,” American journal of obstetrics and gynecology, vol. 219, no. 6, pp. B2–B16, 2018.

[16] R. M. Silver and D. W. Branch, “Placenta accreta spectrum,” NEJM, vol. 378, no. 16, pp. 1529–1536, 2018.

[17] A. Arakaza, L. Zou, and J. Zhu, “Placenta accreta spectrum diagnosis challenges and controversies in current obstetrics: a review,” International Journal of Women’s Health, pp. 635–654, 2023.

[18] H. E. Miller, S. A. Leonard, K. A. Fox, D. A. Carusi, and D. J. Lyell, “Placenta accreta spectrum among women with twin gestations,” Obstetrics & Gynecology, vol. 137, no. 1, pp. 132–138, 2021.

[19] W. C. Baughman, J. E. Corteville, and R. R. Shah, “Placenta accreta: spectrum of us and mr imaging findings,” Radiographics, vol. 28, no. 7, pp. 1905–1916, 2008.

[20] S. K. Happe, C. S. Yule, C. Y. Spong, C. E. Wells, J. S. Dashe, E. Moschos, M. W. Rac, D. D. McIntire, and D. M. Twickler, “Predicting placenta accreta spectrum: validation of the placenta accreta index,” Journal of Ultrasound in Medicine, vol. 40, no. 8, pp. 1523–1532, 2021.
[21] S. Srisajjakul, P. Prapaisilp, and S. Bangchokdee, “Magnetic resonance imaging of placenta accreta spectrum: a step-by-step approach,” Korean journal of radiology, vol. 22, no. 2, p. 198, 2020.

[22] H. Ishibashi, M. Miyamoto, H. Shinmoto, S. Soga, H. Matsuura, S. Kakimoto, H. Iwahashi, T. Sakamoto, T. Hada, R. Suzuki et al., “The use of magnetic resonance imaging to predict placenta previa with placenta accreta spectrum,” Acta Obstetricia et Gynecologica Scandinavica, vol. 99, no. 12, pp. 1657–1665, 2020.

[23] H. Kapoor, M. Hanaoka, A. Dawkins, and A. Khurana, “Review of mri imaging for placenta accreta spectrum: pathophysiologic insights, imaging signs, and recent developments,” Placenta, vol. 104, pp. 31–39, 2021.

[24] M. De Oliveira Carniello, L. Oliveira Brito, L. Sarian, and J. Bennini, “Diagnosis of placenta accreta spectrum in high-risk women using ultrasonography or magnetic resonance imaging: systematic review and meta-analysis,” Ultrasound in Obstetrics & Gynecology, vol. 59, no. 4, pp. 428–436, 2022.

[25] Q. N. Do, M. A. Lewis, Y. Xi, A. J. Madhuranthakam, S. K. Happe, J. S. Dashe, R. E. Lenkinski, A. Khan, and D. M. Twickler, “Mri of the placenta accreta spectrum (pas) disorder: radiomics analysis correlates with surgical and pathological outcome,” Journal of Magnetic Resonance Imaging, vol. 51, no. 3, pp. 936–946, 2020.

[26] Z. Zhou, M. M. Rahman Siddiquee, N. Tajbakhsh, and J. Liang, “Unet++: A nested u-net architecture for medical image segmentation,” in MICCAI. Springer, 2018, pp. 3–11.

[27] A. Mortazi and U. Bagci, “Automatically designing cnn architectures for medical image segmentation,” in MLMI. Springer, 2018, pp. 98–106.

[28] L. Chen, P. Bentley, K. Mori, K. Misawa, M. Fujiwara, and D. Rueckert, “Drinet for medical image segmentation,” TIP, vol. 37, no. 11, pp. 2453–2462, 2018.

[29] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, “Encoder-decoder with atrous separable convolution for semantic image segmentation,” in ECCV, 2018, pp. 801–818.

[30] D. Jha, P. H. Smedsrud, M. A. Riegler, D. Johansen, T. De Lange, P. Halvorsen, and H. D. Johansen, “Resunet++: An advanced architecture for medical image segmentation,” in ISM. IEEE, 2019, pp. 225–2255.
[31] Z. Gu, J. Cheng, H. Fu, K. Zhou, H. Hao, Y. Zhao, T. Zhang, S. Gao, and J. Liu, “Ce-net: Context encoder network for 2d medical image segmentation,” TIP, vol. 38, no. 10, pp. 2281–2292, 2019.

[32] D. Jha, M. A. Riegler, D. Johansen, P. Halvorsen, and H. D. Johansen, “Doubleu-net: A deep convolutional neural network for medical image segmentation,” in ISCMS. IEEE, 2020, pp. 558–564.

[33] M. Baldeon-Calisto and S. K. Lai-Yuen, “Adaresu-net: Multiobjective adaptive convolutional neural network for medical image segmentation,” Neurocomputing, vol. 392, pp. 325–340, 2020.

[34] A. A. Adegun, S. Viriri, and R. O. Ogundokun, “Deep learning approach for medical image analysis,” Computational Intelligence and Neuroscience, vol. 2021, no. 1, p. 6215281, 2021.

[35] J. M. J. Valanarasu and V. M. Patel, “Unext: Mlp-based rapid medical image segmentation network,” in MICCAI. Springer, 2022, pp. 23–33.

[36] D. Alexey, “An image is worth 16x16 words: Transformers for image recognition at scale,” ICLR, pp. 1–22, 2021.

[37] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in CVPR, 2021, pp. 10012–10022.

[38] Y. Xie, J. Zhang, C. Shen, and Y. Xia, “Cotr: Efficiently bridging cnn and transformer for 3d medical image segmentation,” in MICCAI. Springer, 2021, pp. 171–180.

[39] J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, “Transunet: Transformers make strong encoders for medical image segmentation,” arXiv, 2021.

[40] A. Hatamizadeh, Y. Tang, V. Nath, D. Yang, A. Myronenko, B. Landman, H. R. Roth, and D. Xu, “Unetr: Transformers for 3d medical image segmentation,” in WACV, 2022, pp. 574–584.
[41] A. Hatamizadeh, V. Nath, Y. Tang, D. Yang, H. R. Roth, and D. Xu, “Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images,” in MICCAI. Springer, 2021, pp. 272–284.

[42] C. Chen, J. Miao, D. Wu, A. Zhong, Z. Yan, S. Kim, J. Hu, Z. Liu, L. Sun, X. Li et al., “Ma-sam: Modality-agnostic sam adaptation for 3d medical image segmentation,” MIA, vol. 98, p. 103310, 2024.

[43] S. Gong, Y. Zhong, W. Ma, J. Li, Z. Wang, J. Zhang, P.-A. Heng, and Q. Dou, “3dsam-adapter: Holistic adaptation of sam from 2d to 3d for promptable tumor segmentation,” MIA, vol. 98, p. 103324, 2024.

[44] Y. Zhang, T. Zhou, S. Wang, P. Liang, Y. Zhang, and D. Z. Chen, “Input augmentation with sam: Boosting medical image segmentation with segmentation foundation model,” in MICCAI. Springer, 2023, pp. 129–139.

[45] N.-T. Bui, D.-H. Hoang, M.-T. Tran, G. Doretto, D. Adjeroh, B. Patel, A. Choudhary, and N. Le, “Sam3d: Segment anything model in volumetric medical images,” in ISBI. IEEE, 2024, pp. 1–4.

[46] C. Li, P. Khanduri, Y. Qiang, R. I. Sultan, I. Chetty, and D. Zhu, “Auto-prompting sam for mobile friendly 3d medical image segmentation,” arXiv, 2023.

[47] J. Wang, W. Zhu, P. Wang, X. Yu, L. Liu, M. Omar, and R. Hamid, “Selective structured state-spaces for long-form video understanding,” in CVPR, 2023, pp. 6387–6397.

[48] J. Ma, F. Li, and B. Wang, “U-mamba: Enhancing long-range dependency for biomedical image segmentation,” arXiv, 2024.

[49] Z. Wang, J.-Q. Zheng, Y. Zhang, G. Cui, and L. Li, “Mamba-unet: Unet-like pure visual mamba for medical image segmentation,” arXiv preprint arXiv:2402.05079, 2024.

[50] J. Wang, J. Chen, D. Chen, and J. Wu, “Lkm-unet: Large kernel vision mamba unet for medical image segmentation,” in MICCAI. Springer, 2024, pp. 360–370.
[51] M. Zhang, Y. Yu, S. Jin, L. Gu, T. Ling, and X. Tao, “Vm-unet-v2: rethinking vision mamba unet for medical image segmentation,” in ISBRA. Springer, 2024, pp. 335–346.

[52] Z. Xing, T. Ye, Y. Yang, G. Liu, and L. Zhu, “Segmamba: Long-range sequential modeling mamba for 3d medical image segmentation,” in MICCAI. Springer, 2024, pp. 578–588.

[53] J. Liu, H. Yang, H.-Y. Zhou, L. Yu, Y. Liang, Y. Yu, S. Zhang, H. Zheng, and S. Wang, “Swin-umamba†: Adapting mamba-based vision foundation models for medical image segmentation,” TIP, pp. 1–1, 2024.

[54] R. Wu, Y. Liu, P. Liang, and Q. Chang, “H-vmunet: High-order vision mamba unet for medical image segmentation,” Neurocomputing, p. 129447, 2025.

[55] J. Shi, T. You, P. Zhang, H. Zhang, R. Xu, and H. Li, “Frequency-enhanced multi-granularity context network for efficient vertebrae segmentation,” in MICCAI, 2025, pp. 206–216.

[56] C. Wang et al., “A comprehensive analysis of mamba for 3d volumetric medical image segmentation,” Pattern Recognition, 2026.

[57] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in ICML. PMLR, 2015, pp. 448–456.

[58] Y. Li and Y. Yuan, “Convergence analysis of two-layer neural networks with relu activation,” in NeurIPS, 2017, pp. 597–607.

[59] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen et al., “Lora: Low-rank adaptation of large language models,” ICLR, vol. 1, no. 2, p. 3, 2022.

[60] P. Zhang, T. Yan, Y. Liu, and H. Lu, “Fantastic animals and where to find them: Segment any marine animal with dual sam,” in CVPR, 2024, pp. 2578–2587.
[61]

Layer normalization,

J. L. Ba, J. R. Kiros, and G. E. Hinton, “Layer normalization,”STAT, vol. 1050, p. 21, 2016

2016
[62]

Multilayer perceptrons,

L. B. Almeida, “Multilayer perceptrons,” inHandbook of Neural Com- putation. CRC Press, 2020, pp. C1–2

2020
[63]

Activation functions: comparison of trends in practice and research for deep learning,

C. E. Nwankpa, W. Ijomah, A. Gachagan, and S. Marshall, “Activation functions: comparison of trends in practice and research for deep learning,” in ICCST, 2021, pp. 124–133

2021
[64]

Depth-wise separable convolutions and multi-level pooling for an efficient spatial cnn-based steganalysis,

R. Zhang, F. Zhu, J. Liu, and G. Liu, “Depth-wise separable convolutions and multi-level pooling for an efficient spatial cnn-based steganalysis,” TIFS, vol. 15, pp. 1138–1150, 2019

2019
[65]

Sigmoid activation function in selecting the best model of artificial neural networks,

H. Pratiwi, A. P. Windarto, S. Susliansyah, R. R. Aria, S. Susilowati, L. K. Rahayu, Y. Fitriani, A. Merdekawati, and I. R. Rahadjeng, “Sigmoid activation function in selecting the best model of artificial neural networks,” in Journal of Physics: Conference Series, vol. 1471, no. 1. IOP Publishing, 2020, p. 012010

2020
[66]

The state of the art in kidney and kidney tumor segmentation in contrast-enhanced ct imaging: Results of the kits19 challenge,

N. Heller, F. Isensee, K. H. Maier-Hein, X. Hou, C. Xie, F. Li, Y. Nan, G. Mu, Z. Lin, M. Han et al., “The state of the art in kidney and kidney tumor segmentation in contrast-enhanced ct imaging: Results of the kits19 challenge,” MIA, vol. 67, p. 101821, 2021

2021
[67]

Medical image segmentation review: The success of u-net,

R. Azad, E. K. Aghdam, A. Rauland, Y. Jia, A. H. Avval, A. Bozorgpour, S. Karimijafarbigloo, J. P. Cohen, E. Adeli, and D. Merhof, “Medical image segmentation review: The success of u-net,” TPAMI, pp. 10 076–10 095, 2024

2024
[68]

Adam: A method for stochastic optimization,

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv, 2014

2014
[69]

3d u-net: learning dense volumetric segmentation from sparse annotation,

Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, “3d u-net: learning dense volumetric segmentation from sparse annotation,” in MICCAI. Springer, 2016, pp. 424–432

2016
[70]

nnu-net: a self-configuring method for deep learning-based biomedical image segmentation,

F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein, “nnu-net: a self-configuring method for deep learning-based biomedical image segmentation,” Nature methods, vol. 18, no. 2, pp. 203–211, 2021

2021
[71]

Normalnet: A voxel-based cnn for 3d object classification and retrieval,

C. Wang, M. Cheng, F. Sohel, M. Bennamoun, and J. Li, “Normalnet: A voxel-based cnn for 3d object classification and retrieval,” Neurocomputing, vol. 323, pp. 139–147, 2019

2019
[72]

Transformer-based factorized encoder for classification of pneumoconiosis on 3d ct images,

Y. Huang, Y. Si, B. Hu, Y. Zhang, S. Wu, D. Wu, and Q. Wang, “Transformer-based factorized encoder for classification of pneumoconiosis on 3d ct images,” CBM, vol. 150, p. 106137, 2022

2022
[73]

Video swin transformer,

Z. Liu, J. Ning, Y. Cao, Y. Wei, Z. Zhang, S. Lin, and H. Hu, “Video swin transformer,” in CVPR, 2022, pp. 3202–3211

2022
[74]

Medmamba: Vision mamba for medical image classification,

Y. Yue and Z. Li, “Medmamba: Vision mamba for medical image classification,” arXiv preprint arXiv:2403.03849, 2024

2024
