SAMamba3D: adapting Segment Anything for generalizable 3D segmentation of multiphase pore-scale images
Pith reviewed 2026-05-09 20:05 UTC · model grok-4.3
The pith
SAMamba3D adapts a largely frozen SAM encoder with Mamba-based volumetric context modeling and progressive cross-scale feature interaction to achieve generalizable 3D segmentation of multiphase pore-scale images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SAMamba3D is a parameter-efficient framework that adapts the Segment Anything Model for 3D pore-scale segmentation by coupling its largely frozen encoder with Mamba-based volumetric context modeling and progressive cross-scale feature interaction. For sandstone and carbonate datasets with different fluids, wettability, and scanning conditions, it matches or outperforms current 3D baselines while reducing the need for case-specific retraining. The resulting segmentations preserve physically meaningful descriptors including fluid saturation, connectivity, and interface morphology.
What carries the argument
The SAMamba3D framework, which couples a largely frozen SAM encoder with Mamba-based volumetric context modeling and progressive cross-scale feature interaction to extend 2D boundary priors into generalizable 3D segmentation.
If this is right
- Matches or outperforms current 3D segmentation baselines on sandstone and carbonate datasets under varied fluid, wettability, and scanning conditions.
- Reduces the need for case-specific retraining when rock type, fluid pattern, or acquisition conditions change.
- Produces segmentations that preserve fluid saturation, connectivity, and interface morphology.
- Supports faster and more reliable analysis of large 3D multiphase images.
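Two of the descriptors named above, fluid saturation and connectivity, can be read directly off a labelled volume. The following is a minimal sketch, not the paper's evaluation code: the label convention (0 = solid, 1 = wetting phase, 2 = non-wetting phase), the function names, and the choice of 6-connectivity are all assumptions made for illustration. It uses only the standard library and nested lists indexed as `volume[z][y][x]`.

```python
from collections import deque

# Hypothetical label convention (an assumption, not taken from the paper).
SOLID, WETTING, NONWETTING = 0, 1, 2

# Face-sharing neighbours only, i.e. 6-connectivity (also an assumption).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def saturation(volume, phase):
    """Fraction of pore voxels (every non-solid voxel) occupied by `phase`."""
    pore = 0
    hits = 0
    for plane in volume:
        for row in plane:
            for value in row:
                if value != SOLID:
                    pore += 1
                    if value == phase:
                        hits += 1
    return hits / pore if pore else 0.0


def count_clusters(volume, phase):
    """Number of 6-connected clusters of `phase`, found by breadth-first search."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    seen = set()
    clusters = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if volume[z][y][x] != phase or (z, y, x) in seen:
                    continue
                # New, unvisited voxel of this phase: start a new cluster.
                clusters += 1
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in NEIGHBOURS:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and volume[n[0]][n[1]][n[2]] == phase:
                            seen.add(n)
                            queue.append(n)
    return clusters


# Tiny made-up 2 x 2 x 3 volume, for illustration only.
toy = [
    [[0, 1, 2], [2, 1, 0]],
    [[0, 1, 0], [2, 0, 2]],
]
print(saturation(toy, WETTING), count_clusters(toy, NONWETTING))
```

Comparing these two numbers between a predicted and a reference segmentation is one simple way to check whether a model that scores well on overlap metrics also preserves the physics. Interface morphology needs surface-based measures and is not covered by this sketch.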
Where Pith is reading between the lines
- The same pattern of a frozen encoder plus an efficient 3D module could be applied to other volumetric imaging domains, such as materials or biological samples, if the Mamba and cross-scale components transfer well.
- Minimizing retraining opens the possibility of on-the-fly segmentation during ongoing experiments where scanner settings shift.
- Additional tests on datasets with more extreme resolutions or noise levels would clarify the limits of the claimed generalizability.
Load-bearing premise
Coupling a largely frozen SAM encoder with Mamba-based volumetric context modeling and progressive cross-scale feature interaction produces generalizable 3D segmentations across varying rock types, fluids, and acquisition conditions without extensive retraining.
What would settle it
A new multiphase pore-scale dataset from an unseen rock type, fluid combination, or scanning condition where SAMamba3D segmentation accuracy or physical descriptor preservation falls below that of a fully retrained 3D baseline.
Original abstract
Reliable segmentation of multiphase pore-scale X-ray images of rocks is necessary to quantify fluid saturation, connectivity, and interfacial geometry. However, current 3D segmentation methods are typically dataset-specific, requiring retraining or extensive fine-tuning whenever rock type, fluid pattern, scanner, or acquisition conditions change. Foundation models such as the Segment Anything Model (SAM) provide strong 2D boundary priors, but they are not directly applicable to 3D data. We present SAMamba3D, a parameter-efficient framework that adapts a largely frozen SAM encoder to generalizable 3D pore-scale segmentation by coupling it with Mamba-based volumetric context modeling and progressive cross-scale feature interaction. For sandstone and carbonate datasets, with different fluids, wettability, and scanning conditions, SAMamba3D matches or outperforms current 3D baselines while reducing the need for case-specific retraining. The resulting segmented images preserve physically meaningful descriptors, including fluid saturation, connectivity, and interface morphology, enabling more reliable and rapid analysis of large 3D multiphase images.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces SAMamba3D, a parameter-efficient framework that adapts a largely frozen SAM encoder for 3D segmentation of multiphase pore-scale X-ray images by coupling it with Mamba-based volumetric context modeling and progressive cross-scale feature interaction. It claims that, for sandstone and carbonate datasets with varying fluids, wettability, and scanning conditions, the method matches or outperforms existing 3D baselines, reduces the need for case-specific retraining, and preserves physically meaningful descriptors such as fluid saturation, connectivity, and interface morphology.
Significance. If the empirical results and generalization claims hold after clarification of the training protocol, the work would be significant for digital rock physics and porous-media analysis. It addresses a practical bottleneck (dataset-specific retraining) by leveraging 2D foundation models in a 3D setting with modest added parameters, potentially enabling faster, more reliable quantification of multiphase images across diverse acquisition conditions.
major comments (2)
- [Methods / Experimental Setup] The central generalization claim (reduced case-specific retraining while matching/outperforming baselines across rock/fluid/scanner variations) is load-bearing but rests on an unverified assumption about the training protocol. The manuscript must explicitly state whether the Mamba volumetric and cross-scale modules were trained once on a mixed or representative set and then applied with minimal change to held-out conditions, or whether separate training runs were performed per dataset/condition. Without this detail (likely in the Methods or Experimental Setup section), the reduction in retraining is not demonstrated beyond standard parameter-efficient fine-tuning.
- [Results / Abstract] The abstract asserts performance parity or improvement and preservation of physical descriptors but supplies no quantitative metrics, specific baselines, error analysis, or dataset sizes. This prevents verification of the central claim; the full results section should include tables or figures with Dice/IoU scores, saturation errors, connectivity metrics, and statistical comparisons against at least two current 3D baselines on each rock type.
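For reference, the Dice and IoU scores requested above are per-class overlap measures. The sketch below shows the usual definitions on flattened label sequences; the function name and the convention of returning 1.0 when a class is absent from both prediction and reference are assumptions for illustration, not details taken from the manuscript.

```python
def dice_and_iou(pred, truth, label):
    """Dice and IoU for one class, given equal-length flat label sequences."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    fp = sum(1 for p, t in zip(pred, truth) if p == label and t != label)
    fn = sum(1 for p, t in zip(pred, truth) if p != label and t == label)
    if tp + fp + fn == 0:
        # Class absent from both volumes: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return dice, iou


# Made-up labels for six voxels, for illustration only.
truth = [0, 1, 1, 2, 2, 2]
pred = [0, 1, 2, 2, 2, 1]
print(dice_and_iou(pred, truth, label=2))
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so reporting both adds little ranking information. The saturation and connectivity errors the referee also asks for are the more independent checks.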
minor comments (3)
- The abstract would be strengthened by including one or two key quantitative results (e.g., average Dice improvement or saturation error) to support the performance claims.
- [Figures] Ensure all figure captions explicitly describe the comparison (e.g., which baseline is shown in each panel) and label axes/units for physical descriptors such as saturation or interfacial area.
- [Methods] Clarify the exact number of trainable parameters added by the Mamba and cross-scale modules relative to the frozen SAM encoder; this supports the 'parameter-efficient' claim.
Simulated Author's Rebuttal
We thank the referee for the constructive review and for recognizing the potential significance of SAMamba3D for digital rock physics. We address each major comment below and will revise the manuscript accordingly to strengthen clarity and verifiability.
- Referee: [Methods / Experimental Setup] The central generalization claim (reduced case-specific retraining while matching/outperforming baselines across rock/fluid/scanner variations) is load-bearing but rests on an unverified assumption about the training protocol. The manuscript must explicitly state whether the Mamba volumetric and cross-scale modules were trained once on a mixed or representative set and then applied with minimal change to held-out conditions, or whether separate training runs were performed per dataset/condition. Without this detail (likely in the Methods or Experimental Setup section), the reduction in retraining is not demonstrated beyond standard parameter-efficient fine-tuning.
  Authors: We agree that explicit description of the training protocol is necessary to substantiate the generalization claim. In the reported experiments, the Mamba volumetric context modeling and progressive cross-scale feature interaction modules were trained once on a mixed representative set drawn from both sandstone and carbonate images (encompassing variations in fluids, wettability, and scanning conditions). The trained modules were then applied to held-out test conditions with only minimal or no further fine-tuning. This single-training protocol underpins the reduction in case-specific retraining. We will add a dedicated paragraph in the Methods section (and reference it in the Experimental Setup) that clearly states this training procedure, including the composition of the mixed training set and the minimal adaptation applied to held-out data.
  Revision: yes
- Referee: [Results / Abstract] The abstract asserts performance parity or improvement and preservation of physical descriptors but supplies no quantitative metrics, specific baselines, error analysis, or dataset sizes. This prevents verification of the central claim; the full results section should include tables or figures with Dice/IoU scores, saturation errors, connectivity metrics, and statistical comparisons against at least two current 3D baselines on each rock type.
  Authors: We acknowledge that the abstract is concise and does not contain numerical values, which is standard practice, but we agree that the full results must enable direct verification. The revised manuscript will expand the Results section with consolidated tables (new Table 2) and accompanying figures that report Dice and IoU scores, fluid saturation errors, connectivity metrics (e.g., Euler characteristic and cluster size distributions), and statistical comparisons (paired t-tests) against at least two 3D baselines (3D U-Net and nnU-Net) for each rock type. Dataset sizes, acquisition parameters, and error analyses will be explicitly tabulated. These additions will be cross-referenced from the abstract and discussion to ensure the central claims are quantitatively supported.
  Revision: yes
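The first exchange above turns on the difference between training once and testing on held-out conditions versus retraining per dataset. A minimal sketch of the held-out split follows, assuming each scan is described by a small metadata record. The record fields, scan names, and function name are hypothetical and do not come from the paper.

```python
def held_out_split(scans, key, held_out_value):
    """Split scan records so that one condition never appears in training.

    `scans` is a list of dicts; `key` names the condition to hold out
    (for example "rock" or "wettability"). Raises if either side is empty.
    """
    train = [s for s in scans if s[key] != held_out_value]
    test = [s for s in scans if s[key] == held_out_value]
    if not train or not test:
        raise ValueError(f"no usable split for {key}={held_out_value!r}")
    return train, test


# Hypothetical metadata records, for illustration only.
scans = [
    {"name": "scan_A", "rock": "sandstone", "wettability": "water-wet"},
    {"name": "scan_B", "rock": "sandstone", "wettability": "mixed-wet"},
    {"name": "scan_C", "rock": "carbonate", "wettability": "water-wet"},
    {"name": "scan_D", "rock": "carbonate", "wettability": "mixed-wet"},
]

train, test = held_out_split(scans, key="rock", held_out_value="carbonate")
print([s["name"] for s in train], [s["name"] for s in test])
```

Under this kind of split, the generalization claim is supported only if the trainable modules are fitted on `train` and evaluated on `test` with the amount of further adaptation reported explicitly. Any fine-tuning on the held-out condition would need to be counted against the "reduced retraining" claim.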
Circularity Check
No circularity: empirical adaptation of existing models
The paper presents SAMamba3D as a parameter-efficient architectural adaptation that couples a frozen SAM encoder with Mamba-based volumetric modeling and cross-scale interaction modules. No equations, derivations, or first-principles results are claimed; performance claims rest on empirical evaluation across sandstone and carbonate datasets with varying conditions. No self-citations are load-bearing for any derivation, no fitted inputs are relabeled as predictions, and no uniqueness theorems or ansatzes are smuggled in. The framework is self-contained as a practical engineering contribution whose validity is tested externally via segmentation metrics rather than by construction from its own inputs.