Balancing Efficiency and Restoration: Lightweight Mamba-Based Model for CT Metal Artifact Reduction
Pith reviewed 2026-05-10 18:43 UTC · model grok-4.3
The pith
A lightweight Mamba-based UNet reduces metal artifacts in CT images while preserving tissue structures and using few resources.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MARMamba is a streamlined UNet architecture that incorporates multi-scale Mamba as its core module. Within MS-Mamba, a flip mamba block captures comprehensive contextual information by analyzing images from multiple orientations. The average maximum feed-forward network then integrates critical features with average features to suppress the artifacts. This combination eliminates metal artifacts of different sizes from standard CT images alone, without sinogram data, while keeping original anatomical structures intact.
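The flip-scan idea can be sketched in plain Python. This is an illustrative assumption about the mechanism, not the paper's exact implementation: the block is read here as serializing the image along four flip orientations so a sequence model (e.g. Mamba) sees each pixel in several contexts. The name `flip_scans` is hypothetical.

```python
def flip_scans(img):
    """Serialize a 2D image (list of rows) into four 1-D scan orders:
    original, left-right flipped, top-bottom flipped, and both.
    A sequence model would process each order; outputs would then be
    un-flipped and fused to form multi-orientation context."""
    h_flip = [row[::-1] for row in img]           # left-right flip
    v_flip = img[::-1]                            # top-bottom flip
    hv_flip = [row[::-1] for row in img[::-1]]    # 180-degree rotation
    # flatten each orientation row-major into a scan sequence
    return [sum(o, []) for o in (img, h_flip, v_flip, hv_flip)]

scans = flip_scans([[1, 2],
                    [3, 4]])
# four orderings of the same four pixels
```

Under this reading, a pixel that is late in one scan order is early in another, which is how a causal sequence model can accumulate context from all directions.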
What carries the argument
The multi-scale Mamba (MS-Mamba) module, which uses a flip mamba block to gather multi-orientation context and an average maximum feed-forward network to combine features for artifact suppression.
Load-bearing premise
The flip mamba block combined with the average maximum feed-forward network inside MS-Mamba gathers enough contextual information to suppress artifacts without damaging organ and tissue structures.
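One plausible reading of the "average maximum feed-forward network" is a fusion of average-pooled statistics (smooth, artifact-suppressing) with max-pooled statistics (salient, structure-preserving). The sketch below is an assumption about that mechanism, not the paper's design; `avg_max_fuse` and the fixed mixing weight `alpha` are hypothetical stand-ins for what the real network would learn.

```python
def avg_max_fuse(features, alpha=0.5):
    """Fuse per-channel average and maximum statistics of a feature map.
    `features`: list of channels, each a flat list of activations.
    `alpha`: weight on the average (smoothing) branch vs. the max
    (salient-feature) branch; a learned gate in a real network."""
    fused = []
    for ch in features:
        avg = sum(ch) / len(ch)   # average feature: suppresses outliers/artifacts
        mx = max(ch)              # critical feature: keeps strong structure cues
        fused.append(alpha * avg + (1 - alpha) * mx)
    return fused

# one channel [1, 2, 3, 4]: avg = 2.5, max = 4 -> 0.5*2.5 + 0.5*4 = 3.25
avg_max_fuse([[1, 2, 3, 4]])
```

The point of the premise is that this kind of fusion must suppress artifact responses (outlier activations) without flattening genuine anatomical edges, which is exactly what the referee's structural-fidelity tests would probe.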
What would settle it
Running the model on a new CT dataset with varied metal implants and finding blurred or distorted anatomical structures in the output compared to ground truth would disprove the central claim.
Figures
original abstract
In computed tomography imaging, metal implants frequently generate severe artifacts that compromise image quality and hinder diagnostic accuracy. There are three main challenges in the existing methods: the deterioration of organ and tissue structures, dependence on sinogram data, and an imbalance between resource use and restoration efficiency. Addressing these issues, we introduce MARMamba, which effectively eliminates artifacts caused by metals of different sizes while maintaining the integrity of the original anatomical structures of the image. Furthermore, this model only focuses on CT images affected by metal artifacts, thus negating the requirement for additional input data. The model is a streamlined UNet architecture, which incorporates multi-scale Mamba (MS-Mamba) as its core module. Within MS-Mamba, a flip mamba block captures comprehensive contextual information by analyzing images from multiple orientations. Subsequently, the average maximum feed-forward network integrates critical features with average features to suppress the artifacts. This combination allows MARMamba to reduce artifacts efficiently. The experimental results demonstrate that our model excels in reducing metal artifacts, offering distinct advantages over other models. It also strikes an optimal balance between computational demands, memory usage, and the number of parameters, highlighting its practical utility in the real world. The code of the presented model is available at: https://github.com/RICKand-MORTY/MARMamba.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MARMamba, a lightweight UNet architecture incorporating multi-scale Mamba (MS-Mamba) modules. Within MS-Mamba, a flip mamba block captures multi-orientation contextual information and an average maximum feed-forward network integrates features to suppress metal artifacts in CT images. The model claims to eliminate artifacts from metals of varying sizes while preserving anatomical structures, to operate solely on artifact-affected images without sinogram data, and to achieve an optimal balance between restoration quality and computational efficiency (memory usage, parameter count, computational demands). Experimental results are asserted to demonstrate superiority over other models.
Significance. If the claims are substantiated with proper controls, this work could advance practical metal artifact reduction in clinical CT by providing an image-only, lightweight Mamba-based alternative that avoids common structural degradation and high resource costs of prior CNN or sinogram-dependent methods. Public code release aids reproducibility.
major comments (2)
- [Experimental Results] Experimental Results section: only aggregate PSNR/SSIM metrics on artifacted images are referenced, with no isolated tests on clean CT volumes or non-artifact regions to measure introduced HU changes, edge preservation, or structural fidelity. This validation is required to support the load-bearing claim that the flip mamba block plus average-max FFN in MS-Mamba selectively suppresses artifacts without deteriorating organ/tissue structures.
- [Abstract] Abstract and Methods: the assertion of 'distinct advantages' and 'optimal balance' between computational demands, memory, and parameters lacks any reported quantitative values, baseline comparisons, or ablation studies on the MS-Mamba components, preventing verification of the efficiency and performance claims.
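The validation requested in the first major comment could be operationalized as a masked comparison: compute mean HU deviation and PSNR only over non-artifact (or clean-volume) voxels. A minimal sketch in plain Python, with `mask` marking the region to score; the function name and the 4096 HU `data_range` are illustrative assumptions, not the paper's protocol:

```python
import math

def masked_hu_psnr(pred, ref, mask, data_range=4096.0):
    """Mean absolute HU deviation and PSNR restricted to mask==1 voxels.
    pred/ref/mask are flat lists of equal length; data_range is the
    assumed CT dynamic range in HU."""
    vals = [(p, r) for p, r, m in zip(pred, ref, mask) if m]
    mae = sum(abs(p - r) for p, r in vals) / len(vals)
    mse = sum((p - r) ** 2 for p, r in vals) / len(vals)
    psnr = float("inf") if mse == 0 else 10 * math.log10(data_range ** 2 / mse)
    return mae, psnr

# score only the first two voxels; the third (artifact region) is masked out
mae, psnr = masked_hu_psnr([0, 10, 100], [0, 12, 999], [1, 1, 0])
```

A near-zero MAE on clean volumes, alongside the aggregate PSNR/SSIM already reported, would directly support the claim that artifact suppression does not alter healthy anatomy.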
minor comments (1)
- [Abstract] The abstract would benefit from including at least one key numerical result (e.g., PSNR improvement or parameter count) to make the performance claims concrete.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. These suggestions highlight areas where additional validation and quantification can strengthen the presentation of our results. We address each major comment below and indicate the revisions we will make.
point-by-point responses
-
Referee: [Experimental Results] Experimental Results section: only aggregate PSNR/SSIM metrics on artifacted images are referenced, with no isolated tests on clean CT volumes or non-artifact regions to measure introduced HU changes, edge preservation, or structural fidelity. This validation is required to support the load-bearing claim that the flip mamba block plus average-max FFN in MS-Mamba selectively suppresses artifacts without deteriorating organ/tissue structures.
Authors: We agree that explicit validation on clean CT volumes is valuable to directly quantify any potential HU deviations or structural changes in non-artifact regions. Our current evaluation follows standard metal artifact reduction benchmarks that focus on artifacted images, where PSNR/SSIM improvements and visual comparisons already indicate preservation of anatomy. To address this point rigorously, we will add experiments applying the model to clean volumes and reporting HU statistics, edge preservation metrics, and structural similarity in non-artifact areas. These results will be incorporated into the revised Experimental Results section. revision: yes
-
Referee: [Abstract] Abstract and Methods: the assertion of 'distinct advantages' and 'optimal balance' between computational demands, memory, and parameters lacks any reported quantitative values, baseline comparisons, or ablation studies on the MS-Mamba components, preventing verification of the efficiency and performance claims.
Authors: We acknowledge that the Abstract would benefit from explicit quantitative support for the efficiency claims. The full manuscript contains comparative results against baselines in the Experimental Results section, but we agree that highlighting specific numbers (parameters, memory footprint, inference time) and component ablations directly in the Abstract and Methods would improve clarity. We will revise the Abstract to include key quantitative values and add a concise ablation study on the flip mamba block and average-max FFN within the Methods or Experiments section. revision: yes
Circularity Check
No circularity: empirical DL model with experimental validation
full rationale
The paper introduces MARMamba, a UNet-style architecture with MS-Mamba blocks (flip Mamba + average-max FFN) for CT metal artifact reduction. All central claims rest on training the model on artifacted CT images and reporting aggregate metrics (PSNR/SSIM) plus efficiency numbers against baselines. No derivation chains, fitted parameters renamed as predictions, load-bearing self-cited uniqueness theorems, or ansatz smuggling appear; the architectural choices are presented as design decisions validated by results rather than by self-referential equations. This is the standard non-circular pattern for empirical computer-vision papers.
Axiom & Free-Parameter Ledger
free parameters (1)
- network weights and training hyperparameters
invented entities (3)
- MS-Mamba module: no independent evidence
- flip mamba block: no independent evidence
- average maximum feed-forward network: no independent evidence
Reference graph
Works this paper leans on
-
[1]
InDuDoNet: An interpretable dual domain network for CT metal artifact reduction,
H. Wang, Y. Li, H. Zhang, J. Chen, K. Ma, D. Meng, and Y. Zheng, “InDuDoNet: An interpretable dual domain network for CT metal artifact reduction,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention (MICCAI), Sep. 2021, pp. 107–118
2021
-
[2]
Advances in metal artifact reduction in ct images: A review of traditional and novel metal artifact reduction techniques,
M. Selles, J. A. van Osch, M. Maas, M. F. Boomsma, and R. H. Wellenberg, “Advances in metal artifact reduction in ct images: A review of traditional and novel metal artifact reduction techniques,” Eur. J. Radiol., vol. 170, p. 111276, Jan. 2024
2024
-
[3]
Advancements in supervised deep learning for metal artifact reduction in computed tomography: A systematic review,
C. E. Kleber, R. Karius, L. E. Naessens, C. O. Van Toledo, J. A. C. van Osch, M. F. Boomsma, J. W. Heemskerk, and A. J. van der Molen, “Advancements in supervised deep learning for metal artifact reduction in computed tomography: A systematic review,” Eur. J. Radiol., vol. 181, p. 111732, Dec. 2024
2024
-
[4]
Reduction of ct artifacts caused by metallic implants
W. A. Kalender, R. Hebel, and J. Ebersberger, “Reduction of ct artifacts caused by metallic implants,” Radiol., vol. 164, no. 2, pp. 576–577, Aug. 1987
1987
-
[5]
Normalized metal artifact reduction (NMAR) in computed tomography,
E. Meyer, R. Raupach, M. Lell, B. Schmidt, and M. Kachelrieß, “Normalized metal artifact reduction (NMAR) in computed tomography,” Med. Phys., vol. 37, no. 10, pp. 5482–5493, Sep. 2010
2010
-
[6]
InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images,
H. Wang, Y. Li, H. Zhang, D. Meng, and Y. Zheng, “InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images,” Med. Image Anal., vol. 85, p. 102729, Apr. 2023
2023
-
[7]
MEPNet: A model-driven equivariant proximal network for joint sparse-view reconstruction and metal artifact reduction in CT images,
H. Wang, M. Zhou, D. Wei, Y. Li, and Y. Zheng, “MEPNet: A model-driven equivariant proximal network for joint sparse-view reconstruction and metal artifact reduction in CT images,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention (MICCAI), Oct. 2023, pp. 109–120
2023
-
[8]
Idol-net: An interactive dual-domain parallel network for ct metal artifact reduction,
T. Wang, Z. Lu, Z. Yang, W. Xia, M. Hou, H. Sun, Y. Liu, H. Chen, J. Zhou, and Y. Zhang, “Idol-net: An interactive dual-domain parallel network for ct metal artifact reduction,” IEEE Trans. Radiat. Plasma Med. Sci., vol. 6, no. 8, pp. 874–885, 2022
2022
-
[9]
Irdnet: Iterative relation-based dual-domain network via metal artifact feature guidance for ct metal artifact reduction,
H. Wang, S. Yang, X. Bai, Z. Wang, J. Wu, Y. Lv, and G. Cao, “Irdnet: Iterative relation-based dual-domain network via metal artifact feature guidance for ct metal artifact reduction,” IEEE Trans. Radiat. Plasma Med. Sci., vol. 8, no. 8, pp. 959–972, 2024
2024
-
[10]
Deep learning based projection domain metal segmentation for metal artifact reduction in cone beam computed tomography,
“Deep learning based projection domain metal segmentation for metal artifact reduction in cone beam computed tomography,” IEEE Access, vol. 11, pp. 100371–100382, Sep. 2023
2023
-
[11]
Adaptive convolutional dictionary network for CT metal artifact reduction,
H. Wang, Y. Li, D. Meng, and Y. Zheng, “Adaptive convolutional dictionary network for CT metal artifact reduction,” in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), Jul. 2022, pp. 1401–1407
2022
-
[12]
Orientation-shared convolution representation for CT metal artifact learning,
H. Wang, Q. Xie, Y. Li, Y. Huang, D. Meng, and Y. Zheng, “Orientation-shared convolution representation for CT metal artifact learning,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention (MICCAI), Sep. 2022, pp. 665–675
2022
-
[13]
OSCNet: Orientation-shared convolutional network for CT metal artifact learning,
H. Wang, Q. Xie, D. Zeng, J. Ma, D. Meng, and Y. Zheng, “OSCNet: Orientation-shared convolutional network for CT metal artifact learning,” IEEE Trans. Med. Imaging, vol. 43, no. 1, pp. 489–502, Jan. 2024
2024
-
[14]
Lwcdnet: An interpretable learning weighted convolutional dictionary network for metal artifact reduction in ct images,
J. Liu, T. Jin, Z. Ye, F. Wu, K. Wang, Z. Wu, Y. Zhang, D. Hu, and Y. Chen, “Lwcdnet: An interpretable learning weighted convolutional dictionary network for metal artifact reduction in ct images,” IEEE Trans. Instrum. Meas., vol. 74, pp. 1–15, Mar. 2025
2025
-
[15]
Deep unsupervised learning using nonequilibrium thermodynamics,
J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, “Deep unsupervised learning using nonequilibrium thermodynamics,” in Proc. Int. Conf. Mach. Learn. (ICML), vol. 37, Jul. 2015, pp. 2256–2265
2015
-
[16]
Denoising diffusion probabilistic models,
J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” in Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 33, 2020, pp. 6840–6851
2020
-
[17]
A denoising diffusion probabilistic model for metal artifact reduction in CT,
G. M. Karageorgos, J. Zhang, N. Peters, W. Xia, C. Niu, H. Paganetti, G. Wang, and B. De Man, “A denoising diffusion probabilistic model for metal artifact reduction in CT,” IEEE Trans. Med. Imaging, vol. 43, no. 10, pp. 3521–3532, Oct. 2024
2024
-
[18]
Unsupervised CT metal artifact reduction by plugging diffusion priors in dual domains,
X. Liu, Y. Xie, S. Diao, S. Tan, and X. Liang, “Unsupervised CT metal artifact reduction by plugging diffusion priors in dual domains,” IEEE Trans. Med. Imaging, vol. 43, no. 10, pp. 3533–3545, Oct. 2024
2024
-
[19]
Dual-domain denoising diffusion probabilistic model for metal artifact reduction,
W. Xia, C. Niu, G. M. Karageorgos, J. Zhang, N. Peters, H. Paganetti, B. D. Man, and G. Wang, “Dual-domain denoising diffusion probabilistic model for metal artifact reduction,” IEEE Trans. Radiat. Plasma Med. Sci., pp. 1–1, Jun. 2025
2025
-
[20]
Denoising diffusion implicit models,
J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2021
2021
-
[21]
Attention is all you need,
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 30, 2017
2017
-
[22]
Unsupervised metal artifacts reduction network for CT images based on efficient transformer,
L. Zhu, Y. Han, X. Xi, L. Li, M. Liu, H. Fu, S. Tan, and B. Yan, “Unsupervised metal artifacts reduction network for CT images based on efficient transformer,” Biomed. Signal Process. Control, vol. 89, p. 105753, Mar. 2024
2024
-
[23]
Dense transformer based enhanced coding network for unsupervised metal artifact reduction,
W. Xie and M. B. Blaschko, “Dense transformer based enhanced coding network for unsupervised metal artifact reduction,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention (MICCAI), Oct. 2023, pp. 77–86
2023
-
[24]
Mupo-net: A multilevel dual-domain progressive enhancement network with embedded attention for ct metal artifact reduction,
X. Yao, J. Tan, Z. Deng, D. Xiong, Q. Zhao, and M. Wu, “Mupo-net: A multilevel dual-domain progressive enhancement network with embedded attention for ct metal artifact reduction,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2025, pp. 1–5
2025
-
[25]
Mamba: Linear-time sequence modeling with selective state spaces,
A. Gu and T. Dao, “Mamba: Linear-time sequence modeling with selective state spaces,” in Proc. Conf. Lang. Model. (COLM), Jul. 2024
2024
-
[26]
Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality,
T. Dao and A. Gu, “Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality,” in Proc. Int. Conf. Mach. Learn. (ICML), Jul. 2024
2024
-
[27]
Efficiently modeling long sequences with structured state spaces,
A. Gu, K. Goel, and C. Ré, “Efficiently modeling long sequences with structured state spaces,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2022
2022
-
[28]
Vision Mamba: Efficient visual representation learning with bidirectional state space model,
L. Zhu, B. Liao, Q. Zhang, X. Wang, W. Liu, and X. Wang, “Vision Mamba: Efficient visual representation learning with bidirectional state space model,” in Proc. Int. Conf. Mach. Learn. (ICML), Jul. 2024
2024
-
[29]
U-net: Convolutional networks for biomedical image segmentation,
O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention (MICCAI), Nov. 2015, pp. 234–241
2015
-
[30]
MambaOut: Do we really need mamba for vision?
W. Yu and X. Wang, “MambaOut: Do we really need mamba for vision?” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2025, pp. 4484–4496
2025
-
[31]
Improved techniques for training consistency models,
Y. Song and P. Dhariwal, “Improved techniques for training consistency models,” in Proc. Int. Conf. Learn. Represent. (ICLR), Jan. 2024
2024
-
[32]
The unreasonable effectiveness of deep features as a perceptual metric,
R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 586–595
2018
-
[33]
DeepLesion: Automated deep mining, categorization and detection of significant radiology image findings using large-scale clinical lesion annotations,
K. Yan, X. Wang, L. Lu, and R. M. Summers, “DeepLesion: Automated deep mining, categorization and detection of significant radiology image findings using large-scale clinical lesion annotations,” 2017, arXiv:1710.01766
2017
-
[34]
Deep learning to segment pelvic bones: large-scale CT datasets and baseline models,
P. Liu, H. Han, Y. Du, H. Zhu, Y. Li, F. Gu, H. Xiao, J. Li, C. Zhao, L. Xiao et al., “Deep learning to segment pelvic bones: large-scale CT datasets and baseline models,” Int. J. Comput. Assisted Radiol. Surg., vol. 16, pp. 749–756, Apr. 2021
2021
-
[35]
Adam: A method for stochastic optimization,
D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2015
2015
-
[36]
SGDR: Stochastic gradient descent with warm restarts,
I. Loshchilov and F. Hutter, “SGDR: Stochastic gradient descent with warm restarts,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2017
2017
-
[37]
Restormer: Efficient transformer for high-resolution image restoration,
S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, and M. Yang, “Restormer: Efficient transformer for high-resolution image restoration,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2022, pp. 5718–5729
2022
-
[38]
Marformer: An efficient metal artifact reduction transformer for dental CBCT images,
Y. Shi, J. Xu, and D. Shen, “Marformer: An efficient metal artifact reduction transformer for dental CBCT images,” 2024, arXiv:2311.09590
2024
-
[39]
An image is worth 16x16 words: Transformers for image recognition at scale,
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2021
2021
-
[40]
Pvt v2: Improved baselines with pyramid vision transformer,
W. Wang, E. Xie, X. Li, D.-P. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao, “Pvt v2: Improved baselines with pyramid vision transformer,” Comput. Visual Media, vol. 8, no. 3, pp. 415–424, Mar. 2022
2022
-
[41]
DICDNet: Deep interpretable convolutional dictionary network for metal artifact reduction in CT images,
H. Wang, Y. Li, N. He, K. Ma, D. Meng, and Y. Zheng, “DICDNet: Deep interpretable convolutional dictionary network for metal artifact reduction in CT images,” IEEE Trans. Med. Imaging, vol. 41, no. 4, pp. 869–880, Apr. 2022
2022
-
[42]
Scope of validity of psnr in image/video quality assessment,
Q. Huynh-Thu and M. Ghanbari, “Scope of validity of psnr in image/video quality assessment,” Electron. Lett., vol. 44, no. 13, pp. 800–801, Jun. 2008
2008
-
[43]
Image quality assessment: from error visibility to structural similarity,
Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004
2004