CT-DegradBench: A Physics-Informed Benchmark for CT Degradation Detection and Severity Estimation

Aladine Chetouani; Alessandro Bruno; Azeddine Beghdadi; Faouzi Alaya Cheikh; Gorkem Durak; Habib Zaidi; Marie Luong; Marouane Tliba; Nour Aburaed; Ulas Bagci

arxiv: 2605.16431 · v1 · pith:OPLTJ4FUnew · submitted 2026-05-14 · 💻 cs.CV

Yousra Nabila Taifour, Marouane Tliba, Zuheng Ming, Marie Luong, Nour Aburaed, Aladine Chetouani, Gorkem Durak, Alessandro Bruno, Faouzi Alaya Cheikh, Habib Zaidi, Ulas Bagci, Azeddine Beghdadi

Pith reviewed 2026-05-20 20:13 UTC · model grok-4.3

classification 💻 cs.CV

keywords CT degradation, artifact detection, severity estimation, vision-language models, spectral features, benchmark dataset, training-free method, mixed artifacts

The pith

SeSpeCT builds a training-free semantic quality axis from radiology text prompts and spectral cues to jointly detect CT artifact types and estimate their severity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CT-DegradBench as a dataset and evaluation framework that covers multiple CT degradations such as noise, blur, streaking, aliasing, and metal artifacts in both isolated and combined forms. It introduces SeSpeCT, which forms a semantic quality axis inside a multimodal embedding space by feeding radiology-informed text prompts into medical vision-language models, then fuses this axis with frequency-domain features that highlight degradation-specific patterns. This combination supports simultaneous prediction of which artifact is present and how severe it is, all without task-specific training or extra labeled examples. The approach matters because standard image quality metrics often fail to reflect clinical perception and because prior datasets treat each restoration problem in isolation. Experiments indicate that the resulting method beats the tested baselines on both single-degradation and mixed-degradation cases.

Core claim

By constructing a training-free semantic quality axis in the multimodal embedding space from radiology-informed text prompts and combining it with complementary spectral features that capture degradation-specific frequency patterns, SeSpeCT enables joint prediction of artifact type and severity level in CT images under controlled single- and mixed-artifact conditions, outperforming evaluated baselines.

What carries the argument

The semantic quality axis formed in the multimodal embedding space via radiology-informed text prompts, fused with spectral features for degradation-specific frequency analysis.
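The summary does not spell out how the axis is built, so the following is only a minimal sketch of one plausible construction: take the difference between the mean embedding of high-quality prompts and the mean embedding of low-quality prompts, then score an image by projecting its normalized embedding onto that direction. The function names `semantic_axis` and `quality_score` and the toy 3-D vectors are hypothetical; real embeddings would come from the text and image encoders of a medical vision-language model.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def semantic_axis(high_quality_text, low_quality_text):
    """Unit direction pointing from low-quality to high-quality prompt embeddings."""
    hi = mean_vector([normalize(v) for v in high_quality_text])
    lo = mean_vector([normalize(v) for v in low_quality_text])
    return normalize([h - l for h, l in zip(hi, lo)])

def quality_score(image_embedding, axis):
    """Projection of the unit-normalized image embedding onto the axis."""
    img = normalize(image_embedding)
    return sum(a * b for a, b in zip(img, axis))

# Toy 3-D embeddings standing in for real encoder outputs (assumption).
high_prompts = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]
low_prompts = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
axis = semantic_axis(high_prompts, low_prompts)

clean_image = [0.95, 0.05, 0.0]
noisy_image = [0.10, 0.95, 0.05]
clean_score = quality_score(clean_image, axis)
noisy_score = quality_score(noisy_image, axis)
```

In this toy setup the clean image scores positive and the degraded one negative. Whether real VLM embeddings behave this way for CT artifacts is exactly the premise the rest of this page questions.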

If this is right

A single experimental framework now supports systematic comparison across multiple degradation families and severity levels.
Joint type-and-severity prediction becomes possible without task-specific fine-tuning or additional labeled data.
Spectral cues supply complementary information that improves performance on both single- and mixed-degradation test cases.
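To make "spectral cues" concrete, here is a hedged sketch of one possible frequency-domain descriptor: a radially averaged log-magnitude spectrum. This is an illustrative choice, not the descriptor the paper necessarily uses; the function name `radial_spectrum`, the bin count, and the synthetic test images are all assumptions.

```python
import numpy as np

def radial_spectrum(image, n_bins=8):
    """Radially averaged log-magnitude spectrum of a 2-D image.

    Returns n_bins values ordered from low to high spatial frequency.
    """
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    log_mag = np.log1p(np.abs(spectrum))

    # Distance of every frequency sample from the zero-frequency centre.
    rows, cols = image.shape
    y, x = np.indices((rows, cols))
    radius = np.hypot(y - rows // 2, x - cols // 2)

    # Assign each sample to one of n_bins equal-width rings and average.
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bins + 1)
    ring = np.digitize(radius.ravel(), edges[1:-1])
    sums = np.bincount(ring, weights=log_mag.ravel(), minlength=n_bins)
    counts = np.bincount(ring, minlength=n_bins)
    return sums / np.maximum(counts, 1)

# Synthetic example: additive white noise raises high-frequency energy.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
smooth = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 10.0 ** 2))
noisy = smooth + 0.2 * rng.standard_normal(smooth.shape)

smooth_profile = radial_spectrum(smooth)
noisy_profile = radial_spectrum(noisy)
```

A radial profile discards orientation, so it would separate noise from blur but is likely too coarse for directional artifacts such as streaks. An angular profile would be the natural complement.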

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Pre-trained medical vision-language models may already encode perceptual quality dimensions useful for radiology even when used zero-shot.
The same prompt-plus-spectrum recipe could be tested on other modalities that suffer from acquisition artifacts.
Downstream adaptive restoration pipelines could condition their behavior on the detected degradation profile produced by this axis.

Load-bearing premise

Radiology-informed text prompts can produce a reliable semantic quality axis in the multimodal embedding space that tracks degradation type and severity without any fine-tuning or labeled data.

What would settle it

A test set where scores along the constructed semantic quality axis show no correlation with radiologist ratings of degradation severity, or where SeSpeCT accuracy falls below that of baselines on new mixed-degradation examples.
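The first half of this test is a rank-correlation check. A minimal sketch, assuming axis scores and severity ratings are available as two equal-length lists, computes Spearman's rho with tie-aware ranks. The data values below are invented purely for illustration.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: higher axis score = better quality, higher rating = more severe.
axis_scores = [0.62, 0.41, 0.35, 0.10, -0.05, -0.30]
severity = [1, 2, 2, 3, 4, 5]
rho = spearman(axis_scores, severity)
```

A rho near -1 here would support the premise, since quality should fall as severity rises. A rho near 0 on real radiologist ratings would be the falsifying outcome described above.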

Figures

Figures reproduced from arXiv: 2605.16431 by Aladine Chetouani, Alessandro Bruno, Azeddine Beghdadi, Faouzi Alaya Cheikh, Gorkem Durak, Habib Zaidi, Marie Luong, Marouane Tliba, Nour Aburaed, Ulas Bagci, Yousra Nabila Taifour, Zuheng Ming.

**Figure 1.** CT-DegradBench generation pipeline. A reference CT image is forward-projected to the sinogram domain, where physics-informed degradations are applied individually or as realistic mixtures with controlled severity. The degraded sinogram is reconstructed via filtered backprojection to obtain the final degraded CT image, while structured metadata are generated in parallel for prompt construction.

**Figure 2.** Overview of the proposed SeSpeCT framework. The model combines a semantic quality branch derived from a medical vision–language model with frequency-domain descriptors extracted from the Fourier spectrum. The fused representation is used to jointly predict degradation type and severity.

**Figure 3.** Semantic quality axis in the Vision-Language Model.

**Figure 4.** Visualization of mid-severity degradations in the image domain (top) and Fourier domain (bottom). The distinct spectral patterns…

**Figure 5.** t-SNE visualization colored by degradation type. Samples form distinct clusters corresponding to different degradation categories, indicating that the learned representation captures degradation-specific characteristics.

**Figure 6.** ROC curves for all degradations. The micro-average AUC reaches 0.954, indicating strong overall discrimination between degradations. Single degradations achieve near-perfect AUC values in several cases, confirming that the learned representation clearly separates artifact types. Although mixture degradations show slightly lower AUC values, their performance remains high, demonstrating that the model retai…

**Figure 7.** The same embedding space colored by degradation severity. A gradual transition between severity levels can be observed within several clusters, indicating that the representation captures not only degradation type but also variations in degradation intensity. In particular, blur and streak artifacts display a clear progression from lower to higher severity levels along consistent directions in the e…

**Figure 8.** t-SNE visualization distinguishing single and mixed degradations. The representation separates samples containing a single degradation from those with multiple degradations, highlighting its ability to capture compositional corruption patterns.
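The Figure 6 caption quotes a micro-average AUC. As a reminder of what that number aggregates, the sketch below flattens one-vs-rest labels and scores across all classes into a single binary problem and computes AUC as the fraction of positive/negative pairs ranked correctly. The class scores are invented for illustration and have no relation to the paper's results.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def micro_average_auc(true_classes, class_scores):
    """Pool every (sample, class) pair into one binary problem, then compute AUC."""
    n_classes = len(class_scores[0])
    flat_labels, flat_scores = [], []
    for true_class, scores in zip(true_classes, class_scores):
        for c in range(n_classes):
            flat_labels.append(1 if c == true_class else 0)
            flat_scores.append(scores[c])
    return binary_auc(flat_labels, flat_scores)

# Hypothetical 3-class example; the last sample is ranked incorrectly.
true_classes = [0, 1, 2, 1]
class_scores = [
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
auc = micro_average_auc(true_classes, class_scores)
```

Because pooling weights every sample-class pair equally, a micro-average can stay high even when a few rare classes (for example, specific mixtures) are separated poorly, which is why per-class curves are worth reading alongside it.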

Original abstract

Computed tomography (CT) images are frequently degraded by acquisition artifacts, including noise, blur, streaking, aliasing, and metal artifacts. Yet CT enhancement is still largely evaluated using image quality metrics with limited perceptual and clinical validity, while existing datasets remain focused on isolated restoration tasks, hindering unified benchmarking across diverse degradation types. We present CT-DegradBench, a dataset and benchmark for CT degradation detection and severity estimation under controlled single- and mixed-artifact settings. CT-DegradBench enables systematic evaluation across multiple degradation families and severity levels within a common experimental framework. We further propose SeSpeCT (Semantic-Spectral CT degradation estimation), a framework that combines semantic priors from medical vision-language models with complementary frequency-domain cues for artifact analysis. SeSpeCT constructs a training-free semantic quality axis in the multimodal embedding space using radiology-informed text prompts, without task-specific fine-tuning, and combines it with spectral features that capture degradation-specific frequency patterns. The resulting representation enables joint prediction of artifact type and severity. Experimental results show that SeSpeCT consistently outperforms the evaluated baselines under both single- and mixed-degradation settings. The framework is available at https://github.com/yousranb/CT-DEGRADBENCH.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CT-DegradBench supplies a controlled mixed-degradation dataset for CT artifacts and pairs it with a training-free semantic-spectral estimator, but the performance edge rests on an unproven alignment between prompt-based embeddings and actual artifact severity.

The main things here are a new benchmark called CT-DegradBench that generates single and mixed CT artifacts at graded levels, and SeSpeCT, which pulls semantic direction from a medical vision-language model via radiology prompts and adds it to frequency features for type and severity prediction without any fine-tuning or extra labels. This setup moves past the usual single-artifact test sets and gives a common frame for comparing restoration methods across noise, blur, streaking, aliasing, and metal effects. The training-free construction is a practical choice that keeps the method lightweight and avoids the data collection burden common in this area. The benchmark itself looks like the stronger part of the contribution because it directly tackles the fragmented evaluation practices the abstract describes. The code link is a plus for anyone who wants to check the generation pipeline or run their own tests.

The soft spot sits in the central claim that the combined representation outperforms baselines. The abstract states consistent gains under mixed conditions, yet supplies no dataset sizes, no list of exact baselines, no metric values, and no statistical checks. More importantly, the semantic axis built from text prompts may not map cleanly onto physical degradation strength if the underlying VLM saw few degraded CT examples in pretraining; in that case the embedding projection could be driven by unrelated image semantics rather than artifact intensity, leaving the method no better than spectral features alone. That risk is especially relevant for mixed degradations where multiple frequency patterns overlap.

This paper is for groups that build or benchmark CT restoration pipelines and need a shared testbed beyond isolated tasks. A reader focused on practical evaluation frameworks would get immediate use from the dataset even if they skip the estimator.

It deserves a serious referee because the benchmark addresses a documented gap and the method is simple enough that reviewers can check the embedding assumption with targeted experiments. I would send it for review rather than desk reject.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces CT-DegradBench, a controlled dataset and benchmark for CT degradation detection and severity estimation covering single- and mixed-artifact scenarios (noise, blur, streaking, aliasing, metal). It proposes SeSpeCT, a training-free framework that constructs a semantic quality axis in a medical VLM embedding space using radiology-informed text prompts and fuses it with frequency-domain spectral features to jointly predict artifact type and continuous severity. Experiments report that SeSpeCT outperforms evaluated baselines under both single- and mixed-degradation conditions.

Significance. If the central claims hold, the work supplies a much-needed unified benchmark for CT artifact analysis that moves beyond isolated restoration tasks and limited perceptual metrics. The training-free semantic-spectral fusion approach could reduce reliance on task-specific labeled data in medical imaging pipelines. Credit is due for releasing the dataset and code at the provided GitHub link, which supports reproducibility.

major comments (3)

§4.3 (Semantic Axis Construction): The manuscript does not report any direct validation (e.g., Pearson/Spearman correlation or embedding-space visualization) that the prompt-derived semantic quality axis correlates with ground-truth physical severity levels or artifact strength, particularly for mixed degradations. This alignment is load-bearing for the claim that the combined representation meaningfully encodes both type and severity beyond frequency features alone.

Table 3 (Mixed-degradation results): The reported outperformance margins are presented without statistical significance tests, confidence intervals, or multiple-run variance; given the low number of baselines and the reliance on pre-trained VLMs, it is unclear whether the gains are robust or could be explained by prompt sensitivity.

§5.1 (Baseline comparison): The abstract and experimental section claim consistent superiority, yet the exact set of baselines, their implementation details, and the precise metrics (beyond generic accuracy/MAE) are insufficiently specified to allow independent verification of the central experimental claim.

minor comments (2)

Figure 2: The frequency spectra plots would benefit from explicit axis labels indicating normalized frequency ranges and clearer annotation of which peaks correspond to which artifact types.

Eq. 7 (Notation): The definition of the combined semantic-spectral feature vector uses an ambiguous weighting parameter α whose selection procedure is only described qualitatively; a short sensitivity analysis would improve clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments on our manuscript. We address each of the major comments in detail below and outline the revisions we plan to make to improve the clarity and rigor of the work.

Referee: [§4.3] The manuscript does not report any direct validation (e.g., Pearson/Spearman correlation or embedding-space visualization) that the prompt-derived semantic quality axis correlates with ground-truth physical severity levels or artifact strength, particularly for mixed degradations. This alignment is load-bearing for the claim that the combined representation meaningfully encodes both type and severity beyond frequency features alone.

Authors: We agree that explicit validation of the semantic axis would strengthen the interpretation of our results. While the overall performance of SeSpeCT in predicting severity suggests effective alignment, we did not include direct correlation metrics or visualizations in the original submission. In the revised manuscript, we will add Pearson and Spearman correlation analyses between the semantic quality scores and ground-truth severity levels for single- and mixed-degradation cases. We will also include 2D projections (e.g., PCA or t-SNE) of the embedding space colored by severity to visually demonstrate the correlation. This will be added to §4.3.

revision: yes

Referee: [Table 3] The reported outperformance margins are presented without statistical significance tests, confidence intervals, or multiple-run variance; given the low number of baselines and the reliance on pre-trained VLMs, it is unclear whether the gains are robust or could be explained by prompt sensitivity.

Authors: We acknowledge the importance of statistical validation for the reported improvements. The original experiments were run with fixed seeds for reproducibility, but variance across runs was not reported. In the revision, we will conduct experiments over 5 independent runs with different seeds, report mean and standard deviation, and include 95% confidence intervals. We will also perform statistical significance tests (e.g., paired t-tests) against the baselines and report p-values. To address potential prompt sensitivity, we will include a sensitivity analysis by varying the radiology-informed prompts and showing that performance remains stable. These updates will be reflected in Table 3 and the experimental section.

revision: yes

Referee: [§5.1] The abstract and experimental section claim consistent superiority, yet the exact set of baselines, their implementation details, and the precise metrics (beyond generic accuracy/MAE) are insufficiently specified to allow independent verification of the central experimental claim.

Authors: We appreciate this feedback on reproducibility. In the revised version, we will expand the description in §5.1 to explicitly list all baselines with their sources, implementation details (e.g., libraries used, hyperparameters), and the precise evaluation metrics used for each experiment (including per-class accuracy for artifact type, MAE for severity, and combined metrics for mixed degradations). We will also provide additional details on how the VLM embeddings were extracted to facilitate independent verification.

revision: yes
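The statistical protocol promised in the Table 3 response (5 seeded runs, mean and standard deviation, 95% confidence interval, paired t-test) can be sketched with the standard library alone. The accuracy values below are invented for illustration, and the sketch compares the t statistic to a tabulated critical value instead of computing a p-value, since the standard library has no Student-t CDF.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 4 degrees of freedom (5 runs).
T_CRIT_95_DF4 = 2.776

def mean_ci95(values):
    """Mean and 95% confidence half-width; valid only for exactly 5 runs."""
    if len(values) != 5:
        raise ValueError("critical value is tabulated for 5 runs only")
    mean = statistics.mean(values)
    half_width = T_CRIT_95_DF4 * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width

def paired_t_statistic(a, b):
    """t statistic of the per-seed differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# Hypothetical per-seed accuracies (same seeds for both methods).
method = [0.91, 0.93, 0.92, 0.94, 0.90]
baseline = [0.88, 0.89, 0.90, 0.90, 0.87]

mean, half_width = mean_ci95(method)
t_stat = paired_t_statistic(method, baseline)
significant = abs(t_stat) > T_CRIT_95_DF4
```

Pairing by seed matters here: it removes run-to-run variation shared by both methods, so a consistent small gain can be significant even when the two sets of scores overlap.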

Circularity Check

0 steps flagged

No significant circularity; method relies on external pre-trained models and empirical evaluation

The paper proposes SeSpeCT by combining semantic priors from existing medical vision-language models (via radiology-informed text prompts in a multimodal embedding space) with standard frequency-domain spectral features. Neither the semantic quality axis construction nor the joint prediction of artifact type and severity reduces to a self-definition, fitted parameter renamed as prediction, or self-citation chain within the paper. The outperformance claim is presented as an empirical result on the introduced CT-DegradBench dataset under single- and mixed-degradation settings, not as a quantity derived by construction from quantities defined inside the paper. The framework is self-contained against external benchmarks and pre-trained models, with no load-bearing step that equates the claimed result to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on the availability and suitability of pre-trained medical vision-language models plus the assumption that frequency-domain patterns are diagnostic of specific artifacts; no new free parameters or invented entities are introduced in the abstract description.

axioms (2)

domain assumption: Pre-trained medical vision-language models encode radiology-relevant semantic priors that can be queried via text prompts to form a quality axis. Invoked when constructing the training-free semantic quality axis from radiology-informed text prompts.

domain assumption: Spectral features capture degradation-specific frequency patterns that complement semantic information. Used to justify combining frequency-domain cues with the embedding-space axis.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "SeSpeCT constructs a training-free semantic quality axis in the multimodal embedding space using radiology-informed text prompts... combines it with spectral features that capture degradation-specific frequency patterns."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "We introduce CT-DegradBench, a controlled benchmark for CT degradation detection and severity estimation, covering five common degradation types, calibrated severity levels..."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Supplementary Material

Additional Quantitative Evaluation

6.1. Per-Degradation Classification Performance

Table 5 reports classification accuracy and F1-score for each degradation across single distortions (S1–S5) and mixtures (M1–M5). Single degradations achieve consistently high performance, with most classes exceeding 0.98 accuracy. Blur (S2) and metal artifacts (S5) obt...

Embedding-Space Representation Analysis

7.1. t-SNE Visualization by Degradation Type

Figure 5 visualizes the learned embedding space projected using t-SNE and colored by degradation category. Distinct clusters emerge for most degradation types, indicating that the proposed representation encodes artifact-specific characteristics. Degradations that intro...

Extended Ablation Studies

8.1. Branch Contribution Analysis

Table 4 evaluates the contribution of each branch in the proposed semantic-spectral framework. Ablation results reveal that the two branches play complementary roles in capturing degradation characteristics. Removing the semantic quality branch (B_sem) causes a drastic drop in classification ...

Additional Benchmark Details

This section provides supplementary details on the prompt design used to construct the semantic quality axis, the physical motivation behind the degradation mixtures in CT-DegradBench, and the relation of CT-DegradBench to existing CT restoration datasets.

9.1. Prompt design for the semantic quality axis

Table 7 lists the...

"Axial abdominal CT slice with excellent diagnostic quality, sharp boundaries, clear organ detail, and no visible artifacts"
"Abdominal CT slice with severe noise and grainy appearance that reduces visibility of anatomical structures"
"Diagnostic abdominal CT with clear anatomical structures, low noise, high contrast, and no streak artifacts"
"Abdominal CT slice with strong blur and significant loss of sharpness"
"High-quality CT image with sharp edges, clean appearance, and good visibility of abdominal organs"
"Abdominal CT slice with strong streak artifacts and reduced diagnostic quality"
"Abdominal CT slice with sparse-view aliasing artifacts and distorted anatomical structures"
"Abdominal CT slice with strong metal artifacts causing bright streaks and severe image corruption."

Table 7. Prompt sets used to construct the semantic quality axis. High-quality prompts describe artifact-free diagnostic CT images, while low-quality prompts represent common degradations including noise, blur, streak artifacts, aliasing, and metal artifacts. ...

[1] Louis Blankemeier et al. Merlin: A vision language foundation model for 3D computed tomography. Research Square,

[2] David J. Brenner and Eric J. Hall. Computed tomography - an increasing source of radiation exposure. New England Journal of Medicine, 357(22):2277–2284, 2007.

[3] Mehdi Cherti et al. Reproducible scaling laws for contrastive language-image learning. In CVPR, pages 2818–2829, 2023.

[4] Clement David-Olawade et al. AI-driven advances in low-dose imaging and enhancement - a review. Diagnostics, 15(6):689, 2025.

[5] Keyan Ding et al. Image quality assessment: Unifying structure and texture similarity. IEEE TPAMI, 44(5):2567–2581,

[6] Nir Goren, Thomas Dowrick, James Avery, and David Holder. UCLH stroke EIT dataset - radiology data (CT), 2017.

[7] Bing Guan et al. Generative modeling in sinogram domain for sparse-view CT reconstruction. IEEE Transactions on Radiation and Plasma Medical Sciences, 8(2):195–207, 2023.

[8] Eri Haneda et al. AAPM CT metal artifact reduction grand challenge. Medical Physics, 52(10):e70050, 2025.

[9] J. H. Hubbell and S. M. Seltzer. Tables of X-ray mass attenuation coefficients and mass energy-absorption coefficients,

[10] Version 1.4, National Institute of Standards and Technology, Gaithersburg, MD.

[11] Wonjin Kim et al. A systematic review of deep learning-based denoising for low-dose computed tomography from a perceptual quality perspective. Biomedical Engineering Letters, 14(6):1153–1173, 2024.

[12] Yiming Lei et al. CT image denoising and deblurring with deep learning: current status and perspectives. IEEE Transactions on Radiation and Plasma Medical Sciences, 8(2):153–172, 2023.

[13] Johannes Leuschner, Maximilian Schmidt, Daniel Otero Baguer, and Peter Maass. LoDoPaB-CT, a benchmark dataset for low-dose computed tomography reconstruction. Scientific Data, 8(1):109, 2021.

[14] Cynthia McCollough. TU-FG-207A-04: overview of the low dose CT grand challenge. Medical Physics, 43(6Part35):3759–3760, 2016.

[15] Cynthia H McCollough, Masahiro Jinzaki, and Hatem Alkadhi. Advances in CT technology and clinical applications - introductory editorial, 2025.

[16] Taylor R Moen, Baiyu Chen, David R Holmes III, Xinhui Duan, Zhicong Yu, Lifeng Yu, Shuai Leng, Joel G Fletcher, and Cynthia H McCollough. Low-dose CT image and projection dataset. Medical Physics, 48(2):902–911, 2021.

[17] Taifour Yousra Nabila, Azeddine Beghdadi, Marie Luong, Zuheng Ming, Habib Zaidi, and Faouzi Alaya Cheikh. D-PerceptCT: Deep perceptual enhancement for low-dose CT images. In European Workshop on Visual Information Processing (EUVIP), pages 1–6, 2025.

[18] Rabeya Tus Sadia et al. CT image denoising methods for image quality improvement and radiation dose reduction. Journal of Applied Clinical Medical Physics, 25(2):e14270,

[19] Yazdan Salimi, Zahra Mansouri, Chang Sun, Amirhossein Sanaat, Mohammadhossein Yazdanpanah, Hossein Shooli, René Nkoulou, Sana Boudabbous, and Habib Zaidi. Deep learning-based segmentation of ultra-low-dose CT images using an optimized nnU-Net model. La Radiologia Medica, pages 1–17, 2025.

[20] Mark Selles et al. Advances in metal artifact reduction in CT images: A review of traditional and novel metal artifact reduction techniques. European Journal of Radiology, 170:111276, 2024.

[21] Hamid R. Sheikh and Alan C. Bovik. Image information and visual quality. IEEE TIP, 15(2):430–444, 2006.

[22] Emil Y Sidky and Xiaochuan Pan. Report on the AAPM deep-learning sparse-view CT grand challenge. Medical Physics, 49(8):4935–4943, 2022.

[23] Farid Tasharofi et al. FIND-Net: Fourier-integrated network with dictionary kernels for metal artifact reduction. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), pages 192–201, 2025.

[24] Tao Wang, Wenjun Xia, Jingfeng Lu, and Yi Zhang. A review of deep learning CT reconstruction from incomplete projection data. IEEE Transactions on Radiation and Plasma Medical Sciences, 8(2):138–152, 2023.

[25] Zhou Wang et al. Image quality assessment: From error visibility to structural similarity. IEEE TIP, 13(4):600–612,

[26] Zifeng Wang et al. MedCLIP: Contrastive learning from unpaired medical images and text. In Conference on Empirical Methods in Natural Language Processing (EMNLP), page 3876, 2022.

[27] Siyi Xun, Qiaoyu Li, Xiaohong Liu, et al. Charting the path forward: CT image quality assessment - an in-depth review. Journal of King Saud University Computer and Information Sciences, 37(5):92, 2025.

[28] Xin Yi and Paul Babyn. Sharpness-aware low-dose CT denoising using conditional generative adversarial network. Journal of Digital Imaging, 31(5):655–669, 2018.

[29] Sungho Yun et al. TMAA-Net: Tensor-domain multi-planal anti-aliasing network for sparse-view CT image reconstruction. Physics in Medicine & Biology, 69(22):225012, 2024.

[30] Dong Zeng et al. A simple low-dose X-ray CT simulation from high-dose scan. IEEE Transactions on Nuclear Science, 62(5):2226–2233, 2015.

[31] Richard Zhang et al. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR, pages 586–595,

[32] Sheng Zhang et al. A multimodal biomedical foundation model trained from fifteen million image–text pairs. NEJM AI, 2(1), 2024.

CT-DegradBench: A Physics-Informed Benchmark for CT Degradation Detection and Severity Estimation

Supplementary Material

Additional Quantitative Evaluation

6.1. Per-Degradation Classification Performance

Table 5 reports classification accuracy and F1-score for each degradation across single distortions (S1–S5) and mixtures (M1–M5). Single degradations achieve consistently high performance, with most classes exceeding 0.98 accuracy. Blur (S2) and metal artifacts (S5) obt...

Embedding-Space Representation Analysis

7.1. t-SNE Visualization by Degradation Type

Figure 5 visualizes the learned embedding space projected using t-SNE and colored by degradation category. Distinct clusters emerge for most degradation types, indicating that the proposed representation encodes artifact-specific characteristics. Degradations that intro...

Extended Ablation Studies

8.1. Branch Contribution Analysis

Table 4 evaluates the contribution of each branch in the proposed semantic-spectral framework. Ablation results reveal that the two branches play complementary roles in capturing degradation characteristics. Removing the semantic quality branch (B_sem) causes a drastic drop in classification ...

Additional Benchmark Details

This section provides supplementary details on the prompt design used to construct the semantic quality axis, the physical motivation behind the degradation mixtures in CT-DegradBench, and the relation of CT-DegradBench to existing CT restoration datasets.

9.1. Prompt design for the semantic quality axis

Table 7 lists the...

High-quality prompts:

- Axial abdominal CT slice with excellent diagnostic quality, sharp boundaries, clear organ detail, and no visible artifacts
- Diagnostic abdominal CT with clear anatomical structures, low noise, high contrast, and no streak artifacts
- High-quality CT image with sharp edges, clean appearance, and good visibility of abdominal organs

Low-quality prompts:

- Abdominal CT slice with severe noise and grainy appearance that reduces visibility of anatomical structures
- Abdominal CT slice with strong blur and significant loss of sharpness
- Abdominal CT slice with strong streak artifacts and reduced diagnostic quality
- Abdominal CT slice with sparse-view aliasing artifacts and distorted anatomical structures
- Abdominal CT slice with strong metal artifacts causing bright streaks and severe image corruption.

Table 7. Prompt sets used to construct the semantic quality axis. High-quality prompts describe artifact-free diagnostic CT images, while low-quality prompts represent common degradations including noise, blur, streak artifacts, aliasing, and metal artifacts. ...
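As a rough illustration of how two prompt sets like these could define a quality axis, the sketch below takes the unit vector pointing from the centroid of the low-quality prompt embeddings to the centroid of the high-quality ones, then scores an image embedding by projecting it onto that direction. This construction is an assumption made for illustration; the excerpt here does not state the exact formula used by SeSpeCT. The function names (`quality_axis`, `quality_score`) are hypothetical, and the 3-D toy vectors stand in for embeddings that would in practice come from a medical vision-language model's text and image encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def quality_axis(high_embs, low_embs):
    """Assumed construction: unit vector from the low-quality prompt
    centroid to the high-quality prompt centroid."""
    high_c = mean_vector([normalize(e) for e in high_embs])
    low_c = mean_vector([normalize(e) for e in low_embs])
    return normalize([h - l for h, l in zip(high_c, low_c)])

def quality_score(image_emb, axis):
    """Projection of the unit-normalized image embedding onto the axis;
    higher means closer to the high-quality prompts."""
    return sum(a * b for a, b in zip(normalize(image_emb), axis))

# Toy 3-D stand-ins for prompt embeddings (hypothetical values).
high_prompt_embs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
low_prompt_embs = [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
axis = quality_axis(high_prompt_embs, low_prompt_embs)

# Toy image embeddings: one near the high-quality prompts, one near the low.
clean_score = quality_score([0.85, 0.15, 0.05], axis)
degraded_score = quality_score([0.15, 0.85, 0.05], axis)
```

With these toy inputs the clean-looking embedding projects positively onto the axis and the degraded-looking one negatively. In a real pipeline the projection would be one semantic feature, to be combined with the frequency-domain cues the paper describes.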
