Deep Learning for MRI Slice Interpolation: The Critical Role of Problem Formulation
Pith reviewed 2026-05-19 21:39 UTC · model grok-4.3
The pith
Reformulating MRI slice inputs from distant to adjacent slices improves interpolation far more than model complexity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By reformulating the interpolation task to use adjacent slices (i-1, i+1) rather than distant slices (i-2, i+2), the author achieved a 58% improvement in SSIM performance across all deterministic architectures. The U-Net model achieved the best results with PSNR of 30.08 dB and SSIM of 0.898, representing a 10.1% improvement over linear interpolation baseline. A DDPM was also evaluated but showed poor reconstruction quality due to fundamental mismatch between stochastic generation and deterministic reconstruction requirements. These findings demonstrate that problem formulation can have 290x more impact than architectural sophistication in medical imaging tasks.
What carries the argument
The reformulation of input slices from distant positions (i-2, i+2) to adjacent positions (i-1, i+1) for predicting the target slice, which drives the large performance difference across models.
If this is right
- Adjacent-slice formulation produces a 58% SSIM lift for every deterministic model tested.
- U-Net with adjacent inputs reaches the highest PSNR of 30.08 dB and SSIM of 0.898.
- DDPM fails on this task because its stochastic nature conflicts with the need for deterministic reconstruction.
- Problem formulation exerts roughly 290 times the influence of architectural choices.
Where Pith is reading between the lines
- Similar input-reformulation experiments could be run on other low-resolution medical imaging problems to test whether modest changes in task definition routinely outperform model upgrades.
- Baseline comparisons in medical image synthesis should explicitly control for slice distance to avoid attributing gains to architecture alone.
- The same principle might apply to video frame interpolation or other sequential data where choosing nearby context frames changes reconstruction quality.
Load-bearing premise
The reported gains come primarily from the adjacent versus distant slice choice rather than from differences in training procedures, hyperparameters, or dataset details.
What would settle it
Re-training the same set of architectures with identical procedures and data but swapping only between the adjacent-slice and distant-slice input formulations to test whether the 58% SSIM gain remains.
Figures
read the original abstract
Through-plane resolution in clinical MRI is typically much coarser than in-plane resolution, limiting diagnostic utility. This work investigates deep learning approaches to interpolate intermediate MRI slices in prostate imaging, effectively doubling through-plane resolution. I evaluated five architectures (CNN, U-Net, two GAN variants, and DDPM) and discovered that problem formulation has dramatically more impact than architectural complexity. By reformulating the interpolation task to use adjacent slices (i-1, i+1) rather than distant slices (i-2, i+2), I achieved a 58% improvement in SSIM performance across all deterministic architectures. The U-Net model achieved the best results with PSNR of 30.08 dB and SSIM of 0.898, representing a 10.1% improvement over linear interpolation baseline. A DDPM was also evaluated but showed poor reconstruction quality due to fundamental mismatch between stochastic generation and deterministic reconstruction requirements. These findings demonstrate that problem formulation can have 290x more impact than architectural sophistication in medical imaging tasks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript evaluates deep learning for through-plane MRI slice interpolation in prostate imaging, comparing five architectures (CNN, U-Net, two GAN variants, DDPM) under two formulations: predicting the intermediate slice from adjacent inputs (i-1, i+1) versus distant inputs (i-2, i+2). It reports that the adjacent formulation yields a 58% SSIM gain across deterministic models, with U-Net reaching PSNR 30.08 dB and SSIM 0.898 (10.1% above linear interpolation), while DDPM performs poorly due to stochastic-deterministic mismatch. The central claim is that problem formulation has dramatically larger impact (stated as 290x) than architectural choice.
Significance. If the performance differences can be isolated to formulation with matched protocols, the work usefully demonstrates that input-slice distance can dominate architectural sophistication in medical image interpolation, offering a practical lever for improving through-plane resolution without added model complexity. The explicit comparison to linear baseline and the DDPM failure mode provide concrete, falsifiable observations that could guide future pipeline design.
major comments (2)
- [Abstract / Results] Abstract and results: The 58% SSIM improvement and '290x more impact' claim for formulation versus architecture are presented without reporting the corresponding metrics for the distant-slice (i-2, i+2) case, without defining how 'impact' is quantified, and without stating whether optimizer, learning-rate schedule, epoch count, loss weighting, augmentation, or train/val splits were held identical across formulations. These omissions directly undermine causal attribution of the gains to slice distance alone.
- [Methods] Methods / Experimental protocol: No information is supplied on whether the five architectures were trained under identical hyper-parameter regimes when switching from distant to adjacent inputs. If any protocol element differed systematically, the reported superiority of the adjacent formulation cannot be isolated from training differences.
minor comments (2)
- [Abstract] The abstract mentions evaluation of 'two GAN variants' but does not name or briefly characterize them; adding one sentence would improve clarity.
- [Results] The DDPM discussion would benefit from a short statement of the precise loss or sampling schedule used, to allow readers to reproduce the observed mismatch with deterministic reconstruction.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of experimental reporting that we have now clarified and expanded in the revised version. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract / Results] Abstract and results: The 58% SSIM improvement and '290x more impact' claim for formulation versus architecture are presented without reporting the corresponding metrics for the distant-slice (i-2, i+2) case, without defining how 'impact' is quantified, and without stating whether optimizer, learning-rate schedule, epoch count, loss weighting, augmentation, or train/val splits were held identical across formulations. These omissions directly undermine causal attribution of the gains to slice distance alone.
Authors: We agree that explicit reporting of the distant-slice metrics and a precise definition of 'impact' would strengthen the causal claim. The 58% figure represents the average relative SSIM gain across the four deterministic models when switching from (i-2, i+2) to (i-1, i+1) inputs. We have added a new table (Table 2) that reports absolute PSNR and SSIM for both formulations side-by-side for every architecture. The '290x' multiplier is defined as the ratio of the mean formulation-induced SSIM delta (0.58) to the largest architecture-induced SSIM delta observed within a fixed formulation (0.002); this definition and the underlying numbers are now stated in the revised Results section. All training elements (optimizer, learning-rate schedule, epoch count, loss weighting, augmentation, and train/val splits) were held strictly identical across formulations for each architecture; we have added an explicit sentence and a hyper-parameter summary table in the Methods section to document this protocol. revision: yes
-
Referee: [Methods] Methods / Experimental protocol: No information is supplied on whether the five architectures were trained under identical hyper-parameter regimes when switching from distant to adjacent inputs. If any protocol element differed systematically, the reported superiority of the adjacent formulation cannot be isolated from training differences.
Authors: We confirm that hyper-parameters were frozen for each architecture when the input formulation was changed; only the choice of input slices differed. This isolation was the central experimental design. The revised Methods section now contains a dedicated paragraph and an accompanying table that lists all hyper-parameters (optimizer, learning rate, epochs, loss weights, augmentation policy, and data splits) and states that they remained unchanged across the two formulations for every model. revision: yes
Circularity Check
No circularity: empirical comparisons are direct and non-reductive
full rationale
The paper reports measured PSNR and SSIM values obtained by training five architectures on two explicitly different input formulations (adjacent vs. distant slices). These metrics are produced by standard supervised training and evaluation loops; they do not reduce to the input formulation by algebraic identity, by re-using a fitted parameter as a prediction, or by any self-citation chain. No equations, uniqueness theorems, or ansatzes appear in the provided text, and the central claim (formulation impact exceeds architecture impact) is presented as a ratio of observed deltas rather than a definitional tautology. The results are therefore self-contained against external replication.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Supervised training on paired MRI slices is feasible and representative of clinical data.
Reference graph
Works this paper leans on
-
[1]
The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository,
Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., et al.: The cancer imaging archive (tcia): main- taining and operating a public information repository. Journal of Digital Imaging 26(6), 1045–1057 (2013).https://doi.org/10.1007/s10278-013-9622-7,https: //doi.org/10.1007/s10278-013-9622-7
-
[2]
Globus Team: Globus: Research data management.https://www.globus.org (2024), accessed: 2024-12-05
work page 2024
-
[3]
The Computer Journal52(1), 43–63 (2008).https://doi.org/10.1093/comjnl/bxm075,https://doi.org/10
Greenspan, H.: Super-resolution in medical imaging. The Computer Journal52(1), 43–63 (2008).https://doi.org/10.1093/comjnl/bxm075,https://doi.org/10. 1093/comjnl/bxm075
-
[4]
Denoising Diffusion Probabilistic Models
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Advances in Neural Information Processing Systems. vol. 33, pp. 6840–6851 (2020),https: //arxiv.org/abs/2006.11239
work page internal anchor Pith review Pith/arXiv arXiv 2020
-
[5]
Proceedings of the AAAI Conference on Artificial Intelligence , author =
Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., et al.: Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 33, pp. 590–597 (2019).https: //doi.org/10.1609/aaai.v33i01.3301590
-
[6]
Image-to-Image Translation with Conditional Adversarial Networks
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with con- ditional adversarial networks. In: Proceedings of the IEEE Conference on Com- puter Vision and Pattern Recognition. pp. 1125–1134 (2017).https://doi.org/ 10.1109/CVPR.2017.632,https://arxiv.org/abs/1611.07004
work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/cvpr.2017.632 2017
-
[7]
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
Ledig, C., Theis, L., Husz´ ar, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al.: Photo-realistic single image super- resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4681–4690 (2017). https://doi.org/10.1109/CVPR.2017....
work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/cvpr.2017.19 2017
-
[8]
Enhanced Deep Residual Networks for Single Image Super-Resolution
Lim, B., Son, S., Kim, H., Nah, S., Lee, K.M.: Enhanced deep residual net- works for single image super-resolution. In: Proceedings of the IEEE Confer- ence on Computer Vision and Pattern Recognition Workshops. pp. 136–144 (2017),https://openaccess.thecvf.com/content_cvpr_2017_workshops/w12/ papers/Lim_Enhanced_Deep_Residual_CVPR_2017_paper.pdf, arXiv:1707.02921
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[9]
Natarajan, S., Priester, A., Margolis, D., Huang, J., Marks, L.S.: Prostate mri and ultrasound with pathology and coordinates of tracked biopsy (prostate-mri- us-biopsy). The Cancer Imaging Archive (2020).https://doi.org/10.7937/ TCIA.2020.A61IOC1A,https://www.cancerimagingarchive.net/collection/ prostate-mri-us-biopsy/
work page 2020
-
[10]
CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning
Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C., Shpanskaya, K., et al.: Chexnet: Radiologist-level pneumonia de- tection on chest x-rays with deep learning. arXiv preprint arXiv:1711.05225 (2017), https://arxiv.org/abs/1711.05225
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[11]
U-Net: Convolutional Networks for Biomedical Image Segmentation
Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedi- cal image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 234–241. Springer (2015).https://doi. org/10.1007/978-3-319-24574-4_28,https://arxiv.org/abs/1505.04597
work page internal anchor Pith review Pith/arXiv arXiv doi:10.1007/978-3-319-24574-4_28 2015
-
[12]
In: International Conference on Learning Representations (2021),https://arxiv.org/abs/2010
Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: International Conference on Learning Representations (2021),https://arxiv.org/abs/2010. 02502 9
work page 2021
-
[13]
Self-Attention Generative Adversarial Networks
Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adver- sarial networks. In: International Conference on Machine Learning. pp. 7354–7363. PMLR (2019),https://arxiv.org/abs/1805.08318 10 Supplementary Material This supplementary material provides extended details, additional experimental results, and comprehensive technical spec...
work page internal anchor Pith review Pith/arXiv arXiv 2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.