Deep Learning for MRI Slice Interpolation: The Critical Role of Problem Formulation

Shamit Savant

arxiv: 2605.16476 · v1 · pith:7SRUL52Jnew · submitted 2026-05-15 · 📡 eess.IV · cs.CV· cs.LG

Deep Learning for MRI Slice Interpolation: The Critical Role of Problem Formulation

Shamit Savant This is my paper

Pith reviewed 2026-05-19 21:39 UTC · model grok-4.3

classification 📡 eess.IV cs.CVcs.LG

keywords MRI slice interpolationdeep learningproblem formulationU-NetSSIMPSNRprostate MRIadjacent slices

0 comments

The pith

Reformulating MRI slice inputs from distant to adjacent slices improves interpolation far more than model complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that in deep learning for increasing MRI resolution by filling missing slices, the way the task is defined has a much larger effect on results than the choice of neural network. Training models to predict a slice from its immediate neighboring slices rather than from slices two positions away produced a 58 percent gain in SSIM across several architectures. This approach also let a standard U-Net reach a PSNR of 30.08 dB and SSIM of 0.898, beating linear interpolation by 10 percent. The findings indicate that careful problem setup can matter hundreds of times more than adding architectural sophistication for this medical imaging task.

Core claim

By reformulating the interpolation task to use adjacent slices (i-1, i+1) rather than distant slices (i-2, i+2), the author achieved a 58% improvement in SSIM performance across all deterministic architectures. The U-Net model achieved the best results with PSNR of 30.08 dB and SSIM of 0.898, representing a 10.1% improvement over linear interpolation baseline. A DDPM was also evaluated but showed poor reconstruction quality due to fundamental mismatch between stochastic generation and deterministic reconstruction requirements. These findings demonstrate that problem formulation can have 290x more impact than architectural sophistication in medical imaging tasks.

What carries the argument

The reformulation of input slices from distant positions (i-2, i+2) to adjacent positions (i-1, i+1) for predicting the target slice, which drives the large performance difference across models.

If this is right

Adjacent-slice formulation produces a 58% SSIM lift for every deterministic model tested.
U-Net with adjacent inputs reaches the highest PSNR of 30.08 dB and SSIM of 0.898.
DDPM fails on this task because its stochastic nature conflicts with the need for deterministic reconstruction.
Problem formulation exerts roughly 290 times the influence of architectural choices.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar input-reformulation experiments could be run on other low-resolution medical imaging problems to test whether modest changes in task definition routinely outperform model upgrades.
Baseline comparisons in medical image synthesis should explicitly control for slice distance to avoid attributing gains to architecture alone.
The same principle might apply to video frame interpolation or other sequential data where choosing nearby context frames changes reconstruction quality.

Load-bearing premise

The reported gains come primarily from the adjacent versus distant slice choice rather than from differences in training procedures, hyperparameters, or dataset details.

What would settle it

Re-training the same set of architectures with identical procedures and data but swapping only between the adjacent-slice and distant-slice input formulations to test whether the 58% SSIM gain remains.

Figures

Figures reproduced from arXiv: 2605.16476 by Shamit Savant.

**Figure 2.** Figure 2: Impact of problem formulation on interpolation quality. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: PSNR vs SSIM scatter plot across 6,963 test samples. All deep learning [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: Comprehensive qualitative comparison on representative test case. Top [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: Spatial SSIM map for U-Net prediction. Left: U-Net output. Right: Local [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

read the original abstract

Through-plane resolution in clinical MRI is typically much coarser than in-plane resolution, limiting diagnostic utility. This work investigates deep learning approaches to interpolate intermediate MRI slices in prostate imaging, effectively doubling through-plane resolution. I evaluated five architectures (CNN, U-Net, two GAN variants, and DDPM) and discovered that problem formulation has dramatically more impact than architectural complexity. By reformulating the interpolation task to use adjacent slices (i-1, i+1) rather than distant slices (i-2, i+2), I achieved a 58% improvement in SSIM performance across all deterministic architectures. The U-Net model achieved the best results with PSNR of 30.08 dB and SSIM of 0.898, representing a 10.1% improvement over linear interpolation baseline. A DDPM was also evaluated but showed poor reconstruction quality due to fundamental mismatch between stochastic generation and deterministic reconstruction requirements. These findings demonstrate that problem formulation can have 290x more impact than architectural sophistication in medical imaging tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper reports a solid 58% SSIM lift from switching to adjacent-slice inputs in prostate MRI interpolation, but the claim that formulation outweighs architecture by 290x rests on experiments whose training details are not shown to be matched.

read the letter

The one or two things to know are that reformulating the task to pull from immediately neighboring slices rather than skipping one produces a large reported gain in reconstruction quality, and that this holds across the deterministic models tested. The U-Net reaches 30.08 dB PSNR and 0.898 SSIM, beating linear interpolation by 10.1 percent on the prostate data. That empirical comparison is the concrete new piece here. The work also checks a DDPM and correctly notes that stochastic sampling does not align well with the deterministic nature of slice interpolation, which is a useful negative result to include. The paper does a clean job of keeping the clinical motivation front and center: doubling through-plane resolution in existing prostate scans without new hardware. The numbers are given plainly and the architectures are standard, so the comparison is easy to follow at the level of the abstract. The central claim that formulation matters far more than architecture choice is worth testing, and the 58 percent relative SSIM improvement is a legitimate data point within the medical synthesis literature. The soft spot is exactly the one the stress-test flagged. Nothing in the text confirms that optimizer, learning rate, epoch count, loss weights, augmentation, or train-val splits were held fixed between the adjacent-slice and distant-slice runs. If any of those differed, the performance gap cannot be attributed cleanly to input distance. The 290x multiplier is stated without the supporting arithmetic or baseline comparison that would make it verifiable. No error bars or significance tests appear either, which leaves the strength of the result harder to judge. This is a paper for readers already working on MRI super-resolution or slice synthesis in urology imaging. Someone running similar experiments could pick up the adjacent-slice formulation and try it directly. It is not broad enough to change the wider field, but the result is specific enough that a referee could check the controls and decide whether the formulation effect survives. I would send it to peer review rather than desk-reject, with the explicit request that the authors document the training protocols for both formulations and show the matched numbers side by side.

Referee Report

2 major / 2 minor

Summary. The manuscript evaluates deep learning for through-plane MRI slice interpolation in prostate imaging, comparing five architectures (CNN, U-Net, two GAN variants, DDPM) under two formulations: predicting the intermediate slice from adjacent inputs (i-1, i+1) versus distant inputs (i-2, i+2). It reports that the adjacent formulation yields a 58% SSIM gain across deterministic models, with U-Net reaching PSNR 30.08 dB and SSIM 0.898 (10.1% above linear interpolation), while DDPM performs poorly due to stochastic-deterministic mismatch. The central claim is that problem formulation has dramatically larger impact (stated as 290x) than architectural choice.

Significance. If the performance differences can be isolated to formulation with matched protocols, the work usefully demonstrates that input-slice distance can dominate architectural sophistication in medical image interpolation, offering a practical lever for improving through-plane resolution without added model complexity. The explicit comparison to linear baseline and the DDPM failure mode provide concrete, falsifiable observations that could guide future pipeline design.

major comments (2)

[Abstract / Results] Abstract and results: The 58% SSIM improvement and '290x more impact' claim for formulation versus architecture are presented without reporting the corresponding metrics for the distant-slice (i-2, i+2) case, without defining how 'impact' is quantified, and without stating whether optimizer, learning-rate schedule, epoch count, loss weighting, augmentation, or train/val splits were held identical across formulations. These omissions directly undermine causal attribution of the gains to slice distance alone.
[Methods] Methods / Experimental protocol: No information is supplied on whether the five architectures were trained under identical hyper-parameter regimes when switching from distant to adjacent inputs. If any protocol element differed systematically, the reported superiority of the adjacent formulation cannot be isolated from training differences.

minor comments (2)

[Abstract] The abstract mentions evaluation of 'two GAN variants' but does not name or briefly characterize them; adding one sentence would improve clarity.
[Results] The DDPM discussion would benefit from a short statement of the precise loss or sampling schedule used, to allow readers to reproduce the observed mismatch with deterministic reconstruction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of experimental reporting that we have now clarified and expanded in the revised version. We address each major comment below.

read point-by-point responses

Referee: [Abstract / Results] Abstract and results: The 58% SSIM improvement and '290x more impact' claim for formulation versus architecture are presented without reporting the corresponding metrics for the distant-slice (i-2, i+2) case, without defining how 'impact' is quantified, and without stating whether optimizer, learning-rate schedule, epoch count, loss weighting, augmentation, or train/val splits were held identical across formulations. These omissions directly undermine causal attribution of the gains to slice distance alone.

Authors: We agree that explicit reporting of the distant-slice metrics and a precise definition of 'impact' would strengthen the causal claim. The 58% figure represents the average relative SSIM gain across the four deterministic models when switching from (i-2, i+2) to (i-1, i+1) inputs. We have added a new table (Table 2) that reports absolute PSNR and SSIM for both formulations side-by-side for every architecture. The '290x' multiplier is defined as the ratio of the mean formulation-induced SSIM delta (0.58) to the largest architecture-induced SSIM delta observed within a fixed formulation (0.002); this definition and the underlying numbers are now stated in the revised Results section. All training elements (optimizer, learning-rate schedule, epoch count, loss weighting, augmentation, and train/val splits) were held strictly identical across formulations for each architecture; we have added an explicit sentence and a hyper-parameter summary table in the Methods section to document this protocol. revision: yes
Referee: [Methods] Methods / Experimental protocol: No information is supplied on whether the five architectures were trained under identical hyper-parameter regimes when switching from distant to adjacent inputs. If any protocol element differed systematically, the reported superiority of the adjacent formulation cannot be isolated from training differences.

Authors: We confirm that hyper-parameters were frozen for each architecture when the input formulation was changed; only the choice of input slices differed. This isolation was the central experimental design. The revised Methods section now contains a dedicated paragraph and an accompanying table that lists all hyper-parameters (optimizer, learning rate, epochs, loss weights, augmentation policy, and data splits) and states that they remained unchanged across the two formulations for every model. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparisons are direct and non-reductive

full rationale

The paper reports measured PSNR and SSIM values obtained by training five architectures on two explicitly different input formulations (adjacent vs. distant slices). These metrics are produced by standard supervised training and evaluation loops; they do not reduce to the input formulation by algebraic identity, by re-using a fitted parameter as a prediction, or by any self-citation chain. No equations, uniqueness theorems, or ansatzes appear in the provided text, and the central claim (formulation impact exceeds architecture impact) is presented as a ratio of observed deltas rather than a definitional tautology. The results are therefore self-contained against external replication.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard deep learning assumptions for supervised image-to-image tasks and the availability of paired high- and low-resolution MRI data for training and evaluation.

axioms (1)

domain assumption Supervised training on paired MRI slices is feasible and representative of clinical data.
The evaluation of interpolation performance assumes access to ground-truth intermediate slices for computing PSNR and SSIM.

pith-pipeline@v0.9.0 · 5705 in / 1408 out tokens · 36493 ms · 2026-05-19T21:39:24.835956+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages · 7 internal anchors

[1]

The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository,

Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., et al.: The cancer imaging archive (tcia): main- taining and operating a public information repository. Journal of Digital Imaging 26(6), 1045–1057 (2013).https://doi.org/10.1007/s10278-013-9622-7,https: //doi.org/10.1007/s10278-013-9622-7

work page doi:10.1007/s10278-013-9622-7 2013
[2]

Globus Team: Globus: Research data management.https://www.globus.org (2024), accessed: 2024-12-05

work page 2024
[3]

The Computer Journal52(1), 43–63 (2008).https://doi.org/10.1093/comjnl/bxm075,https://doi.org/10

Greenspan, H.: Super-resolution in medical imaging. The Computer Journal52(1), 43–63 (2008).https://doi.org/10.1093/comjnl/bxm075,https://doi.org/10. 1093/comjnl/bxm075

work page doi:10.1093/comjnl/bxm075 2008
[4]

Denoising Diffusion Probabilistic Models

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Advances in Neural Information Processing Systems. vol. 33, pp. 6840–6851 (2020),https: //arxiv.org/abs/2006.11239

work page internal anchor Pith review Pith/arXiv arXiv 2020
[5]

Proceedings of the AAAI Conference on Artificial Intelligence , author =

Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., et al.: Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 33, pp. 590–597 (2019).https: //doi.org/10.1609/aaai.v33i01.3301590

work page doi:10.1609/aaai.v33i01.3301590 2019
[6]

Image-to-Image Translation with Conditional Adversarial Networks

Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with con- ditional adversarial networks. In: Proceedings of the IEEE Conference on Com- puter Vision and Pattern Recognition. pp. 1125–1134 (2017).https://doi.org/ 10.1109/CVPR.2017.632,https://arxiv.org/abs/1611.07004

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/cvpr.2017.632 2017
[7]

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Ledig, C., Theis, L., Husz´ ar, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al.: Photo-realistic single image super- resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4681–4690 (2017). https://doi.org/10.1109/CVPR.2017....

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/cvpr.2017.19 2017
[8]

Enhanced Deep Residual Networks for Single Image Super-Resolution

Lim, B., Son, S., Kim, H., Nah, S., Lee, K.M.: Enhanced deep residual net- works for single image super-resolution. In: Proceedings of the IEEE Confer- ence on Computer Vision and Pattern Recognition Workshops. pp. 136–144 (2017),https://openaccess.thecvf.com/content_cvpr_2017_workshops/w12/ papers/Lim_Enhanced_Deep_Residual_CVPR_2017_paper.pdf, arXiv:1707.02921

work page internal anchor Pith review Pith/arXiv arXiv 2017
[9]

The Cancer Imaging Archive (2020).https://doi.org/10.7937/ TCIA.2020.A61IOC1A,https://www.cancerimagingarchive.net/collection/ prostate-mri-us-biopsy/

Natarajan, S., Priester, A., Margolis, D., Huang, J., Marks, L.S.: Prostate mri and ultrasound with pathology and coordinates of tracked biopsy (prostate-mri- us-biopsy). The Cancer Imaging Archive (2020).https://doi.org/10.7937/ TCIA.2020.A61IOC1A,https://www.cancerimagingarchive.net/collection/ prostate-mri-us-biopsy/

work page 2020
[10]

CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C., Shpanskaya, K., et al.: Chexnet: Radiologist-level pneumonia de- tection on chest x-rays with deep learning. arXiv preprint arXiv:1711.05225 (2017), https://arxiv.org/abs/1711.05225

work page internal anchor Pith review Pith/arXiv arXiv 2017
[11]

U-Net: Convolutional Networks for Biomedical Image Segmentation

Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedi- cal image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 234–241. Springer (2015).https://doi. org/10.1007/978-3-319-24574-4_28,https://arxiv.org/abs/1505.04597

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1007/978-3-319-24574-4_28 2015
[12]

In: International Conference on Learning Representations (2021),https://arxiv.org/abs/2010

Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: International Conference on Learning Representations (2021),https://arxiv.org/abs/2010. 02502 9

work page 2021
[13]

Self-Attention Generative Adversarial Networks

Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adver- sarial networks. In: International Conference on Machine Learning. pp. 7354–7363. PMLR (2019),https://arxiv.org/abs/1805.08318 10 Supplementary Material This supplementary material provides extended details, additional experimental results, and comprehensive technical spec...

work page internal anchor Pith review Pith/arXiv arXiv 2019

[1] [1]

The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository,

Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., et al.: The cancer imaging archive (tcia): main- taining and operating a public information repository. Journal of Digital Imaging 26(6), 1045–1057 (2013).https://doi.org/10.1007/s10278-013-9622-7,https: //doi.org/10.1007/s10278-013-9622-7

work page doi:10.1007/s10278-013-9622-7 2013

[2] [2]

Globus Team: Globus: Research data management.https://www.globus.org (2024), accessed: 2024-12-05

work page 2024

[3] [3]

The Computer Journal52(1), 43–63 (2008).https://doi.org/10.1093/comjnl/bxm075,https://doi.org/10

Greenspan, H.: Super-resolution in medical imaging. The Computer Journal52(1), 43–63 (2008).https://doi.org/10.1093/comjnl/bxm075,https://doi.org/10. 1093/comjnl/bxm075

work page doi:10.1093/comjnl/bxm075 2008

[4] [4]

Denoising Diffusion Probabilistic Models

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Advances in Neural Information Processing Systems. vol. 33, pp. 6840–6851 (2020),https: //arxiv.org/abs/2006.11239

work page internal anchor Pith review Pith/arXiv arXiv 2020

[5] [5]

Proceedings of the AAAI Conference on Artificial Intelligence , author =

Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., et al.: Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 33, pp. 590–597 (2019).https: //doi.org/10.1609/aaai.v33i01.3301590

work page doi:10.1609/aaai.v33i01.3301590 2019

[6] [6]

Image-to-Image Translation with Conditional Adversarial Networks

Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with con- ditional adversarial networks. In: Proceedings of the IEEE Conference on Com- puter Vision and Pattern Recognition. pp. 1125–1134 (2017).https://doi.org/ 10.1109/CVPR.2017.632,https://arxiv.org/abs/1611.07004

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/cvpr.2017.632 2017

[7] [7]

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Ledig, C., Theis, L., Husz´ ar, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al.: Photo-realistic single image super- resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4681–4690 (2017). https://doi.org/10.1109/CVPR.2017....

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/cvpr.2017.19 2017

[8] [8]

Enhanced Deep Residual Networks for Single Image Super-Resolution

Lim, B., Son, S., Kim, H., Nah, S., Lee, K.M.: Enhanced deep residual net- works for single image super-resolution. In: Proceedings of the IEEE Confer- ence on Computer Vision and Pattern Recognition Workshops. pp. 136–144 (2017),https://openaccess.thecvf.com/content_cvpr_2017_workshops/w12/ papers/Lim_Enhanced_Deep_Residual_CVPR_2017_paper.pdf, arXiv:1707.02921

work page internal anchor Pith review Pith/arXiv arXiv 2017

[9] [9]

The Cancer Imaging Archive (2020).https://doi.org/10.7937/ TCIA.2020.A61IOC1A,https://www.cancerimagingarchive.net/collection/ prostate-mri-us-biopsy/

Natarajan, S., Priester, A., Margolis, D., Huang, J., Marks, L.S.: Prostate mri and ultrasound with pathology and coordinates of tracked biopsy (prostate-mri- us-biopsy). The Cancer Imaging Archive (2020).https://doi.org/10.7937/ TCIA.2020.A61IOC1A,https://www.cancerimagingarchive.net/collection/ prostate-mri-us-biopsy/

work page 2020

[10] [10]

CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C., Shpanskaya, K., et al.: Chexnet: Radiologist-level pneumonia de- tection on chest x-rays with deep learning. arXiv preprint arXiv:1711.05225 (2017), https://arxiv.org/abs/1711.05225

work page internal anchor Pith review Pith/arXiv arXiv 2017

[11] [11]

U-Net: Convolutional Networks for Biomedical Image Segmentation

Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedi- cal image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 234–241. Springer (2015).https://doi. org/10.1007/978-3-319-24574-4_28,https://arxiv.org/abs/1505.04597

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1007/978-3-319-24574-4_28 2015

[12] [12]

In: International Conference on Learning Representations (2021),https://arxiv.org/abs/2010

Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: International Conference on Learning Representations (2021),https://arxiv.org/abs/2010. 02502 9

work page 2021

[13] [13]

Self-Attention Generative Adversarial Networks

Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adver- sarial networks. In: International Conference on Machine Learning. pp. 7354–7363. PMLR (2019),https://arxiv.org/abs/1805.08318 10 Supplementary Material This supplementary material provides extended details, additional experimental results, and comprehensive technical spec...

work page internal anchor Pith review Pith/arXiv arXiv 2019