MRecover: A Conditional Generative Model for Recovering Motion-Corrupted MR images Using AI Generated Contrast
Pith reviewed 2026-05-22 09:06 UTC · model grok-4.3
The pith
A conditional generative model turns routine T1-weighted scans into high-resolution T2-weighted images that recover motion-corrupted hippocampal subfield details.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MRecover is a conditional generative model that synthesizes T2-weighted turbo spin echo (TSE) images from T1-weighted inputs using autoregressive slice conditioning. Trained on 577 7T volumes, it attains an SSIM of 0.84 and an FSIM of 0.94 on the in-domain validation set (n=148). On 416 out-of-domain 3T cases, subfield volumes measured on synthesized images correlate with those from acquired images at r=0.87-0.97. In the motion-affected ADNI3 dataset, synthesis yields 593 analyzable subjects versus 450 and raises the effect sizes for diagnostic group differences in hippocampal atrophy (whole hippocampus ε², left-right hemispheres) from 0.086-0.062 to 0.121-0.100.
What carries the argument
MRecover, a conditional generative model that uses autoregressive slice conditioning to synthesize TSE contrast from T1w images while preserving volumetric consistency across slices.
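The idea of autoregressive slice conditioning can be sketched as a loop in which each synthesized slice is passed back as context for the next one. This is a minimal illustration, not the paper's implementation: the function names `synthesize_volume` and `toy_generator` are hypothetical, the toy generator is an arbitrary stand-in for the trained model, and the assumption that exactly one previous slice is used as context is ours.

```python
from typing import Callable, List, Optional

Slice = List[List[float]]  # one 2D slice as nested lists of intensities


def synthesize_volume(
    t1_slices: List[Slice],
    synthesize_slice: Callable[[Slice, Optional[Slice]], Slice],
) -> List[Slice]:
    """Generate slices in order, feeding each output back as context."""
    outputs: List[Slice] = []
    previous: Optional[Slice] = None  # no context exists for the first slice
    for t1 in t1_slices:
        tse = synthesize_slice(t1, previous)
        outputs.append(tse)
        previous = tse  # the next slice is conditioned on this result
    return outputs


def toy_generator(t1: Slice, previous: Optional[Slice]) -> Slice:
    """Hypothetical stand-in for the model: invert intensities, then
    average with the previous output to mimic cross-slice smoothing."""
    inverted = [[1.0 - v for v in row] for row in t1]
    if previous is None:
        return inverted
    return [
        [0.5 * a + 0.5 * b for a, b in zip(row_new, row_prev)]
        for row_new, row_prev in zip(inverted, previous)
    ]


# Three tiny 1x2 "slices" with invented intensities.
volume = [[[0.0, 1.0]], [[1.0, 1.0]], [[0.0, 0.0]]]
result = synthesize_volume(volume, toy_generator)
```

The design point is only that slice k depends on the output for slice k-1, which is what ties neighbouring slices together; how the real model encodes that context is not specified in this summary.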
If this is right
- In the motion-affected ADNI3 dataset, 31.8 percent more subjects pass quality control when synthesized images are used (593 versus 450).
- With the larger analyzable sample, the paper reports larger effect sizes for diagnostic group differences in hippocampal subfield atrophy.
- The model generalizes from 7T training data to 3T clinical scans without retraining.
- Volume measurements extracted from synthesized images match those from acquired images at correlations of 0.87 to 0.97.
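The 31.8 percent figure follows directly from the two subject counts reported in the abstract. A short check of the arithmetic, with variable names chosen here for illustration:

```python
# Subject counts after quality control, as reported in the abstract.
acquired_ok = 450      # analyzable subjects using as-acquired TSE images
synthesized_ok = 593   # analyzable subjects using synthesized TSE images

extra_subjects = synthesized_ok - acquired_ok
gain_percent = 100.0 * extra_subjects / acquired_ok

print(f"{extra_subjects} additional subjects, {gain_percent:.1f}% gain")
```

The gain is expressed relative to the 450 subjects that were already analyzable, not relative to the full cohort.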
Where Pith is reading between the lines
- If boundary fidelity holds beyond volume totals, the approach could support more precise longitudinal tracking of subfield atrophy rates.
- Wider adoption might reduce repeat-scan rates for motion-sensitive sequences in memory-clinic workflows.
- Similar synthesis pipelines could be tested on other motion-vulnerable contrasts to improve overall MRI data yield.
Load-bearing premise
That close agreement in measured subfield volumes between synthesized and acquired images guarantees that fine anatomical boundaries have been recovered accurately enough for reliable segmentation.
What would settle it
Expert manual segmentation of motion-free 3T TSE images versus the same subjects' synthesized versions, checking whether boundary placement errors exceed the precision needed to detect the reported atrophy differences.
original abstract
Hippocampal subfield segmentation requires high-resolution T2w turbo spin echo (TSE) MRI, yet this sequence is susceptible to motion artifacts, leading to substantial data loss. We developed a conditional generative model (MRecover) that synthesizes routinely acquired T1w images to create TSE images with autoregressive slice conditioning for volumetric consistency. Trained on 7T MRI data (n=577), the model achieved high in-domain fidelity (n=148, SSIM=0.84, FSIM=0.94) and generalized well to out-of-domain 3T data: subfield volumes from synthesized and the as-acquired images closely matched: (n=416, r=0.87-0.97) and yielded 31.8% more analyzable subjects in the motion-affected ADNI3 dataset after quality control (593 vs 450). The synthesized images also achieved larger effect sizes due to increasing the sample size for diagnostic group differences in hippocampal subfield atrophy (whole hippocampus $\epsilon^2$= 0.121-0.100 vs. 0.086-0.062, left-right hemispheres). Project page: https://jinghangli98.github.io/MRecover/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MRecover, a conditional generative model that synthesizes high-resolution T2-weighted TSE images from routinely acquired T1-weighted images with autoregressive slice conditioning to recover motion-corrupted data for hippocampal subfield segmentation. Trained on 7T data (n=577), it reports in-domain fidelity metrics (SSIM=0.84, FSIM=0.94 on n=148) and out-of-domain generalization to 3T data with subfield volume correlations (r=0.87-0.97 on n=416). Application to the motion-affected ADNI3 dataset increases the number of analyzable subjects from 450 to 593 after quality control and produces larger effect sizes for diagnostic differences in hippocampal subfield atrophy (whole hippocampus ε²=0.121-0.100 versus 0.086-0.062 for left-right hemispheres).
Significance. If the synthesized images support accurate subfield segmentation, the method could meaningfully reduce data loss in motion-sensitive MRI protocols and increase statistical power for detecting atrophy patterns in Alzheimer's and related studies. The reported gains in sample size and effect sizes on a real-world dataset like ADNI3 indicate practical utility for neuroimaging pipelines that rely on TSE sequences.
major comments (2)
- [Out-of-domain 3T evaluation] Out-of-domain 3T evaluation (n=416): subfield volume correlations (r=0.87-0.97) are reported between synthesized and acquired images, but no Dice scores, Hausdorff distances, or boundary-error statistics are provided for the subfield segmentations on this test set. Because the headline claims of +31.8% more analyzable subjects and larger effect sizes (ε²=0.121-0.100) rest on the assumption that local anatomical boundaries are recovered faithfully rather than merely matching coarse volumes, the absence of these metrics leaves the downstream segmentation reliability unverified.
- [ADNI3 application] ADNI3 application results: the increase from 450 to 593 analyzable subjects and the reported improvement in group-difference effect sizes are presented as direct benefits of the synthesized images, yet these conclusions depend on the untested premise that subfield segmentations on the additional subjects reflect true anatomy rather than plausible but distorted boundaries that could inflate or attenuate the observed ε² values.
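The boundary metrics requested in the first major comment can be sketched as follows. This is a plain-Python illustration under stated assumptions: masks are represented as sets of voxel coordinates, distances are in voxel units rather than millimetres, the Hausdorff distance is taken over all mask voxels rather than extracted surfaces, and the two toy masks are invented for the example. The function names `dice` and `hausdorff` are ours.

```python
import math
from typing import Set, Tuple

Voxel = Tuple[int, int, int]


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice overlap: 2|A and B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty masks agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    if not a or not b:
        raise ValueError("both masks must be non-empty")

    def directed(src: Set[Voxel], dst: Set[Voxel]) -> float:
        # Largest distance from any voxel in src to its nearest voxel in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Invented toy masks: a 4x4 patch and the same patch shifted by 2 voxels.
acquired_mask = {(x, y, 0) for x in range(4) for y in range(4)}
synthesized_mask = {(x + 2, y, 0) for x in range(4) for y in range(4)}

overlap = dice(acquired_mask, synthesized_mask)
boundary_error = hausdorff(acquired_mask, synthesized_mask)
```

In this toy case the two masks have identical volume (16 voxels each) yet a Dice score of 0.5 and a Hausdorff distance of 2 voxels, which is the kind of disagreement that a volume comparison alone cannot reveal.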
minor comments (2)
- [Abstract] The abstract and title refer to 'AI Generated Contrast' while the method description emphasizes conditional generation with autoregressive conditioning; a brief clarification of terminology would improve consistency.
- [Methods] Details on subject-level train/validation/test splits for the 7T training data (n=577) and any steps taken to avoid leakage across slices or subjects would strengthen reproducibility claims.
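The leakage concern in the Methods comment arises when slices from one subject land in both training and validation sets. One common way to avoid it is to split by subject first and assign slices afterwards. The sketch below assumes slice records of the form (subject ID, slice index); the function name `split_by_subject`, the 20 percent validation fraction, and the subject IDs are hypothetical and not taken from the paper.

```python
import random
from typing import Dict, List, Tuple

Record = Tuple[str, int]  # (subject ID, slice index)


def split_by_subject(
    records: List[Record], val_fraction: float = 0.2, seed: int = 0
) -> Dict[str, List[Record]]:
    """Assign whole subjects, never individual slices, to train or val."""
    subjects = sorted({subject for subject, _ in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(subjects)
    n_val = max(1, round(len(subjects) * val_fraction))
    val_subjects = set(subjects[:n_val])

    split: Dict[str, List[Record]] = {"train": [], "val": []}
    for record in records:
        key = "val" if record[0] in val_subjects else "train"
        split[key].append(record)
    return split


# Ten invented subjects with five slices each.
records = [(f"sub-{s:03d}", z) for s in range(10) for z in range(5)]
split = split_by_subject(records)
train_ids = {subject for subject, _ in split["train"]}
val_ids = {subject for subject, _ in split["val"]}
```

Checking that `train_ids` and `val_ids` are disjoint is a cheap guard worth keeping in any data-loading pipeline of this kind.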
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which have helped us improve the manuscript. We address the major comments point by point below.
point-by-point responses
- Referee: [Out-of-domain 3T evaluation] Out-of-domain 3T evaluation (n=416): subfield volume correlations (r=0.87-0.97) are reported between synthesized and acquired images, but no Dice scores, Hausdorff distances, or boundary-error statistics are provided for the subfield segmentations on this test set. Because the headline claims of +31.8% more analyzable subjects and larger effect sizes (ε²=0.121-0.100) rest on the assumption that local anatomical boundaries are recovered faithfully rather than merely matching coarse volumes, the absence of these metrics leaves the downstream segmentation reliability unverified.
  Authors: We thank the referee for highlighting this important point. Although volume correlations provide evidence of overall structural agreement, we agree that metrics assessing local boundary accuracy, such as Dice coefficients and Hausdorff distances for subfield segmentations, would more directly support the reliability of the synthesized images for downstream analysis. We will compute and include these additional metrics in the revised version of the manuscript to verify the boundary fidelity on the out-of-domain 3T set.
  Revision: yes
- Referee: [ADNI3 application] ADNI3 application results: the increase from 450 to 593 analyzable subjects and the reported improvement in group-difference effect sizes are presented as direct benefits of the synthesized images, yet these conclusions depend on the untested premise that subfield segmentations on the additional subjects reflect true anatomy rather than plausible but distorted boundaries that could inflate or attenuate the observed ε² values.
  Authors: We acknowledge that the ADNI3 results rely on the generalization of the model from validated settings. Since the additional subjects in ADNI3 had motion corruption preventing acquisition of usable TSE images, direct comparison to ground truth is inherently not possible. Our approach is supported by the strong out-of-domain performance on 3T data where ground truth is available. To address this concern, we will expand the discussion section to explicitly state the assumptions underlying the ADNI3 analysis and discuss the potential impact of any boundary distortions on the effect sizes.
  Revision: partial
- Direct ground-truth validation of subfield segmentations is not possible for the motion-affected subjects in the ADNI3 application, as no usable TSE reference images are available for those cases.
Circularity Check
No significant circularity; results rely on independent held-out validation
full rationale
The paper trains MRecover on 7T data (n=577) and reports in-domain fidelity plus out-of-domain 3T generalization via direct comparison of synthesized vs. acquired subfield volumes (r=0.87-0.97, n=416) on independently acquired images. These correlations and the downstream claims (larger effect sizes, +31.8% analyzable subjects) are measured against external ground-truth acquisitions rather than reducing to fitted parameters or self-referential definitions. No equations, self-citations, or ansatzes are shown to be load-bearing in a way that forces the headline results by construction. The derivation chain remains self-contained against external benchmarks.
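The agreement statistic cited in the rationale is a Pearson correlation between paired volume measurements. A minimal sketch of that computation, assuming paired lists of subfield volumes for the same subjects; the function name `pearson_r` is ours and the volumes below are invented for illustration, not taken from the paper.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Invented subfield volumes (mm^3) for five hypothetical subjects.
acquired_volumes = [510.0, 480.0, 620.0, 555.0, 430.0]
synthesized_volumes = [500.0, 470.0, 640.0, 540.0, 450.0]

r = pearson_r(acquired_volumes, synthesized_volumes)
print(f"r = {r:.3f}")
```

Because the statistic is computed on one number per subject and subfield, it summarizes agreement in total volume and carries no information about where the segmentation boundaries fall.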
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Synthesized TSE images preserve the anatomical boundaries required for accurate hippocampal subfield segmentation when volume correlations with acquired images exceed 0.87.