Wavelet-Fusion Diffusion Model for Multimodal Brain MRI Synthesis with Modality and Metadata Conditioning

Mangor Pedersen; Muhammad Nabi Yasinzai; Remika Mito

arxiv: 2606.00689 · v1 · pith:ZFDVCHZMnew · submitted 2026-05-30 · 💻 cs.CV

Pith reviewed 2026-06-28 18:40 UTC · model grok-4.3

classification 💻 cs.CV

keywords: MRI synthesis · diffusion models · multimodal brain imaging · latent diffusion · wavelet fusion · conditional generation · neuroimaging augmentation

The pith

A wavelet-fusion variational autoencoder paired with a conditional diffusion model produces synthetic multimodal brain MRI volumes that match real data distributions more closely than prior generators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a synthesis method to fill gaps in multimodal MRI datasets that suffer from uneven modality coverage across sites, scanners, and incomplete metadata. It compresses volumes into a latent space with a wavelet-fusion variational autoencoder, then trains a 3D U-Net diffusion model in that space while conditioning on target modality and available demographic or clinical variables. The resulting model is evaluated on distributional alignment metrics and reported to outperform other synthetic MRI generators. A reader would care because better synthetic volumes could support dataset augmentation for downstream neuroimaging AI without requiring new patient scans under every protocol combination.
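To make the latent-diffusion step concrete, here is a minimal sketch of the forward noising process that a diffusion model is trained to invert. It assumes the standard DDPM formulation with a linear beta schedule, which the summary above does not confirm as the authors' choice. The function names, the 1000-step schedule, and the toy four-value "latent" are all hypothetical, and the conditional denoiser itself is not shown.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_i) for i = 0..t under a linear beta schedule.

    The schedule values are common DDPM defaults, assumed here for illustration.
    """
    product = 1.0
    for i in range(t + 1):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        product *= 1.0 - beta
    return product


def add_noise(latent, t, rng):
    """Sample z_t = sqrt(alpha_bar_t) * z_0 + sqrt(1 - alpha_bar_t) * eps."""
    ab = alpha_bar(t)
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [math.sqrt(ab) * z + math.sqrt(1.0 - ab) * e
             for z, e in zip(latent, noise)]
    return noisy, noise


rng = random.Random(0)
latent = [0.5, -1.0, 2.0, 0.0]  # stand-in for a flattened VAE latent
noisy_early, _ = add_noise(latent, t=0, rng=rng)    # almost unchanged
noisy_late, _ = add_noise(latent, t=999, rng=rng)   # almost pure noise
print(alpha_bar(0), alpha_bar(999))
```

In a conditional setup, the denoising network would receive the noisy latent, the step index, and the modality and metadata condition, and would be trained to predict the noise that was added.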

Core claim

The Wavelet-Fusion Diffusion Model combines a Wavelet-Fusion variational autoencoder with a conditional 3D U-Net diffusion model trained in the learned latent space using explicit modality and metadata conditioning, and it achieved the strongest distributional alignment among the evaluated synthetic MRI generators.

What carries the argument

The Wavelet-Fusion variational autoencoder (WF-VAE) that serves as the latent compressor, combined with modality-and-metadata-conditioned diffusion in the resulting 3D latent space.
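As a rough illustration of what a wavelet decomposition contributes, the sketch below applies one level of the orthonormal Haar transform to a 1D intensity profile. This is only the generic building block: the text above does not describe how WF-VAE fuses wavelet bands, which wavelet it uses, or how the 3D case is handled, so the function names and the sample values are hypothetical.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail): a coarse half-resolution band and the
    high-frequency band needed to restore the original exactly.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * scale for a, b in pairs]
    detail = [(a - b) * scale for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward by recombining the two bands."""
    scale = 1.0 / math.sqrt(2.0)
    restored = []
    for low, high in zip(approx, detail):
        restored.extend([(low + high) * scale, (low - high) * scale])
    return restored


intensities = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]
approx, detail = haar_forward(intensities)
restored = haar_inverse(approx, detail)
print(approx, detail, restored)
```

The approximation band carries coarse structure at half resolution while the detail band carries edges, which is the multi-scale split that a wavelet-based encoder can exploit.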

If this is right

Synthetic volumes can be generated for any target modality even when that modality is absent from a given subject's acquisition.
Conditioning on metadata enables controlled creation of synthetic cohorts stratified by age, sex, or clinical variables.
Latent-space diffusion reduces the computational cost of sampling compared with voxel-space diffusion while preserving sample fidelity.
The approach supports augmentation of pooled neuroimaging resources whose modality and protocol coverage varies widely.
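The computational argument for latent-space diffusion can be checked with simple arithmetic. The sketch below counts how many values the denoiser must process per sample in voxel space versus latent space. The volume shape, downsampling factor, and latent channel count are hypothetical placeholders, not figures reported for WFDM.

```python
from math import prod


def latent_elements(volume_shape, downsample, latent_channels):
    """Number of values in a latent tensor for a given per-axis downsampling factor."""
    spatial = [size // downsample for size in volume_shape]
    return latent_channels * prod(spatial)


# Hypothetical numbers chosen for illustration only.
volume_shape = (192, 192, 192)
voxels = prod(volume_shape)
latents = latent_elements(volume_shape, downsample=4, latent_channels=4)
reduction = voxels / latents
print(voxels, latents, reduction)
```

Under these assumed settings each denoising step touches 16 times fewer values, and that saving repeats at every sampling step.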

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the latent compressor preserves fine anatomical detail across scanners, the same model could be fine-tuned on new sites with minimal additional data.
Metadata conditioning might allow synthesis of rare demographic combinations that are underrepresented in real cohorts, enabling stress-testing of downstream classifiers.
The wavelet-fusion step could be swapped for other multi-scale encoders, opening a route to test whether the performance gain comes mainly from the fusion or from the diffusion stage itself.

Load-bearing premise

Generation quality depends on the autoencoder's reconstruction fidelity and the resulting latent distribution, and the conditional model can generalize across heterogeneous sites, scanners, acquisition protocols, and sparse or inconsistently recorded metadata.
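One common way to condition on sparse metadata is to pair each variable with a "known" flag so the model can tell a missing value from a real zero. The sketch below shows that idea. The modality names are the four listed in the figure captions, but the encoding itself (one-hot modality, age scaled by 100, binary sex code, mask flags) is an assumption for illustration, not the scheme the paper describes.

```python
MODALITIES = ["T1w", "T2w", "T1Gd", "FLAIR"]


def encode_condition(modality, age=None, sex=None):
    """Build a conditioning vector: one-hot modality, then (value, known) pairs.

    Missing metadata is encoded as value 0.0 with known flag 0.0.
    """
    if modality not in MODALITIES:
        raise ValueError(f"unknown modality: {modality}")
    one_hot = [1.0 if m == modality else 0.0 for m in MODALITIES]
    age_known = age is not None
    sex_known = sex in ("F", "M")
    return one_hot + [
        age / 100.0 if age_known else 0.0,                    # scaled age
        float(age_known),                                      # age mask
        (1.0 if sex == "M" else 0.0) if sex_known else 0.0,    # sex code
        float(sex_known),                                      # sex mask
    ]


full = encode_condition("FLAIR", age=50, sex="M")
sparse = encode_condition("T1w")  # no metadata recorded
print(full)
print(sparse)
```

A vector like this would typically be embedded and injected into the denoiser alongside the timestep embedding.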

What would settle it

A quantitative comparison on an independent multi-site test set, using metrics such as Fréchet inception distance or maximum mean discrepancy between real and synthetic distributions, in which the proposed model no longer ranks first would falsify the central performance claim.
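For readers unfamiliar with maximum mean discrepancy, here is a minimal sketch of the biased squared-MMD estimator with an RBF kernel, applied to toy feature vectors. It assumes features have already been extracted from real and synthetic volumes. The kernel bandwidth, sample sizes, and Gaussian toy data are hypothetical, and the paper's actual evaluation protocol is not given in the text above.

```python
import math
import random


def rbf(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd2(xs, ys, gamma=0.5):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(a, b):
        total = sum(rbf(u, v, gamma) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


rng = random.Random(0)


def sample(mean, n=60, dim=4):
    """Draw n toy feature vectors from an isotropic Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


real = sample(0.0)
close = sample(0.0)   # stands in for a well-aligned generator
far = sample(2.0)     # stands in for a poorly aligned generator
mmd_close = mmd2(real, close)
mmd_far = mmd2(real, far)
print(mmd_close, mmd_far)
```

A lower value indicates closer distributional alignment, so the ranking test described above amounts to comparing such scores across generators on held-out sites.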

Figures

Figures reproduced from arXiv: 2606.00689 by Mangor Pedersen, Muhammad Nabi Yasinzai, Remika Mito.

**Figure 3.** Qualitative axial-slice comparison across the four target modalities. Rows correspond to T1w, …

**Figure 4.** Qualitative comparison for T1w synthesis. A real BrainScape MRI is compared with synthetic …

**Figure 5.** Qualitative comparison for T2w synthesis. The real BrainScape target image is compared with …

**Figure 6.** Qualitative comparison for T1Gd synthesis. The real BrainScape target image is compared …

**Figure 7.** Qualitative comparison for FLAIR synthesis. The real BrainScape target image is compared …

Original abstract

Multimodal MRI provides complementary information for neuroimaging analysis, where different imaging modalities capture distinct anatomical, tissue, and pathological features that support the development and evaluation of downstream AI applications. Although large-scale structural MRI resources are increasingly available, their modality coverage is often uneven across public and pooled neuroimaging datasets. This uneven modality coverage is further complicated by heterogeneity across sites, scanners, and acquisition protocols, as well as demographic and clinical variables that are often sparse, inconsistently recorded, or unavailable across studies. Synthetic MRI generation can help address this imbalance by synthesizing target-modality volumes for dataset augmentation and controlled synthetic cohort creation. However, many existing MRI synthesis approaches are trained on narrow modality sets or relatively homogeneous cohorts, limiting their applicability to large pooled neuroimaging resources where modality availability, acquisition protocols, and metadata coverage vary substantially across datasets. Diffusion models have become an attractive approach for MRI synthesis because of their strong sample fidelity and diversity, but sampling directly in 3D voxel space is computationally expensive and slow at inference. Latent diffusion improves practicality by synthesizing MRI in a learned, 3D latent space, although generation quality depends on the autoencoder's reconstruction fidelity and the resulting latent distribution. Our approach combines a Wavelet-Fusion variational autoencoder (WF-VAE) latent compressor with a conditional 3D U-Net diffusion model trained in the learned latent space using explicit modality and metadata conditioning. Our proposed Wavelet-Fusion Diffusion Model (WFDM) achieved the strongest distributional alignment among the evaluated synthetic MRI generators.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces a wavelet-fusion VAE plus modality/metadata-conditioned 3D latent diffusion for MRI synthesis, but the abstract states a top alignment claim with zero metrics, datasets, or baselines shown.

The concrete new element is the WF-VAE compressor paired with a conditional 3D U-Net diffusion model that takes explicit modality and metadata inputs. That setup targets the real problem of uneven modality coverage and site heterogeneity in pooled neuroimaging data.

The approach makes sense on paper: wavelet fusion in the VAE could preserve multi-scale features better than a standard encoder, and the conditioning tries to make generation work across scanners and sparse metadata. The abstract correctly notes that output quality hinges on the autoencoder's reconstruction fidelity and on cross-site generalization.

The soft spot is the performance claim itself. It says WFDM achieved the strongest distributional alignment among evaluated generators, yet supplies no numbers, no dataset descriptions, no baseline names, and no evaluation details. Without those, the claim cannot be checked. The stress-test note is right that no internal contradiction appears in the given text, but that does not substitute for evidence.

This paper is for researchers already working on latent diffusion or medical image synthesis who want to see one more conditioning trick. A reader outside that niche will not get much. The architecture description might be worth skimming if the full experiments turn out to be reproducible, but the current write-up is too light on results to judge.

I would send it to peer review so the experiments can be examined directly. The idea is straightforward enough that a referee could quickly tell whether the gains are real or just stated.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes the Wavelet-Fusion Diffusion Model (WFDM) for multimodal brain MRI synthesis. It combines a Wavelet-Fusion variational autoencoder (WF-VAE) for latent-space compression with a conditional 3D U-Net diffusion model trained in that latent space, using explicit conditioning on imaging modality and metadata. The central claim is that WFDM achieves the strongest distributional alignment among the evaluated synthetic MRI generators on heterogeneous pooled data.

Significance. If the performance claims are substantiated with appropriate metrics and controls, the method could meaningfully address modality imbalance and site heterogeneity in large-scale neuroimaging resources, supporting dataset augmentation for downstream AI tasks. The explicit metadata conditioning and wavelet-based latent compression are potentially useful design choices for handling sparse or inconsistent clinical variables.

major comments (1)

Abstract: the central performance claim (strongest distributional alignment) is stated without any metrics, datasets, baselines, cross-site splits, or evaluation protocol. This absence is load-bearing because the claim cannot be assessed for soundness or compared to prior work on the basis of the provided text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and the opportunity to clarify the presentation of our results. We address the major comment below.

Referee: [—] Abstract: the central performance claim (strongest distributional alignment) is stated without any metrics, datasets, baselines, cross-site splits, or evaluation protocol. This absence is load-bearing because the claim cannot be assessed for soundness or compared to prior work on the basis of the provided text.

Authors: We agree that the abstract would be strengthened by including supporting details for the central claim. In the revised version we will add a concise sentence specifying the key distributional alignment metric (e.g., FID or MMD), the heterogeneous pooled datasets used, the main baselines, and a brief note on the cross-site evaluation protocol. This change keeps the abstract within length limits while making the claim assessable on its own.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The provided abstract and context contain no derivations, equations, or mathematical claims that could form a derivation chain. The central claim is an empirical statement of distributional alignment performance on evaluated generators, with no self-citations, fitted inputs renamed as predictions, or ansatzes invoked. The paper's description of WF-VAE and conditional diffusion is architectural rather than deductive, rendering the content self-contained against external benchmarks with no load-bearing reductions to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no information on free parameters, background axioms, or newly postulated entities, so all three ledger entries are left empty.
