WaveDiT: Distribution-Aware Wavelet Flow Matching for Efficient 3D Brain MRI Synthesis

Angela Lombardi; Danilo Danese; Giuseppe Fasano; Matteo Attimonelli; Tommaso Di Noia

arxiv: 2606.08670 · v1 · pith:YBQNK6T5new · submitted 2026-06-07 · 💻 cs.CV

WaveDiT: Distribution-Aware Wavelet Flow Matching for Efficient 3D Brain MRI Synthesis

Danilo Danese , Angela Lombardi , Giuseppe Fasano , Matteo Attimonelli , Tommaso Di Noia This is my paper

Pith reviewed 2026-06-27 18:38 UTC · model grok-4.3

classification 💻 cs.CV

keywords 3D MRI synthesiswavelet transformflow matchingheteroscedastic uncertaintybrain imaginggenerative modelsdata augmentation

0 comments

The pith

WaveDiT performs full-resolution 3D brain MRI synthesis on a single GPU by running conditional flow matching inside 3D Haar wavelet coefficient space with band-wise uncertainty prediction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that moving the generative process into the coefficient space of a 3D Haar discrete wavelet transform, while predicting and conditioning on per-band log-variance, removes the memory and compute barriers that normally force either low-resolution outputs or heavy latent compression. A factorized spatio-depth attention backbone integrates the predicted variance directly into the flow-matching objective so that the model adapts its precision to the heavy-tailed statistics of anatomical detail. On a multi-site cohort the resulting volumes align more closely with real MRI distributions and improve downstream brain-age regression plus region-level anatomical fidelity compared with diffusion, latent, and prior wavelet baselines.

Core claim

The central claim is that a conditional flow-matching model defined directly on 3D Haar wavelet coefficients, equipped with band-wise heteroscedastic uncertainty estimates derived from higher-order wavelet statistics, produces full-resolution brain MRIs under single-GPU memory and time limits while achieving tighter distribution match and stronger performance on brain-age prediction and anatomical segmentation agreement than existing diffusion, latent-diffusion, and wavelet baselines.

What carries the argument

Conditional flow matching inside 3D Haar discrete wavelet transform coefficient space, with predicted log-variance fed into both the flow objective and the conditioning pathway to handle input-dependent variance across wavelet bands.

If this is right

Full-resolution 3D generative augmentation becomes feasible on ordinary single-GPU hardware.
Generated volumes exhibit closer statistical alignment to real multi-site MRI distributions.
Brain-age regression and anatomical region agreement improve over diffusion, latent, and earlier wavelet methods.
The same single-GPU training and inference regime scales to larger cohorts without specialized infrastructure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The uncertainty-aware wavelet representation could be tested on other volumetric modalities such as CT or PET to check whether the same memory savings appear.
If the band-wise variance modeling proves robust, it might allow direct synthesis at even higher resolutions or with thinner slices without additional hardware.
The approach leaves open whether the same wavelet-flow backbone can be conditioned on non-imaging variables such as age, sex, or disease labels while preserving the reported efficiency gains.

Load-bearing premise

The 3D Haar wavelet coefficient representation together with per-band uncertainty estimates retains enough anatomical detail and distributional properties to support reliable downstream clinical tasks.

What would settle it

Running the same downstream brain-age prediction and region-level segmentation evaluation on the multi-site cohort and finding no improvement (or a clear drop) in accuracy or Dice scores relative to the diffusion and latent baselines would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.08670 by Angela Lombardi, Danilo Danese, Giuseppe Fasano, Matteo Attimonelli, Tommaso Di Noia.

**Figure 2.** Figure 2: Visual comparison of models. Axial, coronal, and sagittal views of a real [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

read the original abstract

Large and demographically balanced datasets are essential for reliable neuroimaging biomarkers. Full-resolution 3D brain MRI synthesis can support data augmentation in this setting, but existing approaches either incur prohibitive computational cost at volumetric scale or rely on lossy latent compression that may compromise anatomical detail. As a result, practical 3D generative augmentation often requires specialized compute infrastructure. We propose WaveDiT, a conditional flow matching framework operating in the coefficient space of a 3D Haar Discrete Wavelet Transform. The model combines factorized spatio-depth attention with band-wise heteroscedastic uncertainty modeling derived from higher-order wavelet statistics. Predicted log-variance is integrated directly into both the flow objective and conditioning pathway, enabling adaptive precision consistent with the heavy-tailed and input-dependent variance structure of anatomical detail. This formulation supports full-resolution 3D synthesis under practical memory and time constraints on a single modern GPU. Evaluation on a multi-site cohort demonstrates improved alignment between generated and real MRI distributions, together with enhanced downstream brain age prediction and region-level anatomical agreement relative to diffusion, latent, and wavelet-based baselines. Code is available at https://github.com/sisinflab/WaveDiT

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

WaveDiT puts flow matching into 3D Haar wavelet space with band-wise uncertainty to enable single-GPU full-resolution MRI synthesis, but the abstract gives too little on the actual implementation and results to judge if the downstream gains are real.

read the letter

The main takeaway is that this paper describes WaveDiT, a conditional flow matching model that works in the coefficient space of a 3D Haar discrete wavelet transform. It adds factorized spatio-depth attention and band-wise heteroscedastic uncertainty derived from higher-order wavelet statistics, with the predicted log-variance fed into both the objective and the conditioning. The goal is practical full-resolution 3D brain MRI generation on one modern GPU, plus better distribution match and downstream performance on a multi-site cohort.

What stands out as new is the concrete combination of wavelet-domain flow matching with per-band uncertainty modeling that adapts to the heavy-tailed variance of anatomical detail. Releasing the code is useful. The paper does a reasonable job framing the compute problem with existing diffusion and latent methods and showing an attempt to address it directly with the wavelet basis.

The soft spots are clear from the abstract alone. No equations appear for how the uncertainty term modifies the flow objective or how the band-wise conditioning is implemented. No ablations are mentioned to separate the contribution of the heteroscedastic modeling from the wavelet transform or the attention factorization. The stress-test point about Haar wavelets is reasonable to raise: they are simple but produce blocky artifacts and weak directional selectivity, which could blunt the high-frequency cues that matter for brain-age prediction. If the inverse transform or uncertainty weighting removes those cues, any reported gains in age prediction or region agreement could be artifacts of the evaluation rather than proof that the model succeeded in the wavelet domain. The abstract claims improved alignment and enhanced downstream results over diffusion, latent, and wavelet baselines, but without numbers, statistical tests, or even basic metric values it is impossible to gauge how large or reliable those improvements are.

This is for researchers who need efficient 3D MRI augmentation for neuroimaging tasks. A reader working on practical generative models or wavelet methods in medical imaging would get something out of the architectural choices. The work shows honest engagement with the relevant literature and a testable setup, so it deserves a serious referee even if heavy revision on clarity and evidence will be required.

Referee Report

2 major / 2 minor

Summary. The paper introduces WaveDiT, a conditional flow matching model that operates directly in the coefficient space of a 3D Haar discrete wavelet transform for full-resolution 3D brain MRI synthesis. It employs factorized spatio-depth attention and band-wise heteroscedastic uncertainty modeling (derived from higher-order wavelet statistics) that is integrated into both the flow objective and conditioning. The method is claimed to enable practical single-GPU synthesis while improving distribution alignment, brain-age prediction accuracy, and region-level anatomical agreement over diffusion, latent, and wavelet baselines on a multi-site cohort. Code is released.

Significance. If the central claims hold, the work would provide a practical route to high-resolution 3D generative augmentation for neuroimaging without specialized hardware, directly addressing the need for demographically balanced datasets. The combination of wavelet-domain flow matching with input-dependent uncertainty modeling is a distinctive technical contribution; the public code release further strengthens reproducibility.

major comments (2)

[Evaluation / downstream tasks] The strongest empirical claim (enhanced brain-age prediction and region-level agreement) rests on the assumption that the 3D Haar DWT coefficient space plus band-wise log-variance conditioning preserves the high-frequency anatomical cues that drive age-related structural variation. The manuscript provides no ablation that isolates the contribution of high-frequency sub-bands, no quantitative comparison of frequency content before/after the inverse transform, and no analysis of blocky artifacts known to arise with Haar bases. Without such evidence the downstream gains could be an artifact of the evaluation protocol rather than proof that the generative model succeeded in the wavelet domain.
[Method formulation] The abstract and method description assert that predicted log-variance is integrated into both the flow objective and conditioning pathway, yet no explicit equation is given for the heteroscedastic flow-matching loss or for how the variance modulates the velocity field. This omission makes it impossible to verify that the uncertainty modeling is distribution-aware in the claimed sense or that it is not simply re-weighting the standard CFM objective.

minor comments (2)

[Experiments] The multi-site cohort description should include explicit subject counts per site and scanner parameters to allow assessment of domain-shift handling.
[Figures] Figure captions for qualitative results should state the exact slice location and windowing used so that visual comparisons are reproducible.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight areas where additional clarity and analysis will strengthen the manuscript. We address each major comment below and will revise accordingly.

read point-by-point responses

Referee: [Evaluation / downstream tasks] The strongest empirical claim (enhanced brain-age prediction and region-level agreement) rests on the assumption that the 3D Haar DWT coefficient space plus band-wise log-variance conditioning preserves the high-frequency anatomical cues that drive age-related structural variation. The manuscript provides no ablation that isolates the contribution of high-frequency sub-bands, no quantitative comparison of frequency content before/after the inverse transform, and no analysis of blocky artifacts known to arise with Haar bases. Without such evidence the downstream gains could be an artifact of the evaluation protocol rather than proof that the generative model succeeded in the wavelet domain.

Authors: We agree that the current evaluation would benefit from explicit isolation of high-frequency contributions and artifact analysis. In the revised manuscript we will add (i) an ablation that systematically masks or removes high-frequency wavelet sub-bands and reports the resulting change in brain-age prediction and region-level metrics, (ii) a quantitative frequency-content comparison (power spectra) of real versus reconstructed volumes before and after the inverse DWT, and (iii) both qualitative examples and quantitative metrics (edge sharpness, local variance) addressing potential blocky artifacts. These additions will allow readers to directly assess whether the observed downstream improvements are attributable to faithful modeling of anatomical detail in the wavelet domain. revision: yes
Referee: [Method formulation] The abstract and method description assert that predicted log-variance is integrated into both the flow objective and conditioning pathway, yet no explicit equation is given for the heteroscedastic flow-matching loss or for how the variance modulates the velocity field. This omission makes it impossible to verify that the uncertainty modeling is distribution-aware in the claimed sense or that it is not simply re-weighting the standard CFM objective.

Authors: We acknowledge that the explicit mathematical formulation is missing from the current text. In the revision we will insert the precise heteroscedastic conditional flow-matching objective, showing how the predicted per-band log-variance enters both the loss (as an adaptive weighting term derived from higher-order wavelet statistics) and the conditioning pathway that modulates the velocity-field prediction. This will make the distribution-aware character of the model verifiable and distinguish it from simple re-weighting of the standard CFM loss. revision: yes

Circularity Check

0 steps flagged

No circularity detected; derivation self-contained

full rationale

The provided abstract and description outline a conditional flow matching model in 3D Haar wavelet coefficient space with band-wise heteroscedastic uncertainty, but contain no equations, self-citations, or derivation steps that reduce by construction to fitted inputs or prior author results. Claims rest on empirical downstream evaluations (brain age prediction, distribution alignment) rather than tautological redefinitions or forced predictions. No load-bearing self-citation chains or ansatz smuggling are identifiable from the given text, making the approach externally falsifiable via the reported multi-site cohort results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities can be extracted. The wavelet transform and uncertainty modeling are presented as core but their grounding is not detailed.

pith-pipeline@v0.9.1-grok · 5748 in / 1089 out tokens · 14929 ms · 2026-06-27T18:38:00.238986+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

34 extracted references · 3 canonical work pages

[1]

PLoS biology 20(4), e3001627 (2022)

Benkarim, O., Paquola, C., Park, B.y., Kebets, V., Hong, S.J., Vos de Wael, R., Zhang, S., Yeo, B.T., Eickenberg, M., Ge, T., et al.: Population heterogeneity in clinical cohorts affects the predictive accuracy of brain imaging. PLoS biology 20(4), e3001627 (2022)

2022
[2]

In: ICLR (2024)

Chen, R.T.Q., Lipman, Y.: Flow matching on general geometries. In: ICLR (2024)

2024
[3]

CoRRabs/1904.00625(2019)

Chen, S., Ma, K., Zheng, Y.: Med3d: Transfer learning for 3d medical image anal- ysis. CoRRabs/1904.00625(2019)

Pith/arXiv arXiv 1904
[4]

Scientific Data 11(1), 1330 (Dec 2024)

Chintapalli, S.S., Wang, R., Yang, Z., Tassopoulou, V., Yu, F., Bashyam, V., Erus, G., Chaudhari, P., Shou, H., Davatzikos, C.: Generative models of MRI-derived neuroimaging features and associated dataset of 18,000 samples. Scientific Data 11(1), 1330 (Dec 2024)

2024
[5]

NeuroImage163, 115–124 (2017)

Cole, J.H., Poudel, R.P., Tsagkrasoulis, D., Caan, M.W., Steves, C., Spector, T.D., Montana,G.:Predictingbrainagewithdeeplearningfromrawimagingdataresults in a reliable and heritable biomarker. NeuroImage163, 115–124 (2017)

2017
[6]

In: ICML

Crowson, K., Baumann, S.A., Birch, A., Abraham, T.M., Kaplan, D.Z., Shippole, E.: Scalable high-resolution pixel-space image synthesis with hourglass diffusion transformers. In: ICML. OpenReview.net (2024)

2024
[7]

arXiv preprint (2025)

Danese, D., et al.: Flowlet: Wavelet-based flow matching for efficient 3d brain mri synthesis. arXiv preprint (2025)

2025
[8]

Brain Informatics11(1), 33 (2024)

De Bonis, M.L.N., Fasano, G., Lombardi, A., Ardito, C., Ferrara, A., Di Sciascio, E., Di Noia, T.: Explainable brain age prediction: a comparative evaluation of morphometric and deep learning pipelines. Brain Informatics11(1), 33 (2024)

2024
[9]

NeuroImage224, 117401 (2021) 10 D

Dinsdale, N.K., Bluemke, E., Smith, S.M., Arya, Z., Vidaurre, D., Jenkinson, M., Namburete, A.I.: Learning patterns of the ageing brain in mri using deep convolu- tional networks. NeuroImage224, 117401 (2021) 10 D. Danese et al

2021
[10]

NeuroImage263, 119637 (2022)

Dufumier, B., Grigis, A., Victor, J., Ambroise, C., Frouin, V., Duchesnay, E.: Openbhb: a large-scale multi-site brain mri data-set for age prediction and debiasing. NeuroImage263, 119637 (2022). https://doi.org/10.1016/j.neuroimage.2022.119637, https://baobablab.github.io/bhb/dataset

work page doi:10.1016/j.neuroimage.2022.119637 2022
[11]

NeuroImage47, S102 (2009)

Fonov, V., Evans, A., McKinstry, R., Almli, C., Collins, D.: Unbiased non- linear average age-appropriate brain templates from birth to adulthood. NeuroImage47, S102 (2009). https://doi.org/10.1016/S1053-8119(09)70884-5, https://www.sciencedirect.com/science/article/pii/S1053811909708845, organiza- tion for Human Brain Mapping 2009 Annual Meeting

work page doi:10.1016/s1053-8119(09)70884-5 2009
[12]

In: DGM4MICCAI@MICCAI

Friedrich, P., Wolleb, J., Bieder, F., Durrer, A., Cattin, P.C.: WDM: 3d wavelet diffusion models for high-resolution medical image synthesis. In: DGM4MICCAI@MICCAI. Springer (2024)

2024
[13]

NeuroImage219, 117012 (2020)

Henschel, L., Conjeti, S., Estrada, S., Diers, K., Fischl, B., Reuter, M.: Fastsurfer - A fast and accurate deep learning based neuroimaging pipeline. NeuroImage219, 117012 (2020)

2020
[14]

In: ECCV (10)

Heo, B., Park, S., Han, D., Yun, S.: Rotary position embedding for vision trans- former. In: ECCV (10). Springer (2024)

2024
[15]

In: NeurIPS (2020)

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: NeurIPS (2020)

2020
[16]

doi:https://doi.org/10.1006/nimg.2002.1132

Jenkinson, M., Bannister, P., Brady, M., Smith, S.: Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage17(2), 825–841 (2002). https://doi.org/10.1006/nimg.2002.1132

work page doi:10.1006/nimg.2002.1132 2002
[17]

Kendall, A., Gal, Y.: What uncertainties do we need in bayesian deep learning for computer vision? In: NIPS. pp. 5574–5584 (2017)

2017
[18]

CoRR (2022)

Khader, F., Mueller-Franzes, G., Arasteh, S.T., Han, T., Haarburger, C., Schulze- Hagen, M., Schad, P., Engelhardt, S., Baeßler, B., Foersch, S., Stegmaier, J., Kuhl, C., Nebelung, S., Kather, J.N., Truhn, D.: Medical diffusion - denoising diffusion probabilistic models for 3d medical image generation. CoRR (2022)

2022
[19]

In: ICLR (2023)

Lipman, Y., Chen, R.T.Q., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: ICLR (2023)

2023
[20]

In: ICLR

Liu, X., Gong, C., Liu, Q.: Flow straight and fast: Learning to generate and transfer data with rectified flow. In: ICLR. OpenReview.net (2023)

2023
[21]

Marcus, D.S., Fotenos, A.F., Csernansky, J.G., Morris, J.C., Buckner, R.L.: Open access series of imaging studies: longitudinal MRI data in nondemented and de- mented older adults. J. Cogn. Neurosci. (2010), sites.wustl.edu/oasisbrains/

2010
[22]

Scientific Reports13(1), 12098 (Jul 2023)

Müller-Franzes, G., Niehues, J.M., Khader, F., Arasteh, S.T., Haarburger, C., Kuhl, C., Wang, T., Han, T., Nolte, T., Nebelung, S., Kather, J.N., Truhn, D.: A multimodal comparison of latent denoising diffusion probabilistic models and gen- erative adversarial networks for medical image synthesis. Scientific Reports13(1), 12098 (Jul 2023)

2023
[23]

Neurology74(3), 201–209 (Jan 2010), https://adni.loni.usc.edu/

Petersen, R.C., Aisen, P.S., Beckett, L.A., Donohue, M.C., Gamst, A.C., Harvey, D.J., Jack, Jr, C.R., Jagust, W.J., Shaw, L.M., Toga, A.W., Trojanowski, J.Q., Weiner, M.W.: Alzheimer’s disease neuroimaging initiative (ADNI): clinical char- acterization. Neurology74(3), 201–209 (Jan 2010), https://adni.loni.usc.edu/

2010
[24]

In: MICCAI Workshop on Deep Generative Models (2022)

Pinaya, W.H., Tudosiu, P.D., Dafflon, J., Da Costa, P.F., Fernandez, V., Nachev, P., Ourselin, S., Cardoso, M.J.: Brain imaging generation with latent diffusion models. In: MICCAI Workshop on Deep Generative Models (2022)

2022
[25]

IEEE Transactions on Cognitive and Developmental Systems (2025) WaveDiT 11

Rahman, M.T., Orka, N.A., Khan, A., Liò, P., Moni, M.A.: Understanding neu- rocognition with deep learning and mri: A systematic review. IEEE Transactions on Cognitive and Developmental Systems (2025) WaveDiT 11

2025
[26]

In: CVPR

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: CVPR. IEEE (2022)

2022
[27]

In: ICLR (2022)

Seitzer, M., Tavakoli, A., Antic, D., Martius, G.: On the pitfalls of heteroscedastic uncertainty estimation with probabilistic neural networks. In: ICLR (2022)

2022
[28]

Smith, S.M.: Fast robust automated brain extraction. Hum. Brain Mapp.17(3), 143–155 (Nov 2002)

2002
[29]

Tudosiu, P., Pinaya, W.H.L., Costa, P.F.D., Dafflon, J., Patel, A., Borges, P., Fer- nandez, V., Graham, M.S., Gray, R.J., Nachev, P., Ourselin, S., Cardoso, M.J.: Realistic morphology-preserving generative modelling of the brain. Nat. Mac. In- tell.6(7), 811–819 (2024)

2024
[30]

IEEE Trans

Tustison, N.J., Avants, B.B., Cook, P.A., Zheng, Y., Egan, A., Yushkevich, P.A., Gee, J.C.: N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging (2010)

2010
[31]

IEEE Trans

Wang, H., Liu, Z., Sun, K., Wang, X., Shen, D., Cui, Z.: 3d meddiffusion: A 3d medical latent diffusion model for controllable and high-quality medical image generation. IEEE Trans. Medical Imaging44(12), 4960–4972 (2025)

2025
[32]

arXiv preprint arXiv:2503.00266 (2025)

Yazdani, M., Medghalchi, Y., Ashrafian, P., Hacihaliloglu, I., Shahriari, D.: Flow matching for medical image synthesis: Bridging the gap between speed and quality. arXiv preprint arXiv:2503.00266 (2025)

arXiv 2025
[33]

In: NeurIPS

Zhang, B., Sennrich, R.: Root mean square layer normalization. In: NeurIPS. pp. 12360–12371 (2019)

2019
[34]

In: MICCAI (2)

Zhang, X., Pak, D.H., Ahn, S.S., Li, X., You, C., Staib, L.H., Sinusas, A.J., Wong, A.L.N., Duncan, J.S.: Heteroscedastic uncertainty estimation framework for unsu- pervised registration. In: MICCAI (2). Springer (2024)

2024

[1] [1]

PLoS biology 20(4), e3001627 (2022)

Benkarim, O., Paquola, C., Park, B.y., Kebets, V., Hong, S.J., Vos de Wael, R., Zhang, S., Yeo, B.T., Eickenberg, M., Ge, T., et al.: Population heterogeneity in clinical cohorts affects the predictive accuracy of brain imaging. PLoS biology 20(4), e3001627 (2022)

2022

[2] [2]

In: ICLR (2024)

Chen, R.T.Q., Lipman, Y.: Flow matching on general geometries. In: ICLR (2024)

2024

[3] [3]

CoRRabs/1904.00625(2019)

Chen, S., Ma, K., Zheng, Y.: Med3d: Transfer learning for 3d medical image anal- ysis. CoRRabs/1904.00625(2019)

Pith/arXiv arXiv 1904

[4] [4]

Scientific Data 11(1), 1330 (Dec 2024)

Chintapalli, S.S., Wang, R., Yang, Z., Tassopoulou, V., Yu, F., Bashyam, V., Erus, G., Chaudhari, P., Shou, H., Davatzikos, C.: Generative models of MRI-derived neuroimaging features and associated dataset of 18,000 samples. Scientific Data 11(1), 1330 (Dec 2024)

2024

[5] [5]

NeuroImage163, 115–124 (2017)

Cole, J.H., Poudel, R.P., Tsagkrasoulis, D., Caan, M.W., Steves, C., Spector, T.D., Montana,G.:Predictingbrainagewithdeeplearningfromrawimagingdataresults in a reliable and heritable biomarker. NeuroImage163, 115–124 (2017)

2017

[6] [6]

In: ICML

Crowson, K., Baumann, S.A., Birch, A., Abraham, T.M., Kaplan, D.Z., Shippole, E.: Scalable high-resolution pixel-space image synthesis with hourglass diffusion transformers. In: ICML. OpenReview.net (2024)

2024

[7] [7]

arXiv preprint (2025)

Danese, D., et al.: Flowlet: Wavelet-based flow matching for efficient 3d brain mri synthesis. arXiv preprint (2025)

2025

[8] [8]

Brain Informatics11(1), 33 (2024)

De Bonis, M.L.N., Fasano, G., Lombardi, A., Ardito, C., Ferrara, A., Di Sciascio, E., Di Noia, T.: Explainable brain age prediction: a comparative evaluation of morphometric and deep learning pipelines. Brain Informatics11(1), 33 (2024)

2024

[9] [9]

NeuroImage224, 117401 (2021) 10 D

Dinsdale, N.K., Bluemke, E., Smith, S.M., Arya, Z., Vidaurre, D., Jenkinson, M., Namburete, A.I.: Learning patterns of the ageing brain in mri using deep convolu- tional networks. NeuroImage224, 117401 (2021) 10 D. Danese et al

2021

[10] [10]

NeuroImage263, 119637 (2022)

Dufumier, B., Grigis, A., Victor, J., Ambroise, C., Frouin, V., Duchesnay, E.: Openbhb: a large-scale multi-site brain mri data-set for age prediction and debiasing. NeuroImage263, 119637 (2022). https://doi.org/10.1016/j.neuroimage.2022.119637, https://baobablab.github.io/bhb/dataset

work page doi:10.1016/j.neuroimage.2022.119637 2022

[11] [11]

NeuroImage47, S102 (2009)

Fonov, V., Evans, A., McKinstry, R., Almli, C., Collins, D.: Unbiased non- linear average age-appropriate brain templates from birth to adulthood. NeuroImage47, S102 (2009). https://doi.org/10.1016/S1053-8119(09)70884-5, https://www.sciencedirect.com/science/article/pii/S1053811909708845, organiza- tion for Human Brain Mapping 2009 Annual Meeting

work page doi:10.1016/s1053-8119(09)70884-5 2009

[12] [12]

In: DGM4MICCAI@MICCAI

Friedrich, P., Wolleb, J., Bieder, F., Durrer, A., Cattin, P.C.: WDM: 3d wavelet diffusion models for high-resolution medical image synthesis. In: DGM4MICCAI@MICCAI. Springer (2024)

2024

[13] [13]

NeuroImage219, 117012 (2020)

Henschel, L., Conjeti, S., Estrada, S., Diers, K., Fischl, B., Reuter, M.: Fastsurfer - A fast and accurate deep learning based neuroimaging pipeline. NeuroImage219, 117012 (2020)

2020

[14] [14]

In: ECCV (10)

Heo, B., Park, S., Han, D., Yun, S.: Rotary position embedding for vision trans- former. In: ECCV (10). Springer (2024)

2024

[15] [15]

In: NeurIPS (2020)

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: NeurIPS (2020)

2020

[16] [16]

doi:https://doi.org/10.1006/nimg.2002.1132

Jenkinson, M., Bannister, P., Brady, M., Smith, S.: Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage17(2), 825–841 (2002). https://doi.org/10.1006/nimg.2002.1132

work page doi:10.1006/nimg.2002.1132 2002

[17] [17]

Kendall, A., Gal, Y.: What uncertainties do we need in bayesian deep learning for computer vision? In: NIPS. pp. 5574–5584 (2017)

2017

[18] [18]

CoRR (2022)

Khader, F., Mueller-Franzes, G., Arasteh, S.T., Han, T., Haarburger, C., Schulze- Hagen, M., Schad, P., Engelhardt, S., Baeßler, B., Foersch, S., Stegmaier, J., Kuhl, C., Nebelung, S., Kather, J.N., Truhn, D.: Medical diffusion - denoising diffusion probabilistic models for 3d medical image generation. CoRR (2022)

2022

[19] [19]

In: ICLR (2023)

Lipman, Y., Chen, R.T.Q., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: ICLR (2023)

2023

[20] [20]

In: ICLR

Liu, X., Gong, C., Liu, Q.: Flow straight and fast: Learning to generate and transfer data with rectified flow. In: ICLR. OpenReview.net (2023)

2023

[21] [21]

Marcus, D.S., Fotenos, A.F., Csernansky, J.G., Morris, J.C., Buckner, R.L.: Open access series of imaging studies: longitudinal MRI data in nondemented and de- mented older adults. J. Cogn. Neurosci. (2010), sites.wustl.edu/oasisbrains/

2010

[22] [22]

Scientific Reports13(1), 12098 (Jul 2023)

Müller-Franzes, G., Niehues, J.M., Khader, F., Arasteh, S.T., Haarburger, C., Kuhl, C., Wang, T., Han, T., Nolte, T., Nebelung, S., Kather, J.N., Truhn, D.: A multimodal comparison of latent denoising diffusion probabilistic models and gen- erative adversarial networks for medical image synthesis. Scientific Reports13(1), 12098 (Jul 2023)

2023

[23] [23]

Neurology74(3), 201–209 (Jan 2010), https://adni.loni.usc.edu/

Petersen, R.C., Aisen, P.S., Beckett, L.A., Donohue, M.C., Gamst, A.C., Harvey, D.J., Jack, Jr, C.R., Jagust, W.J., Shaw, L.M., Toga, A.W., Trojanowski, J.Q., Weiner, M.W.: Alzheimer’s disease neuroimaging initiative (ADNI): clinical char- acterization. Neurology74(3), 201–209 (Jan 2010), https://adni.loni.usc.edu/

2010

[24] [24]

In: MICCAI Workshop on Deep Generative Models (2022)

Pinaya, W.H., Tudosiu, P.D., Dafflon, J., Da Costa, P.F., Fernandez, V., Nachev, P., Ourselin, S., Cardoso, M.J.: Brain imaging generation with latent diffusion models. In: MICCAI Workshop on Deep Generative Models (2022)

2022

[25] [25]

IEEE Transactions on Cognitive and Developmental Systems (2025) WaveDiT 11

Rahman, M.T., Orka, N.A., Khan, A., Liò, P., Moni, M.A.: Understanding neu- rocognition with deep learning and mri: A systematic review. IEEE Transactions on Cognitive and Developmental Systems (2025) WaveDiT 11

2025

[26] [26]

In: CVPR

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: CVPR. IEEE (2022)

2022

[27] [27]

In: ICLR (2022)

Seitzer, M., Tavakoli, A., Antic, D., Martius, G.: On the pitfalls of heteroscedastic uncertainty estimation with probabilistic neural networks. In: ICLR (2022)

2022

[28] [28]

Smith, S.M.: Fast robust automated brain extraction. Hum. Brain Mapp.17(3), 143–155 (Nov 2002)

2002

[29] [29]

Tudosiu, P., Pinaya, W.H.L., Costa, P.F.D., Dafflon, J., Patel, A., Borges, P., Fer- nandez, V., Graham, M.S., Gray, R.J., Nachev, P., Ourselin, S., Cardoso, M.J.: Realistic morphology-preserving generative modelling of the brain. Nat. Mac. In- tell.6(7), 811–819 (2024)

2024

[30] [30]

IEEE Trans

Tustison, N.J., Avants, B.B., Cook, P.A., Zheng, Y., Egan, A., Yushkevich, P.A., Gee, J.C.: N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging (2010)

2010

[31] [31]

IEEE Trans

Wang, H., Liu, Z., Sun, K., Wang, X., Shen, D., Cui, Z.: 3d meddiffusion: A 3d medical latent diffusion model for controllable and high-quality medical image generation. IEEE Trans. Medical Imaging44(12), 4960–4972 (2025)

2025

[32] [32]

arXiv preprint arXiv:2503.00266 (2025)

Yazdani, M., Medghalchi, Y., Ashrafian, P., Hacihaliloglu, I., Shahriari, D.: Flow matching for medical image synthesis: Bridging the gap between speed and quality. arXiv preprint arXiv:2503.00266 (2025)

arXiv 2025

[33] [33]

In: NeurIPS

Zhang, B., Sennrich, R.: Root mean square layer normalization. In: NeurIPS. pp. 12360–12371 (2019)

2019

[34] [34]

In: MICCAI (2)

Zhang, X., Pak, D.H., Ahn, S.S., Li, X., You, C., Staib, L.H., Sinusas, A.J., Wong, A.L.N., Duncan, J.S.: Heteroscedastic uncertainty estimation framework for unsu- pervised registration. In: MICCAI (2). Springer (2024)

2024