Flow Matching with Optimized Subclass Priors for Medical Image Augmentation

Bernhard Kainz; Felix Nützel; Mischa Dombrowski

arxiv: 2605.16469 · v1 · pith:YUXUQNFInew · submitted 2026-05-15 · 📡 eess.IV · cs.CV

Pith reviewed 2026-05-19 21:48 UTC · model grok-4.3

classification 📡 eess.IV cs.CV

keywords flow matching · medical image augmentation · long-tailed datasets · subclass priors · rare disease imaging · chest X-ray · CT slices · generative models

The pith

Partitioning coarse labels into latent submodes and learning subclass-conditioned sources lets flow matching generate more faithful rare medical images while improving downstream classifier accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets two sources of bias in conditional generation. Coarse disease labels lump diverse subtypes into a single multi-modal conditional, which biases the generator toward dominant submodes, and a shared Gaussian source forces rare subpopulations through disproportionately long transport paths. The method first splits each label into coherent submodes with Gaussian mixture modeling performed inside the generator's latent space, then learns a separate starting distribution for every submode so that each rare subpopulation begins closer to its target. Geometric constraints keep the learned directions from degenerating. A sympathetic reader would care because reliable synthetic examples for tail classes could measurably raise the balanced accuracy of diagnostic classifiers on the very conditions where data are scarcest.
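
As a rough illustration of the partitioning step, the sketch below fits a small diagonal-covariance Gaussian mixture to the latents of one coarse label with plain EM. It is not the authors' code: the function name `fit_diag_gmm`, the diagonal covariance, the farthest-point initialisation, and the synthetic two-blob latents are all assumptions made for the example.

```python
import numpy as np

def fit_diag_gmm(z, k, n_iter=50, seed=0, eps=1e-6):
    """EM for a diagonal-covariance Gaussian mixture on latents z of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n = z.shape[0]
    # Farthest-point initialisation: start from one random latent, then keep
    # adding the latent that is farthest from the means chosen so far.
    idx = [int(rng.integers(n))]
    for _ in range(k - 1):
        d2 = ((z[:, None, :] - z[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(d2.argmax()))
    means = z[idx].copy()
    variances = np.tile(z.var(axis=0) + eps, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities computed in log space for stability.
        diff = z[:, None, :] - means[None, :, :]
        log_prob = -0.5 * ((diff ** 2 / variances[None]).sum(-1)
                           + np.log(2 * np.pi * variances).sum(-1)[None])
        log_joint = log_prob + np.log(weights)[None]
        resp = np.exp(log_joint - np.logaddexp.reduce(log_joint, axis=1, keepdims=True))
        # M-step: update mixing weights, means and per-dimension variances.
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        means = (resp.T @ z) / nk[:, None]
        diff = z[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + eps
    # Hard subclass labels come from the responsibilities of the last E-step.
    return weights, means, variances, resp.argmax(axis=1)

# Synthetic stand-in for the latents of one coarse label with two submodes.
rng = np.random.default_rng(1)
latents = np.concatenate([rng.normal(-3.0, 0.5, (40, 4)),
                          rng.normal(3.0, 0.5, (12, 4))])
weights, means, variances, subclass = fit_diag_gmm(latents, k=2)
print("mixing weights:", np.round(weights, 3))
```

The hard assignments in `subclass` play the role of the finer conditioning labels; a real latent space is far higher-dimensional, which is where the stability concerns raised in the referee report become relevant.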

Core claim

The authors show that an offline two-level prior construction—Gaussian mixture partitioning of coarse labels in latent space followed by subclass-conditioned source distributions that re-center and re-scale the starting point per submode, regularized by explicit geometric control on normalized displacement directions—consistently raises tail-class generation fidelity and diversity on long-tailed chest X-ray and CT benchmarks and yields reliable gains in balanced accuracy and macro-F1 when the generated images are used for augmentation.
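
To make the "re-center and re-scale" idea concrete, the following sketch compares the average straight-line path length from a shared standard Gaussian source with the path length from a per-subclass source. The paper learns its sources; here they are simply moment-matched with a hypothetical `shrink` factor, and the data are synthetic, so this only illustrates why a closer source shortens trajectories.

```python
import numpy as np

def subclass_source(latents, labels, k, shrink=0.5):
    """Per-subclass source mean and scale (moment-matched here, not learned)."""
    means = np.stack([latents[labels == j].mean(axis=0) for j in range(k)])
    stds = np.stack([latents[labels == j].std(axis=0) for j in range(k)])
    # Move only part of the way from N(0, I) toward the data statistics so the
    # source does not collapse onto the targets.
    return shrink * means, (1.0 - shrink) + shrink * stds

def sample_source(src_means, src_scales, labels, rng):
    """Draw one source point per sample from its subclass-specific Gaussian."""
    eps = rng.standard_normal((labels.shape[0], src_means.shape[1]))
    return src_means[labels] + src_scales[labels] * eps

# Synthetic latents: a frequent submode and a rare one, far from the origin.
rng = np.random.default_rng(0)
latents = np.concatenate([rng.normal(-3.0, 0.5, (200, 4)),
                          rng.normal(3.0, 0.5, (50, 4))])
labels = np.array([0] * 200 + [1] * 50)

src_means, src_scales = subclass_source(latents, labels, k=2)
shared_x0 = rng.standard_normal(latents.shape)           # shared N(0, I) source
subclass_x0 = sample_source(src_means, src_scales, labels, rng)

# Mean length of the straight path x1 - x0 used by linear flow matching.
shared_len = float(np.linalg.norm(latents - shared_x0, axis=1).mean())
subclass_len = float(np.linalg.norm(latents - subclass_x0, axis=1).mean())
print(f"shared source: {shared_len:.2f}, subclass source: {subclass_len:.2f}")
```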

What carries the argument

Subclass-conditioned source distributions, re-centered and re-scaled per submode after Gaussian mixture modeling of each coarse label in the generative model's latent space, together with geometric control that concentrates normalized displacement directions around learnable prototypes while capping path-length outliers.
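
One plausible reading of the geometric control is a pair of penalties on the displacement between source and target points. The sketch below is an assumption about the functional form, since the review does not state the paper's actual loss: a cosine-based concentration term around a prototype direction and a squared hinge on path lengths beyond a cap. The names `geometric_penalties` and `radius_cap` are hypothetical.

```python
import numpy as np

def geometric_penalties(x0, x1, prototype, radius_cap):
    """Return (concentration, cap) penalties for displacements x1 - x0."""
    disp = x1 - x0
    length = np.linalg.norm(disp, axis=1)
    # Normalized displacement directions; the guard avoids division by zero.
    direction = disp / np.maximum(length, 1e-12)[:, None]
    proto = prototype / np.linalg.norm(prototype)
    # Zero when every direction coincides with the prototype, up to 2 when opposite.
    concentration = float(np.mean(1.0 - direction @ proto))
    # Only paths longer than the cap contribute, so typical paths are untouched.
    excess = np.maximum(length - radius_cap, 0.0)
    cap = float(np.mean(excess ** 2))
    return concentration, cap

x0 = np.zeros((3, 2))
x1 = np.array([[1.0, 0.0], [2.0, 0.0], [5.0, 0.0]])
concentration, cap = geometric_penalties(x0, x1, prototype=np.array([1.0, 0.0]),
                                         radius_cap=3.0)
print(concentration, cap)
```

In this toy case all directions match the prototype, so the concentration term vanishes, and only the path of length 5 exceeds the cap of 3.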

If this is right

Higher fidelity and diversity metrics for the rarest classes in generated medical images.
Consistent lifts in balanced accuracy and macro-F1 when the synthetic samples augment long-tailed training sets.
The same gains appear across both chest X-ray and CT slice modalities.
Reduced dominance of frequent submodes inside each coarse label condition.
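
For readers less familiar with the two downstream metrics, here is a small pure-Python sketch of balanced accuracy (mean per-class recall) and macro-F1 (unweighted mean of per-class F1). The labels are made up for illustration and have nothing to do with the paper's results; they show how a classifier that looks good on plain accuracy can still score poorly on a rare class.

```python
def per_class_recall_and_f1(y_true, y_pred):
    """Per-class recall (classes with support only) and per-class F1."""
    classes = sorted(set(y_true) | set(y_pred))
    recalls, f1s = {}, {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        if tp + fn > 0:
            recalls[c] = tp / (tp + fn)
        denom = 2 * tp + fp + fn
        f1s[c] = 2 * tp / denom if denom else 0.0
    return recalls, f1s

def balanced_accuracy(y_true, y_pred):
    recalls, _ = per_class_recall_and_f1(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

def macro_f1(y_true, y_pred):
    _, f1s = per_class_recall_and_f1(y_true, y_pred)
    return sum(f1s.values()) / len(f1s)

# Toy long-tailed labels: class 1 is rare and half of it is misclassified.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
m_f1 = macro_f1(y_true, y_pred)
print(accuracy, bal_acc, round(m_f1, 4))  # 0.9 0.75 0.8039
```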

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same latent-space partitioning step could be inserted into other conditional flow or diffusion pipelines that currently suffer from multi-modal label collapse.
The learned subclasses might themselves serve as an unsupervised route to discovering clinically distinct disease phenotypes within existing diagnostic codes.
Geometric control on displacement directions offers a reusable regularizer for shortening trajectories in any transport-based generative model.
Extending the offline prior construction to 3-D volumes or time-series medical data would test whether the same shortening of paths improves rare-event synthesis in those settings.

Load-bearing premise

Gaussian mixture modeling performed in the latent space will recover coherent and useful subclasses rather than partitions that fail to shorten transport paths or introduce new biases the geometric regularizer cannot correct.

What would settle it

No improvement, or a decline, in FID or IRS scores for tail classes, or in balanced accuracy and macro-F1 for downstream classifiers, when the method is compared to ordinary flow matching on the MIMIC-LT, NIH-LT, or CT-RATE benchmarks.
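
The FID part of such a comparison reduces to a Fréchet distance between Gaussian fits of two feature sets. The sketch below computes that distance with NumPy on arbitrary feature arrays; the Inception feature extractor that FID normally uses is omitted, the features are synthetic, and IRS is not covered.

```python
import numpy as np

def frechet_distance(feat_a, feat_b):
    """Frechet distance between Gaussian fits of two (n, d) feature arrays."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    cov_a = np.cov(feat_a, rowvar=False)
    cov_b = np.cov(feat_b, rowvar=False)
    # trace(sqrt(cov_a @ cov_b)) equals the sum of square roots of the
    # eigenvalues of the product, which are real and non-negative in theory;
    # clipping removes tiny negative values caused by round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(np.sum((mu_a - mu_b) ** 2)
                 + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real_feats = rng.standard_normal((500, 3))
same = frechet_distance(real_feats, real_feats)           # identical sets
shifted = frechet_distance(real_feats, real_feats + 2.0)  # mean shift of 2 per dim
print(round(same, 6), round(shifted, 6))
```

A pure mean shift of 2 in each of 3 dimensions leaves the covariance unchanged, so the distance is the squared shift summed over dimensions, which is a handy sanity check for any implementation.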

Figures

Figures reproduced from arXiv: 2605.16469 by Bernhard Kainz, Felix Nützel, Mischa Dombrowski.

**Figure 1.** Left: Coarse conditionings and a single source yield a large variance of conditional probability paths. Middle: We obtain finer conditionings by fitting a mixture on the residual directions from the class centers. Right: We obtain better directional alignment by assigning a source to each subclass and optimizing them to a common direction, each bounded by a radial cap.

**Figure 2.** Left: Close sources produce noisy directions; too distant sources produce entangled paths. Right: We balance angular concentration and distance to targets.

**Figure 3.** Qualitative comparison of real and generated tail-class images. Row 1: MIMIC-LT tail class; Row 2: NIH-LT tail class; Row 3: CT-RATE tail class. Right: five random draws from a single subclass of our method, illustrating intra-subclass diversity. Discussion. Subclasses are induced in a learned latent space and may partially reflect acquisition-related variation alongside pathology; our confounder probes sho…

Original abstract

Rare diseases dominate the diagnostic challenge in medical imaging yet are severely underrepresented in clinical datasets, causing classifiers to fail on exactly the conditions where reliable detection matters most. Generative augmentation can supply the missing tail-class coverage, but coarse disease labels aggregate diverse subtypes and acquisition settings into multi-modal conditionals that bias generators toward dominant submodes, while a shared Gaussian source forces rare subpopulations through disproportionately long transport paths. We propose an offline strategy that introduces informative priors at two levels: first, we partition each coarse label into coherent submodes via Gaussian mixture modeling in the generative model's latent space; second, we learn subclass-conditioned source distributions that re-center and re-scale the starting distribution per submode, shortening trajectories and reducing within-subclass dispersion. To prevent degenerate solutions we impose explicit geometric control, moderately concentrating normalized displacement directions around learnable prototypes while capping path-length outliers. On long-tailed chest X-ray (MIMIC-LT, NIH-LT) and CT slice (CT-RATE) benchmarks the proposed method consistently improves tail-class generation fidelity and diversity (FID, IRS) and is a promising augmentation strategy that reliably improves downstream balanced accuracy and macro-F1 over a non-augmented baseline across modalities.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper adds GMM-based subclass partitioning and conditioned sources to flow matching for medical image augmentation on long-tailed data, with reported gains in generation and downstream metrics, but the approach hinges on whether those partitions hold up for the rarest classes.

This paper takes flow matching and adds two offline priors to handle the fact that coarse disease labels in medical datasets mix together different subtypes and settings. They run GMM clustering on the latent representations to split each label into submodes, then learn a separate starting distribution for each submode so the flow has shorter, less dispersed paths. Geometric concentration around prototypes is added to keep things from collapsing. On MIMIC-LT, NIH-LT chest X-rays and CT-RATE slices they see better FID and IRS for tail classes plus lifts in balanced accuracy and macro-F1 when the generated images are used for augmentation.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an offline augmentation strategy for long-tailed medical imaging datasets using flow matching. Coarse labels are partitioned into submodes via GMM clustering in the generative model's latent space; subclass-conditioned source distributions are then learned to re-center and re-scale the base distribution per submode. Explicit geometric regularization (prototype concentration of normalized displacements and path-length capping) is added to avoid degeneracies. On MIMIC-LT, NIH-LT and CT-RATE benchmarks the method reports improved tail-class FID and IRS together with gains in downstream balanced accuracy and macro-F1 relative to a non-augmented baseline.

Significance. If the reported gains prove robust, the approach would offer a practical way to mitigate mode collapse and long transport paths in conditional flow matching for rare-disease imaging, directly addressing a persistent bottleneck in medical-image augmentation pipelines.

major comments (2)

[§3] §3 (Method): The central claim that GMM partitioning of tail-class latents produces coherent, reusable submodes rests on an unverified assumption. With very few samples per tail class, fitting a GMM in high-dimensional latent space is sensitive to initialization, choice of K, and latent noise; no cluster-stability metrics, silhouette scores, or sensitivity analysis to K are supplied, leaving open the possibility that the reported shortening of transport paths is an artifact of arbitrary partitions.
[§4] §4 (Experiments): The abstract states 'consistent gains' on FID, IRS, balanced accuracy and macro-F1, yet supplies no quantitative effect sizes, statistical significance tests, or ablation results on the number of GMM components or concentration strength. Without these, it is impossible to judge whether the improvements are load-bearing or driven by post-hoc hyper-parameter choices.

minor comments (2)

[§3.2] Notation for the normalized displacement directions and the learnable prototypes should be introduced with explicit equations rather than prose descriptions.
[Figure 4] Figure captions for the qualitative generation examples should include the exact number of samples per tail class shown.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment below and indicate the changes planned for the revised version.

Referee: [§3] §3 (Method): The central claim that GMM partitioning of tail-class latents produces coherent, reusable submodes rests on an unverified assumption. With very few samples per tail class, fitting a GMM in high-dimensional latent space is sensitive to initialization, choice of K, and latent noise; no cluster-stability metrics, silhouette scores, or sensitivity analysis to K are supplied, leaving open the possibility that the reported shortening of transport paths is an artifact of arbitrary partitions.

Authors: We agree that explicit validation of GMM cluster stability would strengthen the methodological claims, especially given the small sample sizes typical of tail classes. The current manuscript relies on downstream improvements in FID, IRS, and classification metrics to indicate that the discovered submodes are useful, but does not report stability diagnostics. In the revision we will add a sensitivity study over K (including results for K=1 through K=5), average silhouette scores computed across multiple random initializations, and a brief discussion of how the chosen K was selected per dataset. These additions will directly address the concern that partitions may be arbitrary.

revision: yes
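
A sensitivity study of the kind promised here could look roughly like the sketch below, which averages silhouette scores over several random initializations for each K. Two caveats are assumptions of the example rather than anything stated in the paper: a minimal k-means is swapped in for the GMM to keep the code short, and the latents are synthetic. The silhouette score is undefined for K=1, so the sweep starts at K=2.

```python
import numpy as np

def kmeans(x, k, seed, n_iter=25):
    """Minimal k-means, used here as a simple stand-in for GMM clustering."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = ((x[:, None] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters where they are
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def silhouette(x, labels):
    """Mean silhouette score with Euclidean distances."""
    dist = np.sqrt(((x[:, None] - x[None]) ** 2).sum(-1))
    ids = np.unique(labels)
    if len(ids) < 2:
        return float("nan")                  # undefined for a single cluster
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue                         # singleton clusters score 0
        a = dist[i, same].sum() / (same.sum() - 1)   # mean intra-cluster distance
        b = min(dist[i, labels == c].mean() for c in ids if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

# Synthetic tail-class latents with two true submodes and few samples.
rng = np.random.default_rng(0)
tail_latents = np.concatenate([rng.normal(-3.0, 0.5, (30, 4)),
                               rng.normal(3.0, 0.5, (10, 4))])
mean_sil = {
    k: float(np.mean([silhouette(tail_latents, kmeans(tail_latents, k, seed))
                      for seed in range(5)]))
    for k in range(2, 6)
}
best_k = max(mean_sil, key=mean_sil.get)
print({k: round(v, 3) for k, v in mean_sil.items()}, "best K:", best_k)
```

Reporting the spread across seeds alongside the mean would speak more directly to the initialization sensitivity the referee raises.
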
Referee: [§4] §4 (Experiments): The abstract states 'consistent gains' on FID, IRS, balanced accuracy and macro-F1, yet supplies no quantitative effect sizes, statistical significance tests, or ablation results on the number of GMM components or concentration strength. Without these, it is impossible to judge whether the improvements are load-bearing or driven by post-hoc hyper-parameter choices.

Authors: We acknowledge that the experimental section would benefit from more rigorous statistical reporting and targeted ablations. While the manuscript already compares against a non-augmented baseline across three datasets and multiple metrics, it does not include effect sizes, p-values, or systematic ablations on K and regularization strength. In the revised version we will add (i) relative percentage improvements with standard deviations over multiple runs, (ii) statistical significance tests (paired t-tests or Wilcoxon signed-rank tests with reported p-values), and (iii) ablation tables varying the number of GMM components and the prototype-concentration hyper-parameter. These results will be presented in the main text or supplementary material to demonstrate that the gains are robust rather than hyper-parameter artifacts.

revision: yes
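
As a dependency-free illustration of paired significance testing over repeated runs, the sketch below uses an exact sign-flip permutation test. This is a nonparametric stand-in for the paired t-test or Wilcoxon signed-rank test the authors mention, not either of those tests, and the scores are placeholder numbers invented for the example, not results from the paper.

```python
from itertools import product

def paired_sign_flip_test(baseline, augmented):
    """Exact two-sided sign-flip permutation test on paired differences."""
    diffs = [a - b for a, b in zip(augmented, baseline)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    hits = 0
    total = 0
    # Under the null hypothesis each paired difference is equally likely to
    # have either sign, so all 2**n sign assignments are enumerated.
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        hits += stat >= observed - 1e-12
        total += 1
    return sum(diffs) / n, hits / total

# Placeholder balanced-accuracy scores from five hypothetical seeds.
baseline_scores = [0.60, 0.62, 0.58, 0.61, 0.59]
augmented_scores = [0.64, 0.65, 0.61, 0.66, 0.62]
mean_gain, p_value = paired_sign_flip_test(baseline_scores, augmented_scores)
print(round(mean_gain, 3), p_value)  # 0.036 0.0625
```

With only five paired runs the smallest attainable two-sided p-value is 2/32, which shows why the number of seeds matters when such tests are reported.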

Circularity Check

0 steps flagged

No circularity: method composes standard flow-matching with GMM subclassing and geometric regularization without reducing claims to fitted inputs or self-citations.

full rationale

The paper presents an offline augmentation pipeline that first fits a GMM to latent representations of each coarse label, then trains subclass-conditioned source distributions under explicit geometric constraints on displacement directions. These steps are algorithmic choices whose outputs (improved FID, IRS, balanced accuracy) are measured on held-out test sets and compared against a non-augmented baseline. No equation or claim equates a reported performance gain to a quantity defined solely by parameters fitted to the same metric; the derivation chain relies on external benchmarks (MIMIC-LT, NIH-LT, CT-RATE) and does not invoke self-citations as load-bearing uniqueness theorems. The central improvements therefore remain falsifiable and independent of the method's internal fitting procedure.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The approach rests on standard assumptions from generative modeling and clustering; free parameters include the number of mixture components and geometric control strength, which are likely tuned to data.

free parameters (2)

Number of GMM components per coarse label
Determines how many submodes are extracted; chosen or tuned per dataset to capture coherent subpopulations.
Concentration strength for normalized displacement directions
Controls how tightly directions concentrate around prototypes; introduced to prevent degenerate solutions.

axioms (1)

domain assumption: Gaussian mixture models applied in the generative latent space can identify coherent submodes within coarse disease labels
Invoked to partition labels before learning subclass priors.

pith-pipeline@v0.9.0 · 5746 in / 1330 out tokens · 32871 ms · 2026-05-19T21:48:41.811518+00:00 · methodology

Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages

[1] Adaloglou, N., Kaiser, T., Michels, F., Kollmann, M.: Rethinking cluster-conditioned diffusion models for label-free image synthesis. In: WACV’25. pp. 3603–3613. IEEE (2025)
[2] Bao, F., Li, C., Sun, J., Zhu, J.: Why are conditional generative models better than unconditional ones? In: NeurIPS’22 Workshop on Score-Based Methods (2022)
[3] Boecking, B., Usuyama, N., Bannur, S., Castro, D.C., Schwaighofer, A., Hyland, S., Wetscherek, M., Naumann, T., Nori, A., Alvarez-Valle, J., Poon, H., Oktay, O.: Making the most of text semantics to improve biomedical vision–language processing. In: ECCV’22. pp. 1–21. Springer (2022)
[4] Chen, S., Zhou, X., Wang, Y., Huang, Y., Chang, A., Ni, D., Huang, R.: Subtyping breast lesions via generative augmentation based long-tailed recognition in ultrasound. In: MICCAI’25. LNCS, vol. 15967, pp. 519–529. Springer (2025)
[5] Dombrowski, M., Kainz, B.: Enabling PSO-secure synthetic data sharing using diversity-aware diffusion models. In: BRIDGE/DeCaF @ MICCAI’25. LNCS, vol. 16135, pp. 25–35. Springer (2026)
[6] Dombrowski, M., Nützel, F., Kainz, B.: LCMem: A universal model for robust image memorization detection (2025)
[7] Dombrowski, M., Zhang, W., Cechnicka, S., Reynaud, H., Kainz, B.: Image generation diversity issues and how to tame them. In: CVPR’25. pp. 3029–3039 (2025)
[8] Hamamci, I.E., Er, S., Almas, F., Simsek, A.G., Esirgun, S.N., Dogan, I., Dasdelen, M.F., Durugol, O.F., Wittmann, B., Amiranashvili, T., Simsar, E., Simsar, M., Erdemir, E.B., Alanbay, A., Sekuboyina, A., Lafci, B., Bluethgen, C., Ozdemir, M.K., Menze, B.: Developing generalist foundation models from a multimodal dataset for 3D computed tomography (2024)
[9] Holste, G., Wang, S., Jiang, Z., Shen, T.C., Shih, G., Summers, R.M., Peng, Y., Wang, Z.: Long-tailed classification of thorax diseases on chest X-ray: A new benchmark study. In: MICCAI-DALI’22. pp. 22–32. Springer (2022)
[10] Hou, C., Zhang, J., Wang, H., Zhou, T.: Subclass-balancing contrastive learning for long-tailed recognition. In: ICCV’23. pp. 5372–5384. IEEE (2023)
[11] Issachar, N., Salama, M., Fattal, R., Benaim, S.: Designing a conditional prior distribution for flow-based generative models (2025)
[12] Johnson, A., Lungren, M., Peng, Y., Lu, Z., Mark, R., Berkowitz, S., Horng, S.: MIMIC-CXR-JPG — chest radiographs with structured labels. PhysioNet (2024), version 2.1.0
[13] Kim, J., Park, J., Jeon, S., Kim, S.: Better source, better flow: Learning condition-dependent source distribution for flow matching (2026)
[14] Ktena, I., Wiles, O., Albuquerque, I., Rebuffi, S.A., Tanno, R., Roy, A.G., Azizi, S., Belgrave, D., Kohli, P., Cemgil, T., Karthikesalingam, A., Gowal, S.: Generative models improve fairness of medical classifiers under distribution shifts. Nat. Med. 30(4), 1166–1173 (2024)
[15] Lee, J., Kim, K., Lee, J.: Is there a better source distribution than gaussian? Exploring source distributions for image flow matching. TMLR (2026)
[16] gil Lee, S., Kim, H., Shin, C., Tan, X., Liu, C., Meng, Q., Qin, T., Chen, W., Yoon, S., Liu, T.Y.: PriorGrad: Improving conditional denoising diffusion models with data-dependent adaptive prior. In: ICLR’22 (2022)
[17] Lipman, Y., Chen, R.T.Q., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: ICLR’23 (2023)
[18] McIntosh-Smith, S., Alam, S.R., Woods, C.: Isambard-ai: a leadership class supercomputer optimised specifically for artificial intelligence (2024)
[19] Miao, Z., Wang, J., Wang, Z., Yang, Z., Wang, L., Qiu, Q., Liu, Z.: Training diffusion models towards diverse image generation with reinforcement learning. In: CVPR’24. pp. 10844–10853 (2024)
[20] Morshed, M.M., Boddeti, V.: DiverseFlow: Sample-efficient diverse mode coverage in flows. In: CVPR’25. pp. 23303–23312 (2025)
[21] Na, B., Kim, Y., Bae, H., Lee, J.H., Kwon, S.J., Kang, W., chul Moon, I.: Label-noise robust diffusion models. In: ICLR’24 (2024)
[22] Nie, W., Zhang, Z., Wang, W., Lepri, B., Liu, A., Sebe, N.: Causal disentanglement for robust long-tail medical image generation. arXiv preprint arXiv:2504.14450 (2025)
[23] Nützel, F., Dombrowski, M., Kainz, B.: GRASP: Guided residual adapters with sample-wise partitioning (2025)
[24] Oh, H., Choi, S., Baek, J., Kim, D.J., Joung, J.: FlawMatch: Conditional defect image generation via flow matching for improved surface defect classification. Adv. Eng. Inform. 68, 103704 (2025)
[25] Pooladian, A.A., Ben-Hamu, H., Domingo-Enrich, C., Amos, B., Lipman, Y., Chen, R.T.Q.: Multisample flow matching: Straightening flows with minibatch couplings. In: ICML’23. vol. 202, pp. 28100–28127. PMLR (2023)
[26] Qin, Y., Zheng, H., Yao, J., Zhou, M., Zhang, Y.: Class-balancing diffusion models. In: CVPR’23. pp. 18434–18443 (2023)
[27] Rajaraman, S., Liang, Z., Xue, Z., Antani, S.K.: Addressing class imbalance with latent diffusion-based data augmentation for improving disease classification in pediatric chest X-rays. In: IEEE BIBM’24. pp. 5059–5066. IEEE (2024)
[28] Sehwag, V., Hazirbas, C., Gordo, A., Ozgenel, F., Ferrer, C.C.: Generating high fidelity data from low-density regions using diffusion models. In: CVPR’22. pp. 11482–11491 (2022)
[29] da Silva Gonçalves, J., Manduchi, L., Vandenhirtz, M., Vogt, J.E.: TreeDiffusion: Hierarchical generative clustering for conditional diffusion. In: ECML PKDD’25. LNCS, vol. 16013, pp. 447–462. Springer (2026)
[30] Song, H., Gim, M., Choi, J.: Reweighted flow matching via unbalanced optimal transport for long-tailed generation. In: NeurIPS’25 Workshop: Reliable ML from Unreliable Data (2025)
[31] Um, S., Lee, S., Ye, J.C.: Don’t play favorites: Minority guidance for diffusion models. In: ICLR’24 (2024)
[32] Um, S., Ye, J.C.: Self-guided generation of minority samples using diffusion models. In: ECCV’24. pp. 414–430. Springer (2025)
[33] Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M.: ChestX-Ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: CVPR’17. pp. 2097–2106 (2017)
[34] Zhang, H., Liu, Y., Yang, J., Wan, S., Wang, X., Peng, W., Fua, P.: LeFusion: Controllable pathology synthesis via lesion-focused diffusion models. In: ICLR’25 (2025)
[35] Zhang, T., Zheng, H., Yao, J., Wang, X., Zhou, M., Zhang, Y., Wang, Y.: Long-tailed diffusion models with oriented calibration. In: ICLR’24 (2024)
