
arxiv: 2601.15884 · v2 · submitted 2026-01-22 · 💻 cs.CV

Contrast-X: A Multi-Modal Contrast Image Synthesis Benchmark and Universal Modality Flow Matching

Pith reviewed 2026-05-16 11:59 UTC · model grok-4.3

classification 💻 cs.CV
keywords contrast synthesis · multi-modal medical imaging · flow matching · missing modality · CT · DCE-MRI · benchmark dataset · image-to-image translation

The pith

A single flow-matching model in a shared latent space can synthesize contrast-enhanced images from any available subset of non-contrast modalities across organs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Contrast-X, a paired dataset of non-contrast and contrast-enhanced CT scans covering 10 organs from 1,526 patients plus multi-phase breast MRI from 1,116 patients, all with radiologist-verified phase labels and tumor masks. It introduces FlowMI, a single model that learns a unified multi-modal latent space and uses flow matching to generate the missing contrast-enhanced images regardless of which input modalities are present. This addresses the clinical reality that contrast agents cannot be given to every patient who needs diagnostic imaging. If the approach holds, it would let clinicians produce usable contrast views from whatever scans are already available without retraining for each organ or modality combination.

Core claim

The central claim is that contrast enhancement can be treated as a learnable mapping in a modality-agnostic latent space: FlowMI conditions on any observed non-contrast inputs, transports probability mass via flow matching to the corresponding contrast-enhanced distribution, and produces images whose quality supports downstream lesion analysis and radiologist interpretation even under arbitrary missing-modality patterns and across organs.
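For orientation, the transport step can be illustrated with the linear-path (rectified-flow style) form of flow matching. This is a minimal sketch under stated assumptions, not the paper's implementation: the summary does not say which probability path, solver, or latent size FlowMI uses, and treating the non-contrast latent as the source endpoint is itself an assumption (the source could equally be noise, with the inputs used only as conditioning). All function names are hypothetical.

```python
import random

def interpolate(z0, z1, t):
    """Point on the straight path from source latent z0 to target latent z1 at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]

def target_velocity(z0, z1):
    """Velocity of the straight path; it is constant in t for a linear interpolant."""
    return [b - a for a, b in zip(z0, z1)]

def flow_matching_loss(velocity_fn, z0, z1, t):
    """Mean squared error between a predicted velocity and the straight-path target at time t."""
    zt = interpolate(z0, z1, t)
    pred = velocity_fn(zt, t)
    target = target_velocity(z0, z1)
    return sum((p - u) ** 2 for p, u in zip(pred, target)) / len(target)

def euler_sample(velocity_fn, z0, steps=10):
    """Integrate dz/dt = v(z, t) from t = 0 to t = 1 with fixed-step Euler."""
    z = list(z0)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(z, k * dt)
        z = [zi + dt * vi for zi, vi in zip(z, v)]
    return z

# Toy check with hypothetical 4-dimensional latents.
random.seed(0)
z_noncontrast = [random.gauss(0.0, 1.0) for _ in range(4)]
z_contrast = [random.gauss(0.0, 1.0) for _ in range(4)]

# An "oracle" velocity field that already knows the pair.
# A trained network would only approximate this field.
oracle = lambda z, t: target_velocity(z_noncontrast, z_contrast)

loss = flow_matching_loss(oracle, z_noncontrast, z_contrast, t=0.3)
z_end = euler_sample(oracle, z_noncontrast, steps=10)
```

With the oracle field the loss is zero and the Euler trajectory ends at the target latent. In training, the same loss would be averaged over random pairs and random times to fit a network in place of the oracle.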

What carries the argument

FlowMI, a flow-matching generative model that operates inside a single learned multi-modal latent space conditioned on whichever input modalities are provided.
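The simulated rebuttal further down this page describes the conditioning as a binary modality mask passed through a small MLP and added to a sinusoidal time embedding. The sketch below illustrates that description only. The modality names, layer sizes, ReLU activation, and the fixed random weights standing in for learned parameters are all assumptions made for illustration.

```python
import math
import random

# Hypothetical modality names; the page does not list the actual channels.
MODALITIES = ["ct", "dce_pre", "dce_post1", "dce_post2"]

def modality_mask(available):
    """Binary mask: 1.0 where a modality is observed, 0.0 where it is missing."""
    return [1.0 if m in available else 0.0 for m in MODALITIES]

def sinusoidal_embedding(t, dim):
    """Sinusoidal embedding of a scalar time t; dim must be even."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Two-layer bias-free MLP with fixed random weights (a stand-in for trained weights)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)] for _ in range(out_dim)]

    def forward(x):
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]  # ReLU
        return [sum(w * hi for w, hi in zip(row, hidden)) for row in w2]

    return forward

def conditioning_vector(available, t, mask_mlp, dim=8):
    """Mask embedding plus time embedding, the vector a velocity network would receive."""
    mask_emb = mask_mlp(modality_mask(available))
    time_emb = sinusoidal_embedding(t, dim)
    return [m + s for m, s in zip(mask_emb, time_emb)]

mask_mlp = make_mlp(in_dim=len(MODALITIES), hidden_dim=16, out_dim=8)
cond_ct_only = conditioning_vector({"ct"}, t=0.5, mask_mlp=mask_mlp)
cond_ct_plus_pre = conditioning_vector({"ct", "dce_pre"}, t=0.5, mask_mlp=mask_mlp)
```

The same `mask_mlp` weights serve every subset, which is the property the referee asks the authors to demonstrate. Whether shared weights avoid implicit per-pattern specialization is an empirical question this sketch cannot answer.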

If this is right

  • The model produces usable contrast-enhanced images under every tested pattern of missing modalities.
  • Cross-organ generalization holds without additional per-organ adaptation.
  • Synthesized images retain enough lesion signal inside the annotated tumor masks to support downstream lesion analysis.
  • Reader studies indicate clinical acceptability on standard image-quality and diagnostic criteria.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Clinics could reduce contrast-agent exposure for patients with renal impairment by synthesizing the needed views from existing non-contrast acquisitions.
  • The released paired dataset and leaderboard provide a common test bed that future synthesis methods can be measured against directly.
  • The same latent-space flow-matching design might extend to other enhancement tasks such as synthesizing functional from structural sequences in different imaging domains.

Load-bearing premise

That contrast enhancement behaves as a consistent, modality-independent transformation that a single latent space and flow-matching process can capture without organ-specific or modality-specific retraining.

What would settle it

Radiologist reader studies or automated lesion segmentation on synthesized images from a held-out organ or modality combination showing statistically significant drops in diagnostic accuracy compared with real contrast scans.
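One way to check for such a drop on paired cases is a Wilcoxon signed-rank test, which the simulated rebuttal on this page also names. The sketch below is a stdlib-only illustration using the normal approximation without tie or continuity correction, so the p-value is approximate and an exact test would be preferable for small samples. The per-case scores are invented for the example and are not results from the paper.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Returns (W, p) with W = min(W+, W-). Zero differences are dropped.
    """
    # Round so that differences equal up to float noise are treated as ties.
    diffs = [round(a - b, 9) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0.0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mean) / sd
    p = min(1.0, math.erfc(abs(z) / math.sqrt(2.0)))
    return w, p

# Invented per-case segmentation scores: real contrast scans vs synthesized scans.
real_scores = [0.91, 0.88, 0.93, 0.85, 0.90, 0.87, 0.92, 0.89]
synth_scores = [0.86, 0.84, 0.90, 0.80, 0.88, 0.81, 0.85, 0.83]

w, p = wilcoxon_signed_rank(real_scores, synth_scores)
```

A small p-value here would indicate that the synthesized scans score systematically lower on the held-out cases. Effect sizes and a reader study would still be needed to judge whether the difference matters clinically.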

Figures

Figures reproduced from arXiv: 2601.15884 by Chao Li, Fei Yin, Hao Chen, Jia Wu, Yifan Chen.

Figure 1: Top: representative organ-system with paired contrast and non-contrast scans. The charts imply organ-wise composition and modality balance. Bottom: standardized curation pipeline including data collection, cleaning, pairing, and registration.
Figure 2: Representative task settings with examples. (a) CT …
Figure 3: Overview of the proposed FlowMI framework.
Figure 4: CT→CTC (liver). Red circles mark tumor regions. Input CT shows no clear lesion, ground-truth CTC shows bright enhancement. Most methods under-enhance the tumor, while ours recover the correct signal. Alongside each result, the blue residual maps visualize differences from ground truth (darker indicates larger error).
Original abstract

Contrast-enhanced imaging is central to oncologic diagnosis, but contrast agents can be contraindicated for many of the patients who need them most. Synthesizing contrast scans from non-contrast inputs is the natural response. Two obstacles stand in the way: no benchmark provides paired contrast data with lesion-level evaluation, and no single model handles the arbitrary missing patterns seen in practice. We introduce Contrast-X, a benchmark of paired contrast-enhanced and non-contrast imaging spanning 10 organs in CT (1{,}526 patients) and multi-phase breast DCE-MRI (1116 patients). Every case carries radiologist-verified phase labels and tumor masks. We further propose FlowMI, a single model that handles arbitrary subsets of available modalities through a unified multi-modal latent space and flow matching. We benchmark a range of missing-modality configurations, reporting standard image-quality metrics, radiologist reader studies, and downstream lesion analysis on the synthesized scans. We further evaluate cross-organ generalization to test whether the model has learned a transferable contrast-enhancement operation. Dataset, code, and leaderboard will be released. Our code are available at https://github.com/YifanChen02/Contrast-X.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript introduces Contrast-X, a benchmark of paired contrast-enhanced and non-contrast CT scans (1,526 patients across 10 organs) and multi-phase breast DCE-MRI (1,116 patients), each with radiologist-verified phase labels and tumor masks. It proposes FlowMI, a single model that synthesizes contrast-enhanced images from arbitrary non-empty subsets of available modalities via a unified multi-modal latent space and flow matching, with evaluations on image-quality metrics, reader studies, lesion analysis, and cross-organ generalization.

Significance. If the central claims hold, the work would establish a much-needed standardized benchmark with lesion-level ground truth and a flexible single-model approach to missing-modality synthesis, directly addressing clinical constraints on contrast agents. Public release of the dataset, code, and leaderboard would further amplify impact by enabling reproducible research in multi-modal medical image synthesis.

major comments (3)
  1. [FlowMI model description] FlowMI model description: the claim that a single model handles arbitrary modality subsets through a unified latent space and flow matching requires explicit specification of how the flow ODE is conditioned on the modality mask; without this, it is unclear whether the architecture avoids implicit per-pattern specialization for common clinical inputs such as non-contrast CT plus one MRI phase.
  2. [Cross-organ generalization evaluation] Cross-organ generalization evaluation: the reported results on transferability of the contrast-enhancement operation across 10 organs lack statistical significance testing or per-organ breakdowns, which is load-bearing for the universality claim given the anatomical and contrast-uptake diversity.
  3. [Benchmark construction] Benchmark construction: while paired data with tumor masks is a strength, the paper must clarify the exact train/validation/test splits and whether any modality-specific preprocessing was applied, as this directly affects whether the unified latent space truly operates without hidden adaptations.
minor comments (3)
  1. [Abstract] Abstract: 'Our code are available' contains a grammatical error and should read 'Our code is available'.
  2. [Dataset statistics] Dataset statistics: inconsistent number formatting (1{,}526) should be standardized to 1,526 throughout.
  3. [Figure captions] Figure captions: ensure all missing-modality configuration diagrams explicitly label the input mask patterns used during inference.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below and will incorporate the suggested clarifications and additions into the revised version.

Point-by-point responses
  1. Referee: [FlowMI model description] FlowMI model description: the claim that a single model handles arbitrary modality subsets through a unified latent space and flow matching requires explicit specification of how the flow ODE is conditioned on the modality mask; without this, it is unclear whether the architecture avoids implicit per-pattern specialization for common clinical inputs such as non-contrast CT plus one MRI phase.

    Authors: We appreciate the request for greater specificity on the conditioning mechanism. The flow ODE is conditioned by first encoding the binary modality mask into a fixed-dimensional embedding via a small MLP; this embedding is then added to the sinusoidal time embedding before being fed into the velocity network at every integration step. Because the mask is provided explicitly as input rather than learned per pattern, the same network weights are used for all subsets. In the revision we will add the precise mathematical formulation of the conditioned ODE, a pseudocode listing, and an updated architecture diagram to eliminate any ambiguity. revision: yes

  2. Referee: [Cross-organ generalization evaluation] Cross-organ generalization evaluation: the reported results on transferability of the contrast-enhancement operation across 10 organs lack statistical significance testing or per-organ breakdowns, which is load-bearing for the universality claim given the anatomical and contrast-uptake diversity.

    Authors: We agree that per-organ breakdowns and statistical testing are necessary to substantiate the cross-organ claim. In the revised manuscript we will report mean and standard deviation of all image-quality and lesion-analysis metrics for each of the 10 organs separately. We will also add paired statistical tests (Wilcoxon signed-rank) comparing performance on organs seen during training versus the held-out organs, with p-values and effect sizes. These results will be presented in a new table and discussed in the text. revision: yes

  3. Referee: [Benchmark construction] Benchmark construction: while paired data with tumor masks is a strength, the paper must clarify the exact train/validation/test splits and whether any modality-specific preprocessing was applied, as this directly affects whether the unified latent space truly operates without hidden adaptations.

    Authors: We will add a dedicated subsection detailing the construction protocol. The 1,526 CT and 1,116 MRI cases are split patient-wise in a 70/15/15 ratio for train/validation/test, ensuring no patient overlap across sets. All modalities receive identical preprocessing: intensity clipping to the 0.5–99.5 percentiles, z-score normalization using dataset-wide statistics, and isotropic resampling to 1 mm³ voxels. No modality-specific pipelines or augmentations are used. These clarifications will be inserted into the Dataset section and will be accompanied by a supplementary table listing the exact patient counts per split and organ. revision: yes
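The protocol in the third response can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: it assumes the percentiles are computed per volume, that the dataset-wide mean and standard deviation are supplied by the caller, and that the split is a seeded shuffle over unique patient IDs. Isotropic resampling is omitted because it needs an imaging library. The function names and the ID format are hypothetical.

```python
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def clip_and_normalize(voxels, mean, std, lo_q=0.5, hi_q=99.5):
    """Clip intensities to the [lo_q, hi_q] percentiles, then z-score with given statistics."""
    ordered = sorted(voxels)
    lo, hi = percentile(ordered, lo_q), percentile(ordered, hi_q)
    clipped = [min(max(v, lo), hi) for v in voxels]
    return [(v - mean) / std for v in clipped]

def patient_split(patient_ids, ratios=(0.70, 0.15, 0.15), seed=0):
    """Split unique patient IDs into train/validation/test with no patient in two sets."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_train = int(round(ratios[0] * len(ids)))
    n_val = int(round(ratios[1] * len(ids)))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# Toy volume flattened to a list, with made-up dataset-wide statistics.
volume = [float(v) for v in range(1001)]
normalized = clip_and_normalize(volume, mean=500.0, std=100.0)

# Hypothetical IDs matching the reported CT patient count.
ct_patients = [f"P{i:04d}" for i in range(1526)]
train_ids, val_ids, test_ids = patient_split(ct_patients)
```

Splitting on patient IDs rather than on individual scans is what prevents the same patient from appearing in two sets, which matters when a patient contributes several phases or organs.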

Circularity Check

0 steps flagged

No circularity in benchmark construction or FlowMI derivation


The paper introduces a new paired dataset (Contrast-X) with radiologist-verified labels and proposes FlowMI as a flow-matching model trained directly on that data using a unified latent space. No derivation step reduces by construction to its own inputs, no fitted parameter is relabeled as a prediction, and no load-bearing claim rests on self-citation chains. The central claim is an empirical training result on external data rather than a self-referential identity.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim depends on standard generative modeling assumptions and the availability of paired training data.

free parameters (1)
  • model hyperparameters
    Various parameters in the flow matching model are optimized during training on the dataset.
axioms (1)
  • domain assumption: Multi-modal medical images can be embedded into a unified latent space
    The model relies on this to handle arbitrary modality combinations.
invented entities (1)
  • FlowMI model (no independent evidence)
    purpose: To perform universal modality contrast synthesis
    The model is proposed in this paper without external validation beyond the described experiments.

pith-pipeline@v0.9.0 · 5510 in / 1231 out tokens · 39204 ms · 2026-05-16T11:59:05.326693+00:00 · methodology



Reference graph

Works this paper leans on

67 extracted references · 67 canonical work pages · 1 internal anchor

  1. [1]

    Gans trained by a two time-scale update rule converge to a local nash equilibrium,

    Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter, “Gans trained by a two time-scale update rule converge to a local nash equilibrium,”Advances in neural information processing systems, vol. 30, 2017

  2. [2]

    High-resolution image synthesis with latent diffu- sion models,

    Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer, “High-resolution image synthesis with latent diffu- sion models,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 10684–10695

  3. [3]

    Scalable diffusion models with transformers,

    William Peebles and Saining Xie, “Scalable diffusion models with transformers,” inICCV, 2023, pp. 4195–4205

  4. [4]

    , author Kabas, B

    Omer F Atli, Bilal Kabas, Fuat Arslan, Arda C Demirtas, Mahmut Yurt, Onat Dalmaz, and Tolga Cukur, “I2i-mamba: Multi-modal medical image synthesis via selective state space modeling,”arXiv preprint arXiv:2405.14022, 2024

  5. [5]

    Multimodal mr synthesis via modality-invariant latent representation,

    Agisilaos Chartsias, Thomas Joyce, Mario Valerio Giuffrida, and Sotirios A. Tsaftaris, “Multimodal mr synthesis via modality-invariant latent representation,”IEEE Transactions on Medical Imaging, vol. 37, pp. 803–814, 3 2018

  6. [6]

    arXiv preprint arXiv:2405.18368 (2024)

    Maria Correia de Verdier, Rachit Saluja, Louis Gagnon, Dominic LaBella, Ujjwall Baid, Nourel Hoda Tahon, Martha Foltyn-Dumitru, Jikai Zhang, Maram Alafif, Saif Baig, et al., “The 2024 brain tumor segmentation (brats) challenge: Glioma segmentation on post-treatment mri,”arXiv preprint arXiv:2405.18368, 2024, https://arxiv.org/abs/2405. 18368

  7. [7]

    Amos: A large-scale abdominal multi-organ benchmark for versatile medical image segmentation,

    Yuanfeng Ji, Haotian Bai, Chongjian Ge, Jie Yang, Ye Zhu, Ruimao Zhang, Zhen Li, Lingyan Zhanng, Wanling Ma, Xiang Wan, et al., “Amos: A large-scale abdominal multi-organ benchmark for versatile medical image segmentation,”Advances in neural information process- ing systems, vol. 35, pp. 36722–36732, 2022

  8. [8]

    The cancer imaging archive (tcia): maintaining and operating a public information repository,

    Kenneth Clark, Bruce Vendt, Kirk Smith, John Freymann, Justin Kirby, Paul Koppel, Stephen Moore, Stanley Phillips, David Maffitt, Michael Pringle, Lawrence Tarbox, and Fred Prior, “The cancer imaging archive (tcia): maintaining and operating a public information repository,” 2013, TCIA website: https://www.cancerimagingarchive.net/

  9. [9]

    Ct-org, a new dataset for multiple organ segmentation in computed tomography,

    Blaine Rister, Darvin Yi, Kaushik Shivakumar, Tomomi Nobashi, and Daniel L Rubin, “Ct-org, a new dataset for multiple organ segmentation in computed tomography,”Scientific Data, vol. 7, no. 1, pp. 381, 2020. TABLE IV: Detailed statistics of thePMPBenchdataset. Modality Overall of Modality System Overall of Dataset Organ Overall of Organ Source Dataset Ove...

  10. [10]

    Unpaired image-to-image translation using cycle-consistent adversarial networks,

    Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” inProceedings of the IEEE international conference on computer vision, 2017, pp. 2223–2232

  11. [11]

    Diffusion models trained with large data are transferable visual models,

    Guangkai Xu, Yongtao Ge, Mingyu Liu, Chengxiang Fan, Kangyang Xie, Zhiyue Zhao, Hao Chen, and Chunhua Shen, “Diffusion models trained with large data are transferable visual models,”arXiv e-prints, pp. arXiv–2403, 2024

  12. [12]

    Restoring vision in adverse weather conditions with patch-based denoising diffusion models,

    Ozan ¨Ozdenizci and Robert Legenstein, “Restoring vision in adverse weather conditions with patch-based denoising diffusion models,”IEEE TPAMI, vol. 45, no. 8, pp. 10346–10357, 2023

  13. [14]

    Learning patient-specific disease dynamics with latent flow matching for longitudinal imaging generation.arXiv preprint arXiv:2512.09185, 2025

    Hao Chen, Rui Yin, Yifan Chen, Qi Chen, and Chao Li, “Learning patient-specific disease dynamics with latent flow matching for longitu- dinal imaging generation,”arXiv preprint arXiv:2512.09185, 2025

  14. [15]

    Image- to-image translation with conditional adversarial networks,

    Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros, “Image- to-image translation with conditional adversarial networks,” inProceed- ings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1125–1134

  15. [16]

    Palette: Image-to-image diffusion models,

    Chitwan Saharia, William Chan, Huiwen Chang, Chris Lee, Jonathan Ho, Tim Salimans, David Fleet, and Mohammad Norouzi, “Palette: Image-to-image diffusion models,” inACM SIGGRAPH 2022 conference proceedings, 2022, pp. 1–10

  16. [17]

    Mambair: A simple baseline for image restoration with state- space model,

    Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, and Shu- Tao Xia, “Mambair: A simple baseline for image restoration with state- space model,” inEuropean conference on computer vision. Springer, 2024, pp. 222–241

  17. [18]

    elastix: a tool- box for intensity-based medical image registration,

    Stefan Klein, Margreet Staring, Koen Murphy, and et al., “elastix: a tool- box for intensity-based medical image registration,”IEEE Transactions on Medical Imaging, vol. 29, no. 1, pp. 196–205, 2010

  18. [19]

    Multi-modal learning from unpaired images: Application to multi-organ segmentation in ct and mri,

    Vanya V Valindria, Nick Pawlowski, Martin Rajchl, Ioannis Lavdas, Eric O Aboagye, Andrea G Rockall, Daniel Rueckert, and Ben Glocker, “Multi-modal learning from unpaired images: Application to multi-organ segmentation in ct and mri,” in2018 IEEE winter conference on applications of computer vision (WACV). IEEE, 2018, pp. 547–556,

  19. [20]

    arXiv preprint arXiv:2504.12527 (2025)

    Nazanin Maleki, Raisa Amiruddin, Ahmed W Moawad, Nikolay Yor- danov, Athanasios Gkampenis, Pascal Fehringer, Fabian Umeh, Crys- tal Chukwurah, Fatima Memon, Bojan Petrovic, et al., “Analysis of the miccai brain tumor segmentation–metastases (brats-mets) 2025 lighthouse challenge: Brain metastasis segmentation on pre-and post- treatment mri,”arXiv preprint...

  20. [21]

    Ixi dataset (rrid:scr 005839),

    “Ixi dataset (rrid:scr 005839),” http://brain-development.org/ ixi-dataset/

  21. [22]

    Crossmoda 2021 challenge: Bench- mark of cross-modality domain adaptation techniques for vestibular schwannoma and cochlea segmentation,

    Reuben Dorent, Aaron Kujawa, Marina Ivory, Spyridon Bakas, Nicola Rieke, Samuel Joutard, Ben Glocker, Jorge Cardoso, Marc Modat, Kayhan Batmanghelich, et al., “Crossmoda 2021 challenge: Bench- mark of cross-modality domain adaptation techniques for vestibular schwannoma and cochlea segmentation,”Medical Image Analysis, vol. 83, pp. 102628, 2023, https://w...

  22. [23]

    Deep learning techniques for automatic mri cardiac multi-structures segmentation and diagnosis: is the problem solved?,

    Olivier Bernard, Alain Lalande, Clement Zotti, Frederick Cervenansky, Xin Yang, Pheng-Ann Heng, Irem Cetin, Karim Lekadir, Oscar Camara, Miguel Angel Gonzalez Ballester, et al., “Deep learning techniques for automatic mri cardiac multi-structures segmentation and diagnosis: is the problem solved?,”IEEE transactions on medical imaging, vol. 37, no. 11, pp....

  23. [24]

    Multivariate mixture model for myocardial segmen- tation combining multi-source images,

    Xiahai Zhuang, “Multivariate mixture model for myocardial segmen- tation combining multi-source images,”IEEE transactions on pattern analysis and machine intelligence, vol. 41, no. 12, pp. 2933–2946, 2018, https://ieeexplore.ieee.org/abstract/document/8458220

  24. [25]

    A whole-body fdg-pet/ct dataset with manually annotated tumor lesions,

    Sergios Gatidis, Tobias Hepp, Marcel Fr ¨uh, Christian La Foug `ere, Kon- stantin Nikolaou, Christina Pfannenberg, Bernhard Sch ¨olkopf, Thomas K¨ustner, Clemens Cyran, and Daniel Rubin, “A whole-body fdg-pet/ct dataset with manually annotated tumor lesions,”Scientific Data, vol. 9, no. 1, pp. 601, 2022, https://www.scopus.com/pages/publications/ 85139504476

  25. [26]

    The brain tumor segmentation (brats) challenge 2023: Brain mr image synthesis for tumor segmentation (brasyn),

    Hongwei Bran Li, Gian Marco Conte, Qingqiao Hu, Syed Muhammad Anwar, Florian Kofler, Ivan Ezhov, Koen van Leemput, Marie Piraud, Maria Diaz, Byrone Cole, et al., “The brain tumor segmentation (brats) challenge 2023: Brain mr image synthesis for tumor segmentation (brasyn),”ArXiv, pp. arXiv–2305, 2024, https://pmc.ncbi.nlm.nih.gov/ articles/PMC10441440/

  26. [27]

    Oasis-3: longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and alzheimer disease,

    Pamela J LaMontagne, Tammie LS Benzinger, John C Morris, Sarah Keefe, Russ Hornbeck, Chengjie Xiong, Elizabeth Grant, Jason Hassen- stab, Krista Moulder, Andrei G Vlassenko, et al., “Oasis-3: longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and alzheimer disease,”medrxiv, pp. 2019–12, 2019, https://www.medrxiv. org/content/10.11...

  27. [28]

    The alzheimer’s disease neuroimaging initiative-4 (adni-4) engagement core: A culturally informed, community-engaged research (ci-cer) model to advance brain health equity,

    M ´onica Rivera Mindt, Alyssa Arentoft, Amanda T Calcetas, Vanessa A Guzman, Hannatu Amaza, Adeyinka Ajayi, Miriam T Ashford, Omobolanle Ayo, Lisa L Barnes, Alicia Camuy, et al., “The alzheimer’s disease neuroimaging initiative-4 (adni-4) engagement core: A culturally informed, community-engaged research (ci-cer) model to advance brain health equity,”Alzh...

  28. [29]

    Ultra-low dose pet imaging challenge 2024 (udpet),

    “Ultra-low dose pet imaging challenge 2024 (udpet),” Dataset and challenge information available online, 2024, Accessed via https: //udpet-challenge.github.io/

  29. [30]

    Flow matching for generative modeling,

    Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le, “Flow matching for generative modeling,” inThe Eleventh International Conference on Learning Representations, 2023

  30. [31]

    Flow straight and fast: Learning to generate and transfer data with rectified flow,

    Xingchao Liu, Chengyue Gong, et al., “Flow straight and fast: Learning to generate and transfer data with rectified flow,” inThe Eleventh International Conference on Learning Representations, 2022

  31. [32]

    Flow matching in latent space

    Quan Dao, Hao Phung, Binh Nguyen, and Anh Tran, “Flow matching in latent space,”arXiv preprint arXiv:2307.08698, 2023

  32. [33]

    Lbm: Latent bridge match- ing for fast image-to-image translation.arXiv preprint arXiv:2503.07535, 2025

    Cl ´ement Chadebec, Onur Tasar, Sanjeev Sreetharan, and Benjamin Aubin, “Lbm: Latent bridge matching for fast image-to-image trans- lation,”arXiv preprint arXiv:2503.07535, 2025

  33. [34]

    Hi- net: hybrid-fusion network for multi-modal mr image synthesis,

    Tao Zhou, Huazhu Fu, Geng Chen, Jianbing Shen, and Ling Shao, “Hi- net: hybrid-fusion network for multi-modal mr image synthesis,”IEEE transactions on medical imaging, vol. 39, no. 9, pp. 2772–2781, 2020

  34. [35]

    Resvit: Residual vision transformers for multimodal medical image synthesis,

    Onat Dalmaz, Mahmut Yurt, and Tolga C ¸ ukur, “Resvit: Residual vision transformers for multimodal medical image synthesis,”IEEE Transactions on Medical Imaging, vol. 41, no. 10, pp. 2598–2614, 2022

  35. [36]

    Self-consistent recursive diffusion bridge for medical image translation,

    Fuat Arslan, Bilal Kabas, Onat Dalmaz, Muzaffer Ozbey, and Tolga C ¸ ukur, “Self-consistent recursive diffusion bridge for medical image translation,”Medical Image Analysis, vol. 106, pp. 103747, 2025

  36. [37]

    Flow Matching for Generative Modeling

    Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le, “Flow matching for generative modeling,”arXiv preprint arXiv:2210.02747, 2022

  37. [38]

    Posterior-mean rectified flow: Towards minimum mse photo-realistic image restoration,

    Guy Ohayon, Tomer Michaeli, and Michael Elad, “Posterior-mean rectified flow: Towards minimum mse photo-realistic image restoration,” arXiv preprint arXiv:2410.00418, 2024

  38. [39]

    U-net: Con- volutional networks for biomedical image segmentation,

    Olaf Ronneberger, Philipp Fischer, and Thomas Brox, “U-net: Con- volutional networks for biomedical image segmentation,” inMedical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18. Springer, 2015, pp. 234–241

  39. [40]

    Restore-rwkv: Efficient and effective medical image restoration with rwkv,

    Zhiwen Yang, Jiayin Li, Hui Zhang, Dan Zhao, Bingzheng Wei, and Yan Xu, “Restore-rwkv: Efficient and effective medical image restoration with rwkv,” 2024

  40. [41]

    Effective diffusion transformer architecture for image super-resolution,

    Kun Cheng, Lei Yu, Zhijun Tu, Xiao He, Liyu Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, and Jie Hu, “Effective diffusion transformer architecture for image super-resolution,” inProceedings of the AAAI Conference on Artificial Intelligence, 2025, pp. 2455–2463

  41. [42]

    V oxel-level segmentation of pathologically-proven adrenocortical carcinoma with ki-67 expression (adrenal-acc-ki67-seg)[data set],

    Ahmed W Moawad, Ayahallah A Ahmed, Mohab ElMohr, Mohamed Eltaher, Mouhammed Amir Habra, Sarah Fisher, Nancy Perrier, Miao Zhang, David Fuentes, and Khaled Elsayes, “V oxel-level segmentation of pathologically-proven adrenocortical carcinoma with ki-67 expression (adrenal-acc-ki67-seg)[data set],”The Cancer Imaging Archive, vol. 8, 2023

  42. [43]

    The cancer genome atlas ovarian cancer collection (tcga-ov)(version 4)[data set],

    Chandra Holback, Rose Jarosz, Fred Prior, David G Mutch, Priya Bhosale, Kimberly Garcia, Yueh Lee, Shanah Kirk, Cheryl A Sadow, Seth Levine, et al., “The cancer genome atlas ovarian cancer collection (tcga-ov)(version 4)[data set],”The Cancer Imaging Archive, vol. 10, pp. K9, 2016

  43. [44]

    The cancer genome atlas uterine corpus endometrial carcinoma collection (tcga-ucec),

    BJ Erickson, D Mutch, L Lippmann, and R Jarosz, “The cancer genome atlas uterine corpus endometrial carcinoma collection (tcga-ucec),”The Cancer Imaging Archive, 2016

  44. [45]

    The clinical proteomic tumor analysis consortium uterine corpus endometrial carcinoma collection (cptac-ucec), version 10 [dataset],

    National Cancer Institute Clinical Proteomic Tumor Analysis Con- sortium et al., “The clinical proteomic tumor analysis consortium uterine corpus endometrial carcinoma collection (cptac-ucec), version 10 [dataset],”The Cancer Imaging Archive, vol. 10, pp. k9, 2019

  45. [46]

    The cancer genome atlas stomach adenocarcinoma collection (tcga-stad),

    FR Lucchesi and ND Aredes, “The cancer genome atlas stomach adenocarcinoma collection (tcga-stad),”The Cancer Imaging Archive, 2016

  46. [47]

    The clinical proteomic tumor analysis consortium pancreatic ductal adenocarcinoma collection (cptac-pda) (version 15) [data set],

    National Cancer Institute Clinical Proteomic Tumor Analysis Consor- tium (CPTAC), “The clinical proteomic tumor analysis consortium pancreatic ductal adenocarcinoma collection (cptac-pda) (version 15) [data set],” 2018, The Cancer Imaging Archive. https://doi.org/10.7937/ k9/tcia.2018.sc20fo18

  47. [48]

    Multimodality annotated hcc cases with and without advanced imaging segmentation,

    AW Moawad, D Fuentes, A Morshid, AM Khalaf, MM Elmohr, A Abu- saif, JD Hazle, AO Kaseb, M Hassan, A Mahvash, et al., “Multimodality annotated hcc cases with and without advanced imaging segmentation,” The Cancer Imaging Archive (TCIA), 2021

  48. [49]

    The cancer genome atlas liver hepatocellular carcinoma collection (tcga- lihc)(version 5)[data set],

    Bradley J Erickson, Shanah Kirk, Y Lee, Oliver Bathe, Melissa Kearns, C Gerdes, Kimberly Rieger-Christ, and John Lemmerman, “The cancer genome atlas liver hepatocellular carcinoma collection (tcga- lihc)(version 5)[data set],”The Cancer Imaging Archive, 2016

  49. [50]

    2022.url:https://doi.org/10.7937/DJG7-GZ87

    Cancer Moonshot Biobank, “Cancer moonshot biobank – colorectal cancer collection (cmb-crc) (version 8) [data set],” 2022, The Cancer Imaging Archive. https://doi.org/10.7937/djg7-gz87

  50. [51]

    The cancer genome atlas colon adenocarcinoma collection (tcga- coad)(version 3)[data set],

    S Kirk, Y Lee, CA Sadow, S Levine, C Roche, E Bonaccio, and J Filiip- pini, “The cancer genome atlas colon adenocarcinoma collection (tcga- coad)(version 3)[data set],”The Cancer Imaging Archive. https://doi. org/10.7937 K, vol. 9, 2016

  51. [52]

    The cancer genome atlas urothelial bladder carcinoma collection (tcga-blca),

    Shanah Kirk, Yueh Lee, Fabiano R Lucchesi, Natalia D Aredes, Nicholas Gruszauskas, James Catto, Kimberly Garcia, Rose Jarosz, Vinay Duddalwar, Bino Varghese, et al., “The cancer genome atlas urothelial bladder carcinoma collection (tcga-blca),” in The Cancer Imaging Archive, 2016

  52. [53]

    The cancer genome atlas kidney renal clear cell carcinoma collection (tcga-kirc) (version 3) [data set],

    Oguz Akin, Pierre Elnajjar, Matthew Heller, Rose Jarosz, Bradley J Erickson, Shanah Kirk, Yueh Lee, Marston W Linehan, Rabindra Gautam, Raghu Vikram, et al., “The cancer genome atlas kidney renal clear cell carcinoma collection (tcga-kirc) (version 3) [data set],” Cancer Imaging Arch, 2016

  53. [54]

    C4KC KiTS Challenge Kidney Tumor Segmentation Dataset

    Nicholas Heller, Nithesh Sathianathen, Arveen Kalapara, Ethan Walczak, Kenneth Moore, Holly Kaluzniak, Jacob Rosenberg, Paul Blake, Zachary Rengel, Michael Oestreich, Joel Dean, Matthew Tradewell, Adeel Shah, Rishi Tejpaul, Zachary Edgerton, Matthew Peterson, Sohaib Raza, Samip Regmi, Nikolaos Papanikolopoulos, and Christopher Weight, “Data from c4kc-kits...

  54. [55]

    The clinical proteomic tumor analysis consortium clear cell renal cell carcinoma collection (cptac-ccrcc) (version 14) [data set],

    National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC), “The clinical proteomic tumor analysis consortium clear cell renal cell carcinoma collection (cptac-ccrcc) (version 14) [data set],” 2018, The Cancer Imaging Archive. https://doi.org/10.7937/k9/tcia.2018.oblamn27

  55. [56]

    The cancer genome atlas cervical kidney renal papillary cell carcinoma collection (tcga-kirp), version 4,

    Marston Linehan, R Gautam, S Kirk, Y Lee, C Roche, E Bonaccio, J Filippini, K Rieger-Christ, J Lemmerman, and R Jarosz, “The cancer genome atlas cervical kidney renal papillary cell carcinoma collection (tcga-kirp), version 4,” The Cancer Imaging Archive, 2016

  56. [57]

    The cancer genome atlas kidney chromophobe collection (tcga-kich) (version 3),

    M Linehan, R Gautam, C Sadow, and SJ Levine, “The cancer genome atlas kidney chromophobe collection (tcga-kich) (version 3),” The Cancer Imaging Archive, 2016

  57. [58]

    Cancer moonshot biobank-lung cancer collection (cmb-lca) (version 3) [dataset],

    Cancer Moonshot Biobank, “Cancer moonshot biobank-lung cancer collection (cmb-lca) (version 3) [dataset],” The Cancer Imaging Archive, 2022

  58. [59]

    The clinical proteomic tumor analysis consortium lung squamous cell carcinoma collection (cptac-lscc),

    National Cancer Institute Clinical Proteomic Tumor Analysis Consortium et al., “The clinical proteomic tumor analysis consortium lung squamous cell carcinoma collection (cptac-lscc),” 2018

  59. [60]

    The clinical proteomic tumor analysis consortium lung adenocarcinoma collection (cptac-luad) (version 12) [data set],

    National Cancer Institute Clinical Proteomic Tumor Analysis Consortium et al., “The clinical proteomic tumor analysis consortium lung adenocarcinoma collection (cptac-luad) (version 12) [data set],” The Cancer Imaging Archive. Published online, 2018

  60. [61]

    A large-scale ct and pet/ct dataset for lung cancer diagnosis (lung-pet-ct-dx) [data set],

    Peng Li, Shaoke Wang, Tianyu Li, Jie Lu, Yufan HuangFu, and Dong Wei Wang, “A large-scale ct and pet/ct dataset for lung cancer diagnosis (lung-pet-ct-dx) [data set],” 2020, The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2020.NNC2-0461

  61. [62]

    The cancer genome atlas lung squamous cell carcinoma collection (tcga-lusc) (version 4) [data set],

    S Kirk, Y Lee, P Kumar, J Filippini, B Albertina, M Watson, K Rieger-Christ, and J Lemmerman, “The cancer genome atlas lung squamous cell carcinoma collection (tcga-lusc) (version 4) [data set],” The Cancer Imaging Archive, 2016

  62. [63]

    Data from anti-pd-1 immunotherapy lung [data set],

    Pranathi Madhavi, Shweta Patel, and Anne S. Tsao, “Data from anti-pd-1 immunotherapy lung [data set],” 2019, The Cancer Imaging Archive. https://doi.org/10.7937/tcia.2019.zjjwb9ip

  63. [64]

    The cancer genome atlas breast invasive carcinoma collection (tcga-brca),

    Wilma Lingle, Bradley J Erickson, Margarita L Zuley, Rose Jarosz, Ermelinda Bonaccio, Joe Filippini, Jose M Net, Len Levi, Elizabeth A Morris, Gloria G Figler, et al., “The cancer genome atlas breast invasive carcinoma collection (tcga-brca),” 2016

  64. [65]

    Radiological tumour classification across imaging modality and histology,

    Jia Wu, Chao Li, Michael Gensheimer, Sukhmani Padda, Fumi Kato, Hiroki Shirato, Yiran Wei, Carola-Bibiane Schönlieb, Stephen John Price, David Jaffray, et al., “Radiological tumour classification across imaging modality and histology,” Nature Machine Intelligence, vol. 3, no. 9, pp. 787–798, 2021

  65. [66]

    Invasive breast cancer: predicting disease recurrence by using high-spatial-resolution signal enhancement ratio imaging,

    Ka-Loh Li, Savannah C Partridge, Bonnie N Joe, Jessica E Gibbs, Ying Lu, Laura J Esserman, and Nola M Hylton, “Invasive breast cancer: predicting disease recurrence by using high-spatial-resolution signal enhancement ratio imaging,” Radiology, vol. 248, no. 1, pp. 79–87, 2008

  66. [67]

    Optimized breast mri functional tumor volume as a biomarker of recurrence-free survival following neoadjuvant chemotherapy,

    Nazia F Jafri, David C Newitt, John Kornak, Laura J Esserman, Bonnie N Joe, and Nola M Hylton, “Optimized breast mri functional tumor volume as a biomarker of recurrence-free survival following neoadjuvant chemotherapy,” Journal of Magnetic Resonance Imaging, vol. 40, no. 2, pp. 476–482, 2014

  67. [68]

    Multi-center breast dce-mri data and segmentations from patients in the i-spy 1/acrin 6657 trials,

    David Newitt, Nola Hylton, et al., “Multi-center breast dce-mri data and segmentations from patients in the i-spy 1/acrin 6657 trials,” Cancer Imaging Arch, vol. 10, no. 7, pp. 2016, 2016