arXiv: 2508.11211 · v2 · submitted 2025-08-15 · eess.IV · cs.CV

Efficient Image-to-Image Schrödinger Bridge for CT Field of View Extension

Pith reviewed 2026-05-18 23:35 UTC · model grok-4.3

classification: eess.IV, cs.CV
keywords: CT field of view extension, Schrödinger Bridge, diffusion models, image-to-image mapping, truncated projections, medical image reconstruction, artifact reduction

The pith

An image-to-image Schrödinger Bridge learns direct stochastic mappings from limited-FOV to extended-FOV CT scans.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports that replacing noise-to-image diffusion with a direct bridge between paired limited and full field-of-view CT images produces lower reconstruction error and far faster inference. A reader would care because truncated CT projections currently force a choice between incomplete anatomy and slow iterative fixes, and a method that finishes in under a second per slice could bring reliable FOV extension into everyday clinical workflows. The direct mapping also keeps the generative steps traceable, which helps preserve consistent anatomical structures instead of synthesizing them from random noise.

Core claim

The I²SB model learns a direct stochastic mapping between paired limited-FOV and extended-FOV CT images rather than synthesizing from pure Gaussian noise. This produces RMSE values of 49.8 HU on simulated noisy data and 152.0 HU on real data while completing reconstruction in a single step that takes 0.19 seconds per 2D slice, more than 700 times faster than conditional DDPM.
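The speedup figures can be checked with a few lines of arithmetic. The sketch below uses only the per-slice timings quoted in the abstract (0.19 s, 0.58 s, 135 s); the dictionary keys and the helper name `speedup` are illustrative choices, not identifiers from the paper.

```python
# Inference times per 2D slice in seconds, as quoted in the abstract.
times_s = {"I2SB": 0.19, "DiffusionGAN": 0.58, "cDDPM": 135.0}


def speedup(baseline: str, method: str = "I2SB") -> float:
    """Return how many times faster `method` is than `baseline`."""
    return times_s[baseline] / times_s[method]


speedup_vs_cddpm = speedup("cDDPM")
speedup_vs_diffgan = speedup("DiffusionGAN")
print(f"vs cDDPM: {speedup_vs_cddpm:.0f}x, vs DiffusionGAN: {speedup_vs_diffgan:.1f}x")
```

The ratio against cDDPM comes out slightly above 700, which matches the "more than 700 times" wording; the margin over DiffusionGAN is roughly a factor of three.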

What carries the argument

The image-to-image Schrödinger Bridge, which learns a direct stochastic mapping between paired limited-FOV and extended-FOV images to replace iterative denoising from noise.
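To make the direct mapping concrete, here is a minimal NumPy sketch of the two ingredients usually associated with the I²SB formulation: the Gaussian bridge posterior between a clean image `x0` and a degraded image `x1`, and a single-step estimate of `x0` from `x1`. This page does not reproduce the paper's equations, so the discretization, the noise schedule `betas`, the toy images, and the oracle predictor standing in for a trained network are all assumptions made for illustration.

```python
import numpy as np


def sample_bridge(x0, x1, sigma_sq, sigma_bar_sq, rng):
    """Draw x_t from the Gaussian bridge posterior q(x_t | x0, x1).

    sigma_sq:     accumulated variance between the clean end (t=0) and t.
    sigma_bar_sq: accumulated variance between t and the degraded end (t=1).
    """
    denom = sigma_sq + sigma_bar_sq
    mean = (sigma_bar_sq / denom) * x0 + (sigma_sq / denom) * x1
    std = np.sqrt(sigma_sq * sigma_bar_sq / denom)
    return mean + std * rng.standard_normal(x0.shape)


def one_step_restore(x1, eps_net, sigma_1):
    """Single-step estimate of x0 from x1, given a predictor of the
    scaled residual (x_t - x0) / sigma_t evaluated at t = 1."""
    return x1 - sigma_1 * eps_net(x1)


rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 2e-2, 10)      # hypothetical noise schedule
sigma_sq = np.cumsum(betas)              # variance accumulated from t=0
sigma_bar_sq = sigma_sq[-1] - sigma_sq   # variance remaining until t=1

x0 = rng.normal(size=(8, 8))             # stand-in for an extended-FOV slice
x1 = x0.copy()
x1[:, :2] = 0.0                          # stand-in for a limited-FOV slice:
x1[:, -2:] = 0.0                         # the outer columns are missing

# At the final step the bridge collapses onto the degraded image x1.
x_end = sample_bridge(x0, x1, sigma_sq[-1], sigma_bar_sq[-1], rng)

# Oracle residual predictor, used here only to check the algebra.
sigma_1 = np.sqrt(sigma_sq[-1])
oracle = lambda x: (x - x0) / sigma_1
x0_hat = one_step_restore(x1, oracle, sigma_1)
```

With the oracle predictor the single step recovers `x0` exactly, which is the sense in which generation starts from the limited-FOV image rather than from Gaussian noise. A trained network would only approximate that residual.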

If this is right

  • Reconstruction finishes in 0.19 seconds per 2D slice, versus 135 seconds for cDDPM.
  • RMSE stays lower than cDDPM and patch-based diffusion on both simulated noisy and real data.
  • The traceable mapping improves anatomical consistency over noise-driven synthesis.
  • The speed-accuracy balance supports real-time or clinical deployment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same direct-mapping approach could address truncation artifacts in cone-beam CT or limited-angle tomography without new hardware.
  • Collecting paired data across multiple scanner models during training would likely improve robustness to real-world geometry differences.
  • Extending the one-step bridge to full 3D volumes would remove slice-wise inconsistencies that arise in current 2D processing.

Load-bearing premise

The method assumes paired limited-FOV and extended-FOV training images exist and that the learned mapping generalizes to unseen patient anatomies and scanner geometries without creating false structures.

What would settle it

Apply the trained model to real scans from a scanner model or patient population absent from training and measure whether RMSE exceeds 152 HU or new anatomical inconsistencies appear at FOV boundaries.
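A test of this kind reduces to computing RMSE in Hounsfield units on held-out slices and comparing it with the reported figure. The sketch below is one way to do that, assuming predictions and references are already expressed in HU; the function names and the optional `mask` argument (for restricting the error to a region such as the extended area outside the original FOV) are hypothetical.

```python
import numpy as np

REPORTED_REAL_RMSE_HU = 152.0  # real-data RMSE quoted in the paper


def rmse_hu(pred_hu, ref_hu, mask=None):
    """Root-mean-square error between two images given in HU.

    If `mask` is provided, only pixels where it is True are counted.
    """
    pred = np.asarray(pred_hu, dtype=np.float64)
    ref = np.asarray(ref_hu, dtype=np.float64)
    sq_err = (pred - ref) ** 2
    if mask is not None:
        sq_err = sq_err[np.asarray(mask, dtype=bool)]
    return float(np.sqrt(sq_err.mean()))


def exceeds_reported(per_slice_rmse):
    """True if the mean per-slice RMSE is above the reported value."""
    return float(np.mean(per_slice_rmse)) > REPORTED_REAL_RMSE_HU


# Toy check: a constant 10 HU offset gives an RMSE of 10 HU.
ref = np.zeros((4, 4))
pred = ref + 10.0
toy_rmse = rmse_hu(pred, ref)
```

Averaging per-slice RMSE is only one possible aggregation; pooling all pixels before taking the root gives a different number, so the choice should match whatever the paper used.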

Figures

Figures reproduced from arXiv: 2508.11211 by Haijun Yu, Hongbin Han, Jiazhou Wang, Long Yang, Song Ni, Weigang Hu, Xiaojie Yin, Yixing Huang, Zhenhao Li.

Figure 1: The key difference in the image generation process between a …
Figure 2: Results of two exemplary test slices in the noisy scenario. The first and third rows represent different slices in the test set, and the second and fourth …
Figure 3: Quantifying the uncertainty of reconstruction. (a) Ground truth, (b-d) sampling images with different random seed, (e) mean of the reconstruction, (f) …
Figure 4: Results of two exemplary test slices in the real data. The first and third rows represent different slices in the test set, and the second and fourth rows …
Figure 5: Results of two exemplary test slices in the noise-free scenario. The first and third rows represent different slices in the test set, and the second and …

Original abstract

Computed tomography (CT) is a cornerstone imaging modality for non-invasive, high-resolution visualization of internal anatomical structures. However, when the scanned object exceeds the scanner's field of view (FOV), projection data are truncated, resulting in incomplete reconstructions and pronounced artifacts near FOV boundaries. Conventional reconstruction algorithms struggle to recover accurate anatomy from such data, limiting clinical reliability. Deep learning approaches have been explored for FOV extension, with diffusion generative models representing the latest advances in image synthesis. Yet, conventional diffusion models are computationally demanding and slow at inference due to their iterative sampling process. To address these limitations, we propose an efficient CT FOV extension framework based on the image-to-image Schrödinger Bridge (I²SB) diffusion model. Unlike traditional diffusion models that synthesize images from pure Gaussian noise, I²SB learns a direct stochastic mapping between paired limited-FOV and extended-FOV images. This direct correspondence yields a more interpretable and traceable generative process, enhancing anatomical consistency and structural fidelity in reconstructions. I²SB achieves superior quantitative performance, with root-mean-square error (RMSE) values of 49.8 HU on simulated noisy data and 152.0 HU on real data, outperforming state-of-the-art diffusion models such as conditional denoising diffusion probabilistic models (cDDPM) and patch-based diffusion methods. Moreover, its one-step inference enables reconstruction in just 0.19 s per 2D slice, representing over a 700-fold speedup compared to cDDPM (135 s) and surpassing DiffusionGAN (0.58 s), the second fastest. This combination of accuracy and efficiency indicates that I²SB has potential for real-time or clinical deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces an image-to-image Schrödinger Bridge (I²SB) framework for CT field-of-view extension. It learns a direct stochastic mapping between paired limited-FOV and extended-FOV images rather than synthesizing from Gaussian noise, yielding reported RMSE values of 49.8 HU on simulated noisy data and 152.0 HU on real data, together with one-step inference at 0.19 s per 2D slice (over 700-fold speedup versus cDDPM).

Significance. If the quantitative gains and generalization hold, the work offers a practical route to real-time FOV extension in clinical CT, addressing truncation artifacts with both higher fidelity and orders-of-magnitude faster inference than iterative diffusion baselines. The direct-mapping formulation is a clear methodological strength that improves traceability over standard conditional diffusion models.

major comments (2)
  1. [§4] §4 (Experiments and Results): The reported RMSE of 152.0 HU on 'real data' and the clinical-deployment claim rest on evaluation that uses simulated truncations for both training and test sets; no cross-scanner, cross-anatomy, or unpaired real-patient validation is described, leaving the assumption that the learned mapping generalizes without introducing hallucinations untested and load-bearing for the central performance claim.
  2. [§3.2] §3.2 (I²SB formulation): While the one-step inference is presented as a direct stochastic mapping, the manuscript does not provide an ablation or theoretical argument showing that this mapping remains stable under distribution shift in scanner geometry or patient anatomy; the quantitative superiority therefore depends on an unverified generalization premise.
minor comments (2)
  1. [Abstract] Abstract and §1: The LaTeX rendering 'Schrödinger' appears correctly, but ensure consistent use of the umlaut throughout the text and figure captions.
  2. [Results] Table 1 or equivalent results table: Include standard deviations or statistical significance tests alongside the reported RMSE and timing values to strengthen the comparison with cDDPM and DiffusionGAN.
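The second minor comment can be addressed with per-slice statistics. As a minimal sketch, the code below computes a mean and sample standard deviation per method and a paired t-statistic between two methods evaluated on the same slices. The numbers are made-up placeholders chosen for easy hand-checking, not results from the paper, and the function names are hypothetical.

```python
import numpy as np


def summarize(per_slice):
    """Mean and sample standard deviation of per-slice values."""
    values = np.asarray(per_slice, dtype=np.float64)
    return float(values.mean()), float(values.std(ddof=1))


def paired_t_statistic(a, b):
    """t-statistic of the paired differences a - b (same slices, same order)."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(diff.mean() / (diff.std(ddof=1) / np.sqrt(diff.size)))


# Placeholder per-slice RMSE values in HU (illustrative only).
method_a = [50.0, 52.0, 48.0, 51.0]
method_b = [60.0, 61.0, 59.0, 63.0]

mean_a, std_a = summarize(method_a)
t_stat = paired_t_statistic(method_a, method_b)
print(f"method A: {mean_a:.2f} +/- {std_a:.2f} HU, paired t = {t_stat:.2f}")
```

A paired test suits this setting because every method is run on the same test slices. Converting the statistic to a p-value requires the t-distribution with n - 1 degrees of freedom, which is left out here to keep the sketch dependency-free beyond NumPy.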

Simulated Authors' Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below and have revised the manuscript to improve clarity, add supporting analysis, and acknowledge limitations where appropriate.

Point-by-point responses
  1. Referee: [§4] §4 (Experiments and Results): The reported RMSE of 152.0 HU on 'real data' and the clinical-deployment claim rest on evaluation that uses simulated truncations for both training and test sets; no cross-scanner, cross-anatomy, or unpaired real-patient validation is described, leaving the assumption that the learned mapping generalizes without introducing hallucinations untested and load-bearing for the central performance claim.

    Authors: We acknowledge that the 'real data' experiments apply simulated truncations to actual clinical CT volumes, as paired ground-truth extended-FOV images from truly truncated clinical acquisitions are unavailable. This is standard practice for the task. In the revised manuscript we have clarified this setup in §4 and added an explicit limitations paragraph discussing the generalization premise. We have not performed cross-scanner or unpaired real-patient validation because such diverse paired datasets are not accessible to us at present; we therefore treat the current real-data results as preliminary evidence rather than definitive proof of clinical readiness.
    Revision: partial

  2. Referee: [§3.2] §3.2 (I²SB formulation): While the one-step inference is presented as a direct stochastic mapping, the manuscript does not provide an ablation or theoretical argument showing that this mapping remains stable under distribution shift in scanner geometry or patient anatomy; the quantitative superiority therefore depends on an unverified generalization premise.

    Authors: We appreciate this point. In the revised version we have added a short theoretical paragraph in §3.2 noting that the Schrödinger Bridge learns an optimal transport map between the paired marginals, which is expected to be more robust to moderate shifts than iterative noise-to-image diffusion. We have also included a new ablation (now Table 3) that perturbs test-set geometry and anatomy parameters and reports the resulting RMSE degradation, showing graceful rather than catastrophic failure. These additions directly address the stability concern.
    Revision: yes

standing simulated objections not resolved
  • Extensive multi-scanner or unpaired real truncated-patient validation would require new data collection beyond the scope of the current study.

Circularity Check

0 steps flagged

No circularity in the derivation or performance claims

The paper applies the existing I²SB framework to learn a direct stochastic mapping from paired limited-FOV and extended-FOV CT images via standard supervised training. Reported RMSE values (49.8 HU simulated, 152.0 HU real) and inference times are presented as empirical results from experiments on simulated and real data, not as outputs of a mathematical derivation that reduces to the training assumptions or fitted parameters by construction. No equations, self-definitional steps, or load-bearing self-citations appear in the abstract or described method that would make the central claims equivalent to their inputs. The approach is self-contained as a data-driven application of a known generative model.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on supervised training of a neural network on paired truncated and full-FOV CT images; this introduces a large number of fitted parameters whose values are determined by the training data rather than derived from first principles.

free parameters (2)
  • neural network weights
    All parameters of the I²SB model are optimized on paired CT data; no count or specific values are given in the abstract.
  • training hyperparameters
    Learning rate, batch size, and noise schedule parameters are chosen to fit the observed CT distributions.
axioms (1)
  • domain assumption: Paired limited-FOV and extended-FOV images exist and are representative of clinical distributions
    The direct stochastic mapping in I²SB presupposes access to such aligned training pairs.
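Aligned pairs of this kind are typically produced by simulating truncation on complete data. As a rough sketch of that step, the function below zeroes the detector channels outside a centered window of a sinogram, so that the same scan yields both a full and a truncated projection set. The array layout (angles by detector channels), the `keep_fraction` parameter, and the function name are assumptions; the page does not specify the paper's actual simulation geometry, and the reconstruction step that turns each sinogram into an image is not shown.

```python
import numpy as np


def truncate_sinogram(sinogram, keep_fraction):
    """Zero out detector channels outside a centered window.

    sinogram:      2D array of shape (n_angles, n_detectors).
    keep_fraction: fraction of detector channels retained, in (0, 1].
    """
    sino = np.asarray(sinogram, dtype=np.float64)
    n_det = sino.shape[1]
    n_keep = int(round(n_det * keep_fraction))
    start = (n_det - n_keep) // 2
    truncated = np.zeros_like(sino)
    truncated[:, start:start + n_keep] = sino[:, start:start + n_keep]
    return truncated


# Toy example: 4 projection angles, 10 detector channels, keep the central 60%.
full_sino = np.ones((4, 10))
limited_sino = truncate_sinogram(full_sino, keep_fraction=0.6)
```

Reconstructing `full_sino` and `limited_sino` with the same algorithm would give one extended-FOV and one limited-FOV image of the same anatomy, which is the pairing the axiom presupposes.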

