arXiv: 2509.01629 · v3 · pith:CDVOLSSMnew · submitted 2025-09-01 · stat.ML · cs.LG · cs.NA · math.NA

Lipschitz-Guided Design of Interpolation Schedules in Generative Models

Pith reviewed 2026-05-21 21:48 UTC · model grok-4.3

classification stat.ML · cs.LG · cs.NA · math.NA
keywords interpolation schedules · generative models · Lipschitz constant · diffusion models · stochastic interpolants · flow matching · numerical stability · stochastic PDEs

The pith

Scalar interpolation schedules are statistically equivalent under path-space KL after diffusion tuning, so minimizing averaged squared Lipschitzness of the drift produces superior schedules usable without retraining.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that scalar interpolation schedules in stochastic interpolants are statistically equivalent under Kullback-Leibler divergence in path space once the diffusion coefficient is optimally tuned afterward. This equivalence shifts design attention from statistical criteria to numerical ones, specifically minimizing the averaged squared Lipschitzness of the drift field rather than kinetic energy. A transfer formula then expresses the drift of the new schedule in terms of an existing one, so the improved schedule can be applied at inference to a model trained under a different schedule such as linear, without any retraining. For Gaussian targets the resulting schedule yields exponential reductions in Lipschitz constant relative to linear interpolation, while for Gaussian mixtures it reduces mode collapse in few-step sampling. Validation on high-dimensional stochastic Allen-Cahn and Navier-Stokes invariant measures confirms markedly more accurate fine-scale statistics at fixed integrator budget.
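
To make the criterion concrete, here is a minimal sketch rather than the paper's implementation. It assumes an interpolant of the form I_t = α_t z + β_t x with z ~ N(0, 1) and a one-dimensional Gaussian target x ~ N(0, σ²), and it reads "averaged squared Lipschitzness" as the time integral of the squared Lipschitz constant of the drift. Under these assumptions the drift is linear, b_t(x) = g(t) x with g = (α α' + σ² β β') / (α² + σ² β²), so its Lipschitz constant is |g(t)|. The function names and the trigonometric comparison schedule are illustrative choices and are not taken from the paper.

```python
import numpy as np

def drift_slope(alpha, beta, dalpha, dbeta, sigma, t):
    """Slope g(t) of the linear drift b_t(x) = g(t) * x for a N(0, sigma^2) target."""
    a, b = alpha(t), beta(t)
    return (a * dalpha(t) + sigma**2 * b * dbeta(t)) / (a**2 + sigma**2 * b**2)

def avg_sq_lipschitz(alpha, beta, dalpha, dbeta, sigma, n=20001):
    """Trapezoidal estimate of the integral of g(t)^2 over [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    g2 = drift_slope(alpha, beta, dalpha, dbeta, sigma, t) ** 2
    dt = t[1] - t[0]
    return float(dt * (g2.sum() - 0.5 * (g2[0] + g2[-1])))

# Linear schedule (assumed form): alpha = 1 - t, beta = t.
linear = (lambda t: 1.0 - t, lambda t: t,
          lambda t: -np.ones_like(t), lambda t: np.ones_like(t))

# Trigonometric schedule, used only as a second example (not the paper's design).
h = np.pi / 2
trig = (lambda t: np.cos(h * t), lambda t: np.sin(h * t),
        lambda t: -h * np.sin(h * t), lambda t: h * np.cos(h * t))

for sigma in (1.0, 0.1, 0.01):
    print(sigma, avg_sq_lipschitz(*linear, sigma), avg_sq_lipschitz(*trig, sigma))
```

For σ = 1 the trigonometric schedule keeps the marginal variance constant, so its drift vanishes, while the linear schedule gives π/2 − 1. This only illustrates that the functional depends on the schedule; it does not reproduce the optimal schedule derived in the paper.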

Core claim

Within the stochastic interpolants framework, scalar interpolation schedules are statistically equivalent under the Kullback-Leibler divergence in path space after optimal a posteriori tuning of the diffusion coefficient. This equivalence motivates minimizing the averaged squared Lipschitzness of the drift as the design criterion. A simple transfer formula then allows the designed schedule to be used at inference time with a model trained under another schedule without retraining. For Gaussian targets the optimal schedule achieves exponential improvements in the Lipschitz constant over linear schedules; for Gaussian mixtures it mitigates mode collapse in few-step sampling. On high-dimensional …

What carries the argument

Minimizing the averaged squared Lipschitzness of the drift field, combined with the transfer formula that rewrites the drift of one schedule in terms of another to enable zero-retraining use.
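
A transfer of this kind can be sketched as follows. This is a hedged illustration, not the paper's formula verbatim: it assumes the interpolant I_t = α_t z + β_t x with Gaussian z, and a model trained on the linear schedule α_t = 1 − t, β_t = t. Under that assumption, α_s z + β_s x = c_s I_{τ(s)} with c_s = α_s + β_s and τ(s) = β_s / c_s, so trajectories satisfy X_s = c_s X^lin_{τ(s)}, and differentiating in s gives the new drift. The names `transfer_from_linear`, `b_lin`, and the trigonometric target schedule are hypothetical; the closed-form Gaussian drift stands in for a trained network.

```python
import numpy as np

def transfer_from_linear(b_lin, alpha, beta, dalpha, dbeta):
    """Build the drift for schedule (alpha, beta) from a drift b_lin(t, x)
    defined for the linear schedule I_t = (1 - t) z + t x (assumed setup)."""
    def b_new(s, x):
        a, b, da, db = alpha(s), beta(s), dalpha(s), dbeta(s)
        c = a + b                        # rescaling factor c_s
        dc = da + db                     # time derivative of c_s
        tau = b / c                      # matched time in the linear schedule
        dtau = (db * a - b * da) / c**2  # speed of the time change
        return (dc / c) * x + c * dtau * b_lin(tau, x / c)
    return b_new

# Stand-in for a trained model: exact linear-schedule drift for x ~ N(0, sigma^2).
sigma = 0.5
def b_lin(t, x):
    return (-(1.0 - t) + sigma**2 * t) / ((1.0 - t)**2 + sigma**2 * t**2) * x

h = np.pi / 2
b_trig = transfer_from_linear(
    b_lin,
    lambda s: np.cos(h * s), lambda s: np.sin(h * s),
    lambda s: -h * np.sin(h * s), lambda s: h * np.cos(h * s),
)

def b_trig_exact(s, x):
    """Closed-form drift of the trigonometric schedule for the same Gaussian target."""
    a, b = np.cos(h * s), np.sin(h * s)
    return (a * (-h * b) + sigma**2 * b * (h * a)) / (a**2 + sigma**2 * b**2) * x

s, x = 0.3, 1.7
print(b_trig(s, x), b_trig_exact(s, x))  # the two values should agree
```

In this Gaussian toy case the transferred drift matches the closed-form drift of the new schedule, which is the sense in which no retraining is needed when the base drift is exact.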

If this is right

  • For Gaussian targets the optimal schedule achieves exponential Lipschitz-constant improvements over linear interpolation.
  • For Gaussian-mixture targets the schedule mitigates mode collapse during few-step sampling.
  • On stochastic Allen-Cahn and Navier-Stokes invariant measures the schedule produces markedly more accurate fine-scale statistics at fixed integrator budget.
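
The phrase "fixed integrator budget" can be illustrated on a toy problem. The sketch below is an assumption-laden example, not one of the paper's experiments: it integrates the probability-flow ODE of the linear schedule for a one-dimensional N(0, σ²) target with a fixed number of classical RK4 steps, and compares the result to the exact flow map, which sends x₀ to σ x₀ in this setting. The value σ = 0.5 and all function names are illustrative.

```python
import numpy as np

def rk4_flow(drift, x0, n_steps):
    """Integrate dx/dt = drift(t, x) from t = 0 to t = 1 with n_steps RK4 steps."""
    x, dt = np.asarray(x0, dtype=float), 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        k1 = drift(t, x)
        k2 = drift(t + 0.5 * dt, x + 0.5 * dt * k1)
        k3 = drift(t + 0.5 * dt, x + 0.5 * dt * k2)
        k4 = drift(t + dt, x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

sigma = 0.5  # illustrative target standard deviation

def linear_schedule_drift(t, x):
    """Exact drift for I_t = (1 - t) z + t x with x ~ N(0, sigma^2)."""
    return (-(1.0 - t) + sigma**2 * t) / ((1.0 - t)**2 + sigma**2 * t**2) * x

def std_error(n_steps):
    """Error in the pushed-forward standard deviation, starting from x0 = 1."""
    return abs(float(rk4_flow(linear_schedule_drift, 1.0, n_steps)) - sigma)

for n in (5, 10, 20, 40):
    print(n, std_error(n))
```

Swapping in a different drift, for example one obtained from another schedule, and comparing errors at the same `n_steps` is the kind of fixed-budget comparison the bullet refers to.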

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The transfer formula could be combined with online adaptation of the schedule during sampling to further reduce integrator error.
  • The Lipschitz criterion might be replaced by higher-order smoothness measures for even stiffer drifts in future schedule design.
  • The same equivalence-plus-transfer approach could apply to non-scalar or learned schedules beyond the scalar case studied here.

Load-bearing premise

The statistical equivalence of scalar schedules under path-space KL divergence after optimal diffusion-coefficient tuning is assumed to hold, and the transfer formula is assumed to preserve the learned score or velocity field accurately enough that no retraining is required.
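
One way to probe this premise is to look at the factor that multiplies the base drift when a schedule is transferred. The sketch below is not a bound from the paper. It assumes the interpolant I_t = α_t z + β_t x and a base model trained on the linear schedule, for which a transferred drift takes the form (c'/c) x + c τ' b_lin(τ, x/c) with c = α + β and τ = β / c. A pointwise error in b_lin is then multiplied by c τ' = (β' α − β α') / (α + β). The function name and the trigonometric schedule are hypothetical choices for illustration.

```python
import numpy as np

def amplification_factor(alpha, beta, dalpha, dbeta, s):
    """Factor c_s * tau'(s) that multiplies the linear-schedule drift,
    and hence its pointwise error, after transfer (assumed setup)."""
    a, b, da, db = alpha(s), beta(s), dalpha(s), dbeta(s)
    return (db * a - b * da) / (a + b)

h = np.pi / 2
s = np.linspace(0.0, 1.0, 101)

# Trigonometric schedule, used only as an example target schedule.
factor = amplification_factor(
    lambda u: np.cos(h * u), lambda u: np.sin(h * u),
    lambda u: -h * np.sin(h * u), lambda u: h * np.cos(h * u), s)

# Sanity check: transferring the linear schedule to itself gives factor 1.
identity = amplification_factor(
    lambda u: 1.0 - u, lambda u: u,
    lambda u: -np.ones_like(u), lambda u: np.ones_like(u), s)

print(factor.min(), factor.max(), identity.min(), identity.max())
```

For this example the factor stays between π/(2√2) and π/2, so the rescaling is mild. Because the factor is linear in the schedule derivatives, a schedule with steep derivatives near t = 0 or t = 1 would produce large values there. The factor captures only the pointwise multiplier, not how errors accumulate along a trajectory.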

What would settle it

Two tests would settle it. First, computing the path-space KL divergence numerically for two scalar schedules, each with its own optimally tuned diffusion coefficient, and finding that the two divergences differ would falsify the equivalence. Second, applying the transferred Lipschitz-optimal schedule to a model trained with linear interpolation and observing either no gain or a degradation in fine-scale statistics on the stochastic Navier-Stokes example would falsify the practical benefit.

Figures

Figures reproduced from arXiv: 2509.01629 by Eric Vanden-Eijnden, Jiawei Xu, Yifan Chen.

Figure 1. Comparison of different interpolation schedules β_t. Left: M = 5. Right: M = 20. We set p = 0.3. For the dilated schedule, we take κ = 1. We plot different schedules in …
Figure 2. Figure 2 shows random fields generated using a standard linear schedule β_t = t compared to those using our designed schedule (3.11) optimized for avg-Lip², both employing 20 RK4 steps with N = 128. The designed schedule clearly produces superior samples. The right panel of …
Figure 3. Energy spectra of Gaussian fields: comparison between truth, generated via designed schedules or standard linear schedules, with 20, 40 or 80 RK4 steps. The three figures correspond to different resolutions. Left: 32 × 32; middle: 64 × 64; right: 128 × 128. 4.2. Gaussian mixtures. We consider the d-dimensional Gaussian mixture distribution in (3.13) with d = 1000, p = 0.3, and r = [1, 1, …, 1] ∈ R^d …
Figure 4. Energy spectra of invariant distributions of stochastic Allen-Cahn: comparison between truth, generated via designed schedules or standard linear schedules, with 10, 20 or 40 RK4 steps. The three figures correspond to different resolutions. Left: 32; middle: 64; right: 128. where v = ∇^⊥ψ = (−∂_y ψ, ∂_x ψ) is the velocity field from stream function ψ satisfying −Δψ = ω. We use parameters ν = 10^−3, α = 0.1, ε = …
Figure 5. Left: generated 128 × 128 sample using linear schedule and 10 steps of RK4; middle: generated 128 × 128 sample using designed schedule and 10 steps of RK4; enstrophy spectra of samples using different schedules
Original abstract

We study the design of interpolation schedules in flow and diffusion-based generative models from both statistical and numerical perspectives. Within the stochastic interpolants framework, we first show that scalar interpolation schedules are statistically equivalent under the Kullback-Leibler divergence in path space, after optimal a posteriori tuning of the diffusion coefficient. This equivalence motivates focusing on numerical properties of the drift field rather than purely statistical criteria. We propose minimizing the averaged squared Lipschitzness of the drift as a principled criterion for schedule design, in contrast with kinetic-energy minimization in optimal transport. A simple transfer formula expresses the drift of one schedule in terms of the drift of another, allowing the designed schedule to be used at inference time with a model trained under a different (e.g., linear) schedule, without retraining. We work out the optimal schedules analytically for Gaussian and Gaussian-mixture targets: for Gaussians, we obtain exponential improvements in the Lipschitz constant over linear schedules; for Gaussian mixtures, we obtain schedules that mitigate mode collapse in few-step sampling. We then validate the approach on high-dimensional invariant measures of stochastic Allen-Cahn and Navier-Stokes equations, where the designed schedule yields markedly more accurate fine-scale statistics at fixed integrator budget.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies interpolation schedule design in flow and diffusion generative models via the stochastic interpolants framework. It proves that scalar schedules are statistically equivalent under path-space KL divergence after optimal a posteriori diffusion-coefficient tuning, motivating a numerical criterion of minimizing averaged squared Lipschitzness of the drift (as opposed to kinetic energy). A transfer formula allows applying a designed schedule at inference using a model trained on a different schedule (e.g., linear) without retraining. Optimal schedules are derived analytically for Gaussian and Gaussian-mixture targets, yielding exponential Lipschitz improvements for Gaussians and reduced mode collapse for mixtures. The approach is validated on high-dimensional invariant measures of stochastic Allen-Cahn and Navier-Stokes equations, where the designed schedule improves fine-scale statistics at fixed integrator budget.

Significance. If the central claims hold, the work supplies a principled, numerically motivated alternative to purely statistical schedule design, with clean analytical results for Gaussians and mixtures plus empirical gains on stochastic PDE targets. The equivalence result and transfer formula are strengths that could reduce the need for schedule-specific retraining. The manuscript ships explicit derivations and reproducible validation on SPDEs, which strengthens its contribution if error propagation under the transfer formula is controlled.

major comments (2)
  1. [transfer formula paragraph] Transfer formula section: the claim that the designed schedule can be used at inference without retraining rests on the assumption that the transfer formula preserves the learned (approximate) drift accurately enough; however, no quantitative bound or sensitivity analysis is given on how residual training errors are amplified by schedule-dependent factors (derivatives or rescalings near t=0 or t=1). This is load-bearing for the headline SPDE results, which rely on applying the transferred drift.
  2. [validation section] Validation on Allen-Cahn and Navier-Stokes (abstract and corresponding results section): the reported improvements in fine-scale statistics are described qualitatively, but the manuscript supplies no error bars on the statistics, no ablation on the diffusion-coefficient tuning step, and no quantitative comparisons against strong baselines (e.g., other Lipschitz-based or OT-based schedules). This leaves the practical significance of the numerical gains only partially supported.
minor comments (2)
  1. [method section] Notation for the averaged squared Lipschitzness functional is introduced without an explicit equation number in the main text; adding a numbered display equation would improve readability when the criterion is later optimized.
  2. [abstract] The abstract states 'exponential improvements in the Lipschitz constant' for Gaussians; the corresponding theorem or proposition deriving the exact rate should be referenced in the abstract for precision.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The comments identify important points that will strengthen the manuscript. We address each major comment below and indicate the revisions we will incorporate.

  1. Referee: Transfer formula section: the claim that the designed schedule can be used at inference without retraining rests on the assumption that the transfer formula preserves the learned (approximate) drift accurately enough; however, no quantitative bound or sensitivity analysis is given on how residual training errors are amplified by schedule-dependent factors (derivatives or rescalings near t=0 or t=1). This is load-bearing for the headline SPDE results, which rely on applying the transferred drift.

    Authors: We agree that the absence of a quantitative sensitivity analysis is a limitation. The transfer formula is exact for the true drift, but error amplification through schedule-dependent rescalings is a valid concern for learned models. A general theoretical bound is difficult to derive without restrictive assumptions on the training residual. In the revision we will add a new subsection that performs a numerical sensitivity study on the analytically solvable Gaussian case, measuring how controlled perturbations to the drift are propagated under the transfer formula. For the SPDE experiments we will include additional diagnostics on the stability of the transferred drift across integrator steps. These changes will be placed in the validation section.
    revision: partial

  2. Referee: Validation on Allen-Cahn and Navier-Stokes (abstract and corresponding results section): the reported improvements in fine-scale statistics are described qualitatively, but the manuscript supplies no error bars on the statistics, no ablation on the diffusion-coefficient tuning step, and no quantitative comparisons against strong baselines (e.g., other Lipschitz-based or OT-based schedules). This leaves the practical significance of the numerical gains only partially supported.

    Authors: We concur that stronger quantitative evidence is needed. In the revised manuscript we will add error bars computed from multiple independent runs for all reported fine-scale statistics. We will also insert an ablation that isolates the effect of the optimal diffusion-coefficient tuning step. Finally, we will provide direct quantitative comparisons of the designed schedule against the linear baseline and against at least one additional schedule (a kinetic-energy optimal-transport schedule). The abstract will be updated to summarize the quantitative improvements. These additions will appear in the results section.
    revision: yes

Circularity Check

0 steps flagged

No circularity: equivalence and Lipschitz criterion derived independently from stochastic interpolants framework

The paper derives statistical equivalence of scalar schedules under path-space KL after a posteriori diffusion-coefficient tuning directly from the stochastic interpolants setup, without fitting to target data or reducing to prior fitted quantities. The averaged squared Lipschitzness objective is introduced as a new numerical criterion contrasting with kinetic-energy minimization, and the transfer formula is presented as an explicit algebraic relation allowing schedule change at inference. Optimal schedules for Gaussians and mixtures are obtained analytically from the resulting optimization problem, while SPDE validation uses the transferred drift on learned models. No step reduces by construction to its inputs, no load-bearing self-citation chain is invoked for the central claims, and the derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claims rest on the stochastic interpolants framework and the existence of an optimal a posteriori diffusion coefficient; no new free parameters beyond that coefficient are introduced in the abstract, and no invented entities are postulated.

free parameters (1)
  • diffusion coefficient
    Tuned a posteriori to achieve KL equivalence between schedules; appears once in the equivalence statement.
axioms (1)
  • domain assumption: Scalar interpolation schedules remain equivalent under path-space KL after diffusion-coefficient tuning
    Invoked to justify shifting focus from statistical to numerical criteria (abstract opening paragraph).


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Variational Optimality of Föllmer Processes in Generative Diffusions

    math.ST 2026-02 unverdicted novelty 8.0

    Föllmer processes are variationally optimal among generative diffusions because they minimize the impact of drift estimation error on path-space KL divergence, rendering different interpolation schedules statistically...

  2. Geometry-Aware Discretization Error of Diffusion Models

    cs.LG 2026-05 unverdicted novelty 7.0

    First-order asymptotic expansions of weak and Fréchet discretization errors in diffusion sampling are derived, explicit under Gaussian data through covariance geometry and robust to other data geometries.

  3. On The Hidden Biases of Flow Matching Samplers

    stat.ML 2025-12 unverdicted novelty 7.0

    Empirical flow matching introduces coupled biases from plug-in estimation, including altered statistical targets, non-gradient minimizers, and non-unique dynamics via flux-null fields, with base distribution controlli...
