Lipschitz-Guided Design of Interpolation Schedules in Generative Models

Eric Vanden-Eijnden; Jiawei Xu; Yifan Chen

arxiv: 2509.01629 · v3 · pith:CDVOLSSMnew · submitted 2025-09-01 · 📊 stat.ML · cs.LG · cs.NA · math.NA

Pith reviewed 2026-05-21 21:48 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · cs.NA · math.NA

keywords interpolation schedules · generative models · Lipschitz constant · diffusion models · stochastic interpolants · flow matching · numerical stability · stochastic PDEs

The pith

Scalar interpolation schedules are statistically equivalent under path-space KL after diffusion tuning, so minimizing averaged squared Lipschitzness of the drift produces superior schedules usable without retraining.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that scalar interpolation schedules in stochastic interpolants are statistically equivalent under Kullback-Leibler divergence in path space once the diffusion coefficient is optimally tuned afterward. This equivalence shifts design attention from statistical criteria to numerical ones, specifically minimizing the averaged squared Lipschitzness of the drift field rather than kinetic energy. A transfer formula then expresses the drift of the new schedule in terms of an existing one, so the improved schedule can be applied at inference to a model trained under a different schedule such as linear, without any retraining. For Gaussian targets the resulting schedule yields exponential reductions in Lipschitz constant relative to linear interpolation, while for Gaussian mixtures it reduces mode collapse in few-step sampling. Validation on high-dimensional stochastic Allen-Cahn and Navier-Stokes invariant measures confirms markedly more accurate fine-scale statistics at fixed integrator budget.

Core claim

Within the stochastic interpolants framework, scalar interpolation schedules are statistically equivalent under the Kullback-Leibler divergence in path space after optimal a posteriori tuning of the diffusion coefficient. This equivalence motivates minimizing the averaged squared Lipschitzness of the drift as the design criterion. A simple transfer formula then allows the designed schedule to be used at inference time with a model trained under another schedule without retraining. For Gaussian targets the optimal schedule achieves exponential improvements in the Lipschitz constant over linear schedules; for Gaussian mixtures it mitigates mode collapse in few-step sampling. On high-dimensional invariant measures of stochastic Allen-Cahn and Navier-Stokes equations, the designed schedule yields markedly more accurate fine-scale statistics at fixed integrator budget.

What carries the argument

Minimizing the averaged squared Lipschitzness of the drift field, combined with the transfer formula that rewrites the drift of one schedule in terms of another to enable zero-retraining use.
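
As a minimal sketch of how a transfer of this kind can work, the Python snippet below checks a time-change-and-rescaling identity on a one-dimensional Gaussian toy problem. It assumes a standard Gaussian base z ~ N(0, 1), a target x₁ ~ N(0, σ²), interpolants of the form I_t = α_t z + β_t x₁, and a model trained on the linear schedule α_t = 1 − t, β_t = t. Under those assumptions a new schedule (α̃_s, β̃_s) satisfies Ĩ_s = c_s I_τ(s) with c_s = α̃_s + β̃_s and τ(s) = β̃_s / c_s, which gives b̃_s(x) = (c'_s / c_s) x + c_s τ'(s) b_τ(s)(x / c_s). The trigonometric target schedule, the value of σ², and all function names are illustrative choices; the paper's own formula may be stated in different notation.

```python
import numpy as np

SIGMA2 = 4.0  # illustrative target variance (assumption, not from the paper)


def drift_linear(t, x, sigma2=SIGMA2):
    """Exact drift for I_t = (1 - t) z + t x1 with z ~ N(0, 1), x1 ~ N(0, sigma2)."""
    return (-(1.0 - t) + t * sigma2) / ((1.0 - t) ** 2 + t ** 2 * sigma2) * x


def new_schedule(s):
    """Hypothetical target schedule: alpha = cos(pi s / 2), beta = sin(pi s / 2)."""
    a, b = np.cos(0.5 * np.pi * s), np.sin(0.5 * np.pi * s)
    da, db = -0.5 * np.pi * b, 0.5 * np.pi * a
    return a, b, da, db


def drift_transferred(s, x, source_drift=drift_linear):
    """Drift of the new schedule, built only from the linear-schedule drift."""
    a, b, da, db = new_schedule(s)
    c = a + b                          # rescaling factor c_s
    tau = b / c                        # matching time on the linear schedule
    dc = da + db                       # derivative of c_s
    dtau = (db * a - b * da) / c ** 2  # derivative of tau(s)
    return (dc / c) * x + c * dtau * source_drift(tau, x / c)


def drift_direct(s, x, sigma2=SIGMA2):
    """Closed-form drift of the new schedule for the same Gaussian target."""
    a, b, da, db = new_schedule(s)
    return (a * da + b * db * sigma2) / (a ** 2 + b ** 2 * sigma2) * x


s_grid = np.linspace(0.05, 0.95, 19)
x_test = 1.7
max_gap = np.max(np.abs(drift_transferred(s_grid, x_test) - drift_direct(s_grid, x_test)))
print(f"max |transferred - direct| = {max_gap:.2e}")
```

In this toy case the two drifts agree to rounding error because `drift_linear` is exact; with a learned drift in place of `drift_linear`, the agreement would only be as good as the model.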

If this is right

For Gaussian targets the optimal schedule achieves exponential Lipschitz-constant improvements over linear interpolation.
For Gaussian-mixture targets the schedule mitigates mode collapse during few-step sampling.
On stochastic Allen-Cahn and Navier-Stokes invariant measures the schedule produces markedly more accurate fine-scale statistics at fixed integrator budget.
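
To make the Lipschitz criterion concrete, the following minimal sketch evaluates the averaged squared Lipschitzness of the drift for the linear schedule on a one-dimensional Gaussian target. It assumes a standard Gaussian base z ~ N(0, 1), a target x₁ ~ N(0, σ²), and the interpolant I_t = (1 − t) z + t x₁, for which the drift is b_t(x) = g(t) x and the criterion reduces to the integral of g(t)² over [0, 1]. The names `slope_linear` and `avg_lip2` and the chosen variances are illustrative, and the paper's designed schedule (3.11) is not reproduced here.

```python
import numpy as np


def slope_linear(t, sigma2):
    """Slope g(t) of the exact drift b_t(x) = g(t) * x for the linear schedule."""
    return (-(1.0 - t) + t * sigma2) / ((1.0 - t) ** 2 + t ** 2 * sigma2)


def avg_lip2(sigma2, n=200_001):
    """Trapezoidal estimate of the integral of g(t)^2 over [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    vals = slope_linear(t, sigma2) ** 2
    h = t[1] - t[0]
    return h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))


# Illustrative target variances (assumptions, not values from the paper).
results = {sigma2: avg_lip2(sigma2) for sigma2 in (1.0, 1e-2, 1e-4)}
for sigma2, value in results.items():
    print(f"sigma^2 = {sigma2:g}: avg-Lip^2 ~ {value:.3f}")
```

For σ² = 1 the integral equals π/2 − 1 in closed form, which gives a quick correctness check. In this toy setting the value grows as the target variance shrinks, which is the kind of stiffness in small-variance directions that a schedule designed for this criterion is meant to reduce.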

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The transfer formula could be combined with online adaptation of the schedule during sampling to further reduce integrator error.
The Lipschitz criterion might be replaced by higher-order smoothness measures for even stiffer drifts in future schedule design.
The same equivalence-plus-transfer approach could apply to non-scalar or learned schedules beyond the scalar case studied here.

Load-bearing premise

The statistical equivalence of scalar schedules under path-space KL divergence after optimal diffusion-coefficient tuning is assumed to hold, and the transfer formula is assumed to preserve the learned score or velocity field accurately enough that no retraining is required.

What would settle it

Direct numerical computation of the path-space KL divergence between two scalar schedules after their respective optimal diffusion-coefficient tunings, showing the divergences are not equal, would falsify the equivalence. Alternatively, applying the transferred Lipschitz-optimal schedule to a model trained on linear interpolation and observing no gain, or a degradation, in fine-scale statistics on the stochastic Navier-Stokes example would falsify the practical benefit.

Figures

Figures reproduced from arXiv: 2509.01629 by Eric Vanden-Eijnden, Jiawei Xu, Yifan Chen.

**Figure 1.** Comparison of different interpolation schedules β_t. Left: M = 5. Right: M = 20. We set p = 0.3. For the dilated schedule, we take κ = 1. We plot different schedules in …

**Figure 2.** Random fields generated using a standard linear schedule β_t = t compared to those using our designed schedule (3.11) optimized for avg-Lip², both employing 20 RK4 steps with N = 128. The designed schedule clearly produces superior samples. The right panel of …

**Figure 3.** Energy spectra of Gaussian fields: comparison between truth, generated via designed schedules or standard linear schedules, with 20, 40 or 80 RK4 steps. The three figures correspond to different resolutions. Left: 32 × 32; middle: 64 × 64; right: 128 × 128.

**Figure 4.** Energy spectra of invariant distributions of stochastic Allen-Cahn: comparison between truth, generated via designed schedules or standard linear schedules, with 10, 20 or 40 RK4 steps. The three figures correspond to different resolutions. Left: 32; middle: 64; right: 128.

**Figure 5.** Left: generated 128 × 128 sample using linear schedule and 10 steps of RK4; middle: generated 128 × 128 sample using designed schedule and 10 steps of RK4; right: enstrophy spectra of samples using different schedules.

read the original abstract

We study the design of interpolation schedules in flow and diffusion-based generative models from both statistical and numerical perspectives. Within the stochastic interpolants framework, we first show that scalar interpolation schedules are statistically equivalent under the Kullback-Leibler divergence in path space, after optimal a posteriori tuning of the diffusion coefficient. This equivalence motivates focusing on numerical properties of the drift field rather than purely statistical criteria. We propose minimizing the averaged squared Lipschitzness of the drift as a principled criterion for schedule design, in contrast with kinetic-energy minimization in optimal transport. A simple transfer formula expresses the drift of one schedule in terms of the drift of another, allowing the designed schedule to be used at inference time with a model trained under a different (e.g., linear) schedule, without retraining. We work out the optimal schedules analytically for Gaussian and Gaussian-mixture targets: for Gaussians, we obtain exponential improvements in the Lipschitz constant over linear schedules; for Gaussian mixtures, we obtain schedules that mitigate mode collapse in few-step sampling. We then validate the approach on high-dimensional invariant measures of stochastic Allen-Cahn and Navier-Stokes equations, where the designed schedule yields markedly more accurate fine-scale statistics at fixed integrator budget.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

After optimal diffusion tuning, scalar schedules are statistically equivalent under path-space KL, so the paper shifts to minimizing the averaged squared Lipschitzness of the drift and gives a transfer formula that avoids retraining.

The main things to take away are that different scalar interpolation schedules end up statistically equivalent under path-space KL divergence once the diffusion coefficient is tuned optimally after the fact, and that this equivalence frees you to optimize the drift for better numerical properties instead. The paper pushes minimizing the averaged squared Lipschitzness of the drift as the criterion, and supplies a transfer formula that converts the drift from one schedule to another so the new schedule can be used at inference on a model trained with a different schedule.

What stands out as new is the Lipschitz objective itself, which differs from the usual kinetic energy or variance-exploding approaches, plus the explicit transfer formula and the closed-form optima for Gaussian targets. For Gaussians the optimal schedule delivers an exponential reduction in the Lipschitz constant relative to the linear schedule. The mixture case targets mode collapse in few-step generation. Those calculations appear rigorous within the stochastic interpolants setup.

The work does well on the analytical side and on the motivating examples. The stochastic Allen-Cahn and Navier-Stokes tests indicate that the designed schedule improves fine-scale statistics without increasing the integrator budget, which is the practical payoff for high-dimensional scientific sampling.

The soft spots sit in the empirical section and the robustness of the transfer step. The reported improvements are described qualitatively with no error bars, no ablation studies on the tuning parameter, and limited comparison to established baselines. On the transfer formula, because it involves schedule-dependent rescalings that can grow large near the endpoints, any approximation error in the learned drift could get magnified. The paper shows the gains for the SPDE cases but does not provide bounds or sensitivity analysis on how training error propagates, so it is not yet clear whether the benefits hold up for typical neural approximations rather than exact drifts.

This is the kind of paper that would interest people developing generative models for physics simulations and PDE sampling who already work within the stochastic interpolants or diffusion framework. A reader looking for a new handle on schedule design that avoids retraining will find usable ideas here. It should go to peer review because the statistical equivalence result and the Gaussian derivations are on solid ground and the application is timely, though the numerical claims would benefit from more quantitative support and error analysis.
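
The endpoint concern can be illustrated with a small Python sketch. It assumes a model trained on the linear schedule α_t = 1 − t, β_t = t with a Gaussian base, and a transferred drift of the form b̃_s(x) = (c'_s / c_s) x + c_s τ'(s) b_τ(s)(x / c_s), where c_s = α̃_s + β̃_s and τ(s) = β̃_s / c_s. Under that assumed form, a pointwise error in the learned drift is multiplied by c_s τ'(s) = (β̃'_s α̃_s − β̃_s α̃'_s) / c_s. Both target schedules below are hypothetical examples chosen for illustration, not schedules from the paper.

```python
import numpy as np


def amplification(alpha, beta, dalpha, dbeta, s):
    """Factor c_s * tau'(s) that multiplies the source drift (and its error)."""
    a, b = alpha(s), beta(s)
    return (dbeta(s) * a - b * dalpha(s)) / (a + b)


# Hypothetical target schedule 1: trigonometric, smooth at both endpoints.
def amp_trig(s):
    return amplification(
        lambda u: np.cos(0.5 * np.pi * u),
        lambda u: np.sin(0.5 * np.pi * u),
        lambda u: -0.5 * np.pi * np.sin(0.5 * np.pi * u),
        lambda u: 0.5 * np.pi * np.cos(0.5 * np.pi * u),
        s,
    )


# Hypothetical target schedule 2: square-root, with unbounded derivatives
# at s = 0 and s = 1.
def amp_sqrt(s):
    return amplification(
        lambda u: np.sqrt(1.0 - u),
        lambda u: np.sqrt(u),
        lambda u: -0.5 / np.sqrt(1.0 - u),
        lambda u: 0.5 / np.sqrt(u),
        s,
    )


s_grid = np.linspace(0.001, 0.999, 999)
print(f"trig schedule: max factor = {amp_trig(s_grid).max():.3f}")
print(f"sqrt schedule: max factor = {amp_sqrt(s_grid).max():.3f}")
```

In this sketch the trigonometric schedule keeps the factor bounded by π/2, while the square-root schedule lets it grow without bound as s approaches 0 or 1. Whether the schedules used in the paper behave like the first or the second case is exactly what a sensitivity analysis would need to establish.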

Referee Report

2 major / 2 minor

Summary. The paper studies interpolation schedule design in flow and diffusion generative models via the stochastic interpolants framework. It proves that scalar schedules are statistically equivalent under path-space KL divergence after optimal a posteriori diffusion-coefficient tuning, motivating a numerical criterion of minimizing averaged squared Lipschitzness of the drift (as opposed to kinetic energy). A transfer formula allows applying a designed schedule at inference using a model trained on a different schedule (e.g., linear) without retraining. Optimal schedules are derived analytically for Gaussian and Gaussian-mixture targets, yielding exponential Lipschitz improvements for Gaussians and reduced mode collapse for mixtures. The approach is validated on high-dimensional invariant measures of stochastic Allen-Cahn and Navier-Stokes equations, where the designed schedule improves fine-scale statistics at fixed integrator budget.

Significance. If the central claims hold, the work supplies a principled, numerically motivated alternative to purely statistical schedule design, with clean analytical results for Gaussians and mixtures plus empirical gains on stochastic PDE targets. The equivalence result and transfer formula are strengths that could reduce the need for schedule-specific retraining. The manuscript ships explicit derivations and reproducible validation on SPDEs, which strengthens its contribution if error propagation under the transfer formula is controlled.

major comments (2)

[transfer formula paragraph] Transfer formula section: the claim that the designed schedule can be used at inference without retraining rests on the assumption that the transfer formula preserves the learned (approximate) drift accurately enough; however, no quantitative bound or sensitivity analysis is given on how residual training errors are amplified by schedule-dependent factors (derivatives or rescalings near t=0 or t=1). This is load-bearing for the headline SPDE results, which rely on applying the transferred drift.
[validation section] Validation on Allen-Cahn and Navier-Stokes (abstract and corresponding results section): the reported improvements in fine-scale statistics are described qualitatively, but the manuscript supplies neither error bars on the statistics, an ablation on the diffusion-coefficient tuning step, nor quantitative comparisons against strong baselines (e.g., other Lipschitz or OT-based schedules). This leaves the practical significance of the numerical gains only partially supported.

minor comments (2)

[method section] Notation for the averaged squared Lipschitzness functional is introduced without an explicit equation number in the main text; adding a numbered display equation would improve readability when the criterion is later optimized.
[abstract] The abstract states 'exponential improvements in the Lipschitz constant' for Gaussians; the corresponding theorem or proposition deriving the exact rate should be referenced in the abstract for precision.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The comments identify important points that will strengthen the manuscript. We address each major comment below and indicate the revisions we will incorporate.

Referee: Transfer formula section: the claim that the designed schedule can be used at inference without retraining rests on the assumption that the transfer formula preserves the learned (approximate) drift accurately enough; however, no quantitative bound or sensitivity analysis is given on how residual training errors are amplified by schedule-dependent factors (derivatives or rescalings near t=0 or t=1). This is load-bearing for the headline SPDE results, which rely on applying the transferred drift.

Authors: We agree that the absence of a quantitative sensitivity analysis is a limitation. The transfer formula is exact for the true drift, but error amplification through schedule-dependent rescalings is a valid concern for learned models. A general theoretical bound is difficult to derive without restrictive assumptions on the training residual. In the revision we will add a new subsection that performs a numerical sensitivity study on the analytically solvable Gaussian case, measuring how controlled perturbations to the drift are propagated under the transfer formula. For the SPDE experiments we will include additional diagnostics on the stability of the transferred drift across integrator steps. These changes will be placed in the validation section. revision: partial
Referee: Validation on Allen-Cahn and Navier-Stokes (abstract and corresponding results section): the reported improvements in fine-scale statistics are described qualitatively, but the manuscript supplies neither error bars on the statistics, an ablation on the diffusion-coefficient tuning step, nor quantitative comparisons against strong baselines (e.g., other Lipschitz or OT-based schedules). This leaves the practical significance of the numerical gains only partially supported.

Authors: We concur that stronger quantitative evidence is needed. In the revised manuscript we will add error bars computed from multiple independent runs for all reported fine-scale statistics. We will also insert an ablation that isolates the effect of the optimal diffusion-coefficient tuning step. Finally, we will provide direct quantitative comparisons of the designed schedule against the linear baseline and against at least one additional schedule (a kinetic-energy optimal-transport schedule). The abstract will be updated to summarize the quantitative improvements. These additions will appear in the results section. revision: yes

Circularity Check

0 steps flagged

No circularity: equivalence and Lipschitz criterion derived independently from stochastic interpolants framework

full rationale

The paper derives statistical equivalence of scalar schedules under path-space KL after a posteriori diffusion-coefficient tuning directly from the stochastic interpolants setup, without fitting to target data or reducing to prior fitted quantities. The averaged squared Lipschitzness objective is introduced as a new numerical criterion contrasting with kinetic-energy minimization, and the transfer formula is presented as an explicit algebraic relation allowing schedule change at inference. Optimal schedules for Gaussians and mixtures are obtained analytically from the resulting optimization problem, while SPDE validation uses the transferred drift on learned models. No step reduces by construction to its inputs, no load-bearing self-citation chain is invoked for the central claims, and the derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on the stochastic interpolants framework and the existence of an optimal a posteriori diffusion coefficient; no new free parameters beyond that coefficient are introduced in the abstract, and no invented entities are postulated.

free parameters (1)

diffusion coefficient
Tuned a posteriori to achieve KL equivalence between schedules; appears once in the equivalence statement.

axioms (1)

domain assumption: Scalar interpolation schedules remain equivalent under path-space KL after diffusion-coefficient tuning
Invoked to justify shifting focus from statistical to numerical criteria (abstract opening paragraph).

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Proposition 2.5 ... KL⋆(α, β) remains constant regardless of the interpolation schedules ... all linear scalar interpolants ... are statistically indistinguishable.

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Definition 3.2. The averaged squared Lipschitzness (avg-Lip²) ... A² = ∫ E[∥∇b_t(I_t)∥₂²] dt

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Variational Optimality of Föllmer Processes in Generative Diffusions
math.ST · 2026-02 · unverdicted · novelty 8.0
Föllmer processes are variationally optimal among generative diffusions because they minimize the impact of drift estimation error on path-space KL divergence, rendering different interpolation schedules statistically...

Geometry-Aware Discretization Error of Diffusion Models
cs.LG · 2026-05 · unverdicted · novelty 7.0
First-order asymptotic expansions of weak and Fréchet discretization errors in diffusion sampling are derived, explicit under Gaussian data through covariance geometry and robust to other data geometries.

On The Hidden Biases of Flow Matching Samplers
stat.ML · 2025-12 · unverdicted · novelty 7.0
Empirical flow matching introduces coupled biases from plug-in estimation, including altered statistical targets, non-gradient minimizers, and non-unique dynamics via flux-null fields, with base distribution controlli...

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · cited by 3 Pith papers · 4 internal anchors

[1]

Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

Michael S Albergo, Nicholas M Boffi, and Eric Vanden-Eijnden. Stochastic interpolants: A unifying framework for flows and diffusions. arXiv preprint arXiv:2303.08797 , 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[2]

Building normalizing flows with stochastic inter- polants

Michael S Albergo and Eric Vanden-Eijnden. Building normalizing flows with stochastic inter- polants. In The Eleventh International Conference on Learning Representations , 2022

work page 2022
[3]

Optimizing noise sched- ules of generative models in high dimensionss

Santiago Aranguri, Giulio Biroli, Marc Mezard, and Eric Vanden-Eijnden. Optimizing noise sched- ules of generative models in high dimensionss. arXiv preprint arXiv:2501.00988 , 2025

work page arXiv 2025
[4]

Flow map matching with stochastic interpolants: A mathematical framework for consistency models

Nicholas Matthew Boffi, Michael Samuel Albergo, and Eric Vanden-Eijnden. Flow map matching with stochastic interpolants: A mathematical framework for consistency models. Transactions on Machine Learning Research, 2025

work page 2025
[5]

On the trajectory regularity of ode-based diffusion sampling

Defang Chen, Zhenyu Zhou, Can Wang, Chunhua Shen, and Siwei Lyu. On the trajectory regularity of ode-based diffusion sampling. InForty-first International Conference on Machine Learning, 2024. 18 LIPSCHITZ-GUIDED DESIGN OF GENERATIVE MODELS

work page 2024
[6]

Accelerating diffusion models with parallel sampling: Inference at sub-linear time complexity

Haoxuan Chen, Yinuo Ren, Lexing Ying, and Grant Rotskoff. Accelerating diffusion models with parallel sampling: Inference at sub-linear time complexity. Advances in Neural Information Pro- cessing Systems, 37:133661–133709, 2024

work page 2024
[7]

New affine invariant ensemble samplers and their dimensional scaling

Yifan Chen. New affine invariant ensemble samplers and their dimensional scaling. arXiv preprint arXiv:2505.02987, 2025

work page arXiv 2025
[8]

Probabilistic forecasting with stochastic interpolants and F¨ ollmer processes

Yifan Chen, Mark Goldstein, Mengjian Hua, Michael S Albergo, Nicholas M Boffi, and Eric Vanden- Eijnden. Probabilistic forecasting with stochastic interpolants and F¨ ollmer processes. InProceedings of the 41st International Conference on Machine Learning , pages 6728–6756, 2024

work page 2024
[9]

On the contractivity of stochastic interpolation flow.arXiv preprint arXiv:2504.10653, 2025

Max Daniels. On the contractivity of stochastic interpolation flow.arXiv preprint arXiv:2504.10653, 2025

work page arXiv 2025
[10]

Accelerated diffu- sion models via speculative sampling

Valentin De Bortoli, Alexandre Galashov, Arthur Gretton, and Arnaud Doucet. Accelerated diffu- sion models via speculative sampling. arXiv preprint arXiv:2501.05370 , 2025

work page arXiv 2025
[11]

Diffusion schr¨ odinger bridge with applications to score-based generative modeling

Valentin De Bortoli, James Thornton, Jeremy Heng, and Arnaud Doucet. Diffusion schr¨ odinger bridge with applications to score-based generative modeling. Advances in neural information pro- cessing systems, 34:17695–17709, 2021

work page 2021
[12]

Diffusion models beat gans on image synthesis

Prafulla Dhariwal and Alexander Nichol. Diffusion models beat gans on image synthesis. Advances in neural information processing systems , 34:8780–8794, 2021

work page 2021
[13]

Genie: Higher-order denoising diffusion solvers

Tim Dockhorn, Arash Vahdat, and Karsten Kreis. Genie: Higher-order denoising diffusion solvers. Advances in Neural Information Processing Systems , 35:30150–30166, 2022

work page 2022
[14]

One Step Diffusion via Shortcut Models

Kevin Frans, Danijar Hafner, Sergey Levine, and Pieter Abbeel. One step diffusion via shortcut models. arXiv preprint arXiv:2410.12557 , 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[15]

Gaussian interpolation flows.Journal of Machine Learning Research, 25(253):1–52, 2024

Yuan Gao, Jian Huang, and Yuling Jiao. Gaussian interpolation flows.Journal of Machine Learning Research, 25(253):1–52, 2024

work page 2024
[16]

Wavelet score-based generative modeling

Florentin Guth, Simon Coste, Valentin De Bortoli, and Stephane Mallat. Wavelet score-based generative modeling. Advances in neural information processing systems , 35:478–491, 2022

work page 2022
[17]

Mimicking the one-dimensional marginal distributions of processes having an itˆ o differential

Istv´ an Gy¨ ongy. Mimicking the one-dimensional marginal distributions of processes having an itˆ o differential. Probability theory and related fields, 71(4):501–516, 1986

work page 1986
[18]

Ergodicity of the 2d navier-stokes equations with degen- erate stochastic forcing

Martin Hairer and Jonathan C Mattingly. Ergodicity of the 2d navier-stokes equations with degen- erate stochastic forcing. Annals of Mathematics , pages 993–1032, 2006

work page 2006
[19]

Denoising diffusion probabilistic models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in neural information processing systems , volume 33, pages 6840–6851, 2020

work page 2020
[20]

Cascaded diffusion models for high fidelity image generation

Jonathan Ho, Chitwan Saharia, William Chan, David J Fleet, Mohammad Norouzi, and Tim Sal- imans. Cascaded diffusion models for high fidelity image generation. Journal of Machine Learning Research, 23(47):1–33, 2022

work page 2022
[21]

Subspace diffusion gener- ative models

Bowen Jing, Gabriele Corso, Renato Berlinghieri, and Tommi Jaakkola. Subspace diffusion gener- ative models. In European Conference on Computer Vision , pages 274–289. Springer, 2022

work page 2022
[22]

Gotta go fast when generating data with score-based models

Alexia Jolicoeur-Martineau, Ke Li, R´ emi Pich´ e-Taillefer, Tal Kachman, and Ioannis Mitliagkas. Gotta go fast when generating data with score-based models. arXiv preprint arXiv:2105.14080 , 2021

work page arXiv 2021
[23]

Elucidating the design space of diffusion- based generative models

Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion- based generative models. Advances in neural information processing systems, 35:26565–26577, 2022

work page 2022
[24]

Consistency trajectory models: Learning probability flow ode trajectory of diffusion

Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Ue- saka, Yutong He, Yuki Mitsufuji, and Stefano Ermon. Consistency trajectory models: Learning probability flow ode trajectory of diffusion. In ICLR, 2024

work page 2024
[25]

Variational diffusion models

Diederik Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. Ad- vances in neural information processing systems , 34:21696–21707, 2021

work page 2021
[26]

Accelerating con- vergence of score-based diffusion models, provably

Gen Li, Yu Huang, Timofey Efimov, Yuting Wei, Yuejie Chi, and Yuxin Chen. Accelerating con- vergence of score-based diffusion models, provably. arXiv preprint arXiv:2403.03852 , 2024

work page arXiv 2024
[27]

Flow match- ing for generative modeling

Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le. Flow match- ing for generative modeling. InThe Eleventh International Conference on Learning Representations, 2022

work page 2022
[28]

Flow straight and fast: Learning to generate and transfer data with rectified flow

Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Represen- tations, 2022. LIPSCHITZ-GUIDED DESIGN OF GENERATIVE MODELS 19

work page 2022
[29]

Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 steps

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 steps. Advances in Neural Information Processing Systems, 35:5775–5787, 2022

work page 2022
[30]

Improved denoising dfiffusion probabilistic models

Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising dfiffusion probabilistic models. In International conference on machine learning , pages 8162–8171. PMLR, 2021

work page 2021
[31]

Wavelet diffusion models are fast and scalable image generators

Hao Phung, Quan Dao, and Anh Tran. Wavelet diffusion models are fast and scalable image generators. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10199–10208, 2023

work page 2023
[32]

Align your steps: Optimizing sampling schedules in diffusion models

Amirmojtaba Sabour, Sanja Fidler, and Karsten Kreis. Align your steps: Optimizing sampling schedules in diffusion models. In Forty-first International Conference on Machine Learning , 2024

work page 2024
[33]

Image super-resolution via iterative refinement

Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J Fleet, and Mohammad Norouzi. Image super-resolution via iterative refinement. IEEE transactions on pattern analysis and machine intelligence , 45(4):4713–4726, 2022

work page 2022
[34]

Progressive Distillation for Fast Sampling of Diffusion Models

Tim Salimans and Jonathan Ho. Progressive distillation for fast sampling of diffusion models. arXiv preprint arXiv:2202.00512, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[35]

Noise estimation for generative diffusion mod- els

Robin San-Roman, Eliya Nachmani, and Lior Wolf. Noise estimation for generative diffusion mod- els. arXiv preprint arXiv:2104.02600 , 2021

work page arXiv 2021
[36]

Bespoke solvers for generative flow models

N Shaul, J Perez, RTQ Chen, A Thabet, A Pumarola, and Y Lipman. Bespoke solvers for generative flow models. In 12th International Conference on Learning Representations, ICLR 2024 , 2024

work page 2024
[37]

Diffusion schr¨ odinger bridge matching

Yuyang Shi, Valentin De Bortoli, Andrew Campbell, and Arnaud Doucet. Diffusion schr¨ odinger bridge matching. Advances in Neural Information Processing Systems , 36:62183–62223, 2023

work page 2023
[38]

Deep unsu- pervised learning using nonequilibrium thermodynamics

Jascha Sohl-Dickstein, Eric A Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsu- pervised learning using nonequilibrium thermodynamics. In Proceedings of the 32nd International Conference on International Conference on Machine Learning-Volume 37 , pages 2256–2265, 2015

[39] Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. In International Conference on Machine Learning, pages 32211–32252. PMLR, 2023

[40] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems, 32, 2019

[41] Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. Advances in neural information processing systems, 33:12438–12448, 2020

[42] Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020

[43] Zheng Tan, Weizhen Wang, Andrea L Bertozzi, and Ernest K Ryu. Stork: Improving the fidelity of mid-nfe sampling for diffusion and flow matching models. arXiv preprint arXiv:2505.24210, 2025

[44] Panos Tsimpos, Zhi Ren, Jakob Zech, and Youssef Marzouk. Optimal scheduling of dynamic transport. arXiv preprint arXiv:2504.14425, 2025

[45] Yuqing Wang, Ye He, and Molei Tao. Evaluating the design space of diffusion-based generative models. Advances in Neural Information Processing Systems, 37:19307–19352, 2024

[46] Yuchen Wu, Yuxin Chen, and Yuting Wei. Stochastic Runge-Kutta methods: Provable acceleration of diffusion models. arXiv preprint arXiv:2410.04760, 2024

[47] Shuchen Xue, Zhaoqiang Liu, Fei Chen, Shifeng Zhang, Tianyang Hu, Enze Xie, and Zhenguo Li. Accelerating diffusion sampling with optimized time steps. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8292–8301, 2024

[48] Ling Yang, Zhilong Zhang, Yang Song, Shenda Hong, Runsheng Xu, Yue Zhao, Wentao Zhang, Bin Cui, and Ming-Hsuan Yang. Diffusion models: A comprehensive survey of methods and applications. ACM computing surveys, 56(4):1–39, 2023

[49] Jason J Yu, Konstantinos G Derpanis, and Marcus A Brubaker. Wavelet flow: Fast training of high resolution normalizing flows. Advances in Neural Information Processing Systems, 33:6184–6196, 2020

work page 2020
[50]

Fast sampling of diffusion models with exponential integrator

Qinsheng Zhang and Yongxin Chen. Fast sampling of diffusion models with exponential integrator. arXiv preprint arXiv:2204.13902 , 2022. 20 LIPSCHITZ-GUIDED DESIGN OF GENERATIVE MODELS Appendix A. Sketch of Derivations for Stochastic Interpolants Sketch of derivation for Proposition 2.2. For any smooth test function ϕ : Rd → R, (A.1) d ϕ(It) = ˙It · ∇ϕ(It)...

