arxiv: 2511.08425 · v3 · submitted 2025-11-11 · 💻 cs.LG · cs.SY · eess.SY

HardFlow: Hard-Constrained Sampling for Flow-Matching Models via Trajectory Optimization

Pith reviewed 2026-05-17 23:32 UTC · model grok-4.3

classification 💻 cs.LG · cs.SY · eess.SY
keywords hard-constrained sampling · flow-matching · trajectory optimization · model predictive control · generative modeling · optimal control · constrained generation

The pith

Hard-constrained sampling for flow-matching models is reformulated as a trajectory optimization problem that enforces constraints only at the terminal time via model predictive control approximations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that projection-based methods for hard constraints in flow-matching degrade sample quality by forcing the entire sampling path onto the constraint manifold. It instead treats sampling as a trajectory optimization task solved with numerical optimal control so that constraints hold exactly at the final time. Model predictive control yields a tractable surrogate that can be solved efficiently while allowing integral costs to limit distribution shift and terminal objectives to boost quality. Control analysis supplies bounds on the surrogate's approximation error to the ideal problem. Experiments in robot planning, PDE boundary control, and text-guided image editing show gains in both constraint adherence and sample fidelity over prior approaches.

Core claim

Hard-constrained sampling is reformulated as a trajectory optimization problem in which numerical optimal control steers the flow-matching sampling path to satisfy constraints precisely at the terminal time. By adopting model predictive control, this complex problem is transformed into an efficiently solvable surrogate. The framework also incorporates integral costs and terminal objectives within a unified setting, with bounds established on the approximation error to the ideal formulation.
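
One way to write this reformulation down is sketched here. The notation is assumed for illustration and is not taken from the paper: $x_t$ is the sampling state, $v_\theta$ the pretrained velocity field, $u_t$ an additive control, $\ell$ a running cost, $\phi$ a terminal objective, and $g$ the constraint function. The paper may parameterize the control differently.

```latex
\begin{aligned}
\min_{u(\cdot)} \quad & \int_0^T \ell(x_t, u_t, t)\,dt + \phi(x_T) \\
\text{s.t.} \quad & \dot{x}_t = v_\theta(x_t, t) + u_t, \qquad x_0 \sim p_0, \\
& g(x_T) \le 0 .
\end{aligned}
```

The constraint appears only on $x_T$, which is what separates this formulation from projecting every intermediate state. The integral term is where a penalty such as $\|u_t\|^2$ could discourage distribution shift, and $\phi$ is where a quality objective would enter.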

What carries the argument

The tractable surrogate optimization problem obtained by applying model predictive control approximations to the underlying flow-matching sampling dynamics, steering the trajectory to meet constraints at terminal time.

If this is right

  • Constraints are enforced only at the terminal time rather than restricting the full sampling trajectory.
  • Integral costs can be added inside the surrogate to keep generated samples close to the unconstrained distribution.
  • Terminal objectives can be included to further improve quality metrics within the same optimization.
  • Control-theoretic bounds limit the approximation error between the surrogate and the ideal constrained problem.
  • The approach yields measurable gains in constraint satisfaction and sample quality on robotics planning, PDE boundary control, and vision editing tasks.
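
The first bullet can be illustrated with a toy receding-horizon correction. This is a minimal sketch under loud assumptions, not the HardFlow algorithm: the scalar `velocity` function stands in for a learned velocity field, the constraint is a hypothetical interval, and the terminal state is predicted by straight-line extrapolation. At each step the smallest constant control that moves the predicted terminal state into the interval is applied for one step, then the prediction is recomputed.

```python
import math


def velocity(x, t):
    # Hypothetical stand-in for a learned velocity field (not from the paper):
    # it pulls the state toward 2.0 with a mild time-dependent wobble.
    return (2.0 - x) + 0.5 * math.sin(3.0 * t)


def project_interval(x, lo, hi):
    # Euclidean projection of a scalar onto [lo, hi].
    return min(max(x, lo), hi)


def sample(x0, n_steps=50, lo=-1.0, hi=1.0, steer=True):
    """Euler-integrate dx/dt = velocity + u on [0, 1].

    With steer=True, u is chosen at every step so that the straight-line
    prediction of the terminal state lands inside [lo, hi]. Intermediate
    states are never projected. Returns the terminal state and the
    accumulated control effort (sum of u^2 * dt).
    """
    dt = 1.0 / n_steps
    x = x0
    effort = 0.0
    for k in range(n_steps):
        t = k * dt
        remaining = (n_steps - k) * dt  # time left until the terminal time
        v = velocity(x, t)
        u = 0.0
        if steer:
            x_pred = x + remaining * v  # predicted terminal state
            u = (project_interval(x_pred, lo, hi) - x_pred) / remaining
        effort += u * u * dt
        x = x + dt * (v + u)
    return x, effort


if __name__ == "__main__":
    print("unconstrained terminal state:", sample(0.0, steer=False)[0])
    print("steered terminal state, effort:", sample(0.0, steer=True))
```

In this toy the last step has `remaining == dt`, so the terminal state equals the projected prediction and the interval constraint holds up to floating-point rounding. The accumulated `effort` plays the role of an integral cost on the control.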

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same terminal-time steering idea may transfer to diffusion models by replacing the flow ODE with the corresponding reverse process.
  • Receding-horizon implementations of the surrogate could support online constraint changes in robotic or simulation loops.
  • The framework's flexibility with costs and objectives suggests natural extensions to multi-objective or uncertainty-aware constraints.

Load-bearing premise

The model predictive control surrogate sufficiently approximates the ideal hard-constrained trajectory optimization without introducing large distribution shift or frequent constraint violations.

What would settle it

Generate samples under a simple geometric constraint such as staying inside a known ball and measure the fraction that still violate the boundary after the terminal step; high violation rates or large quality drops relative to unconstrained flow-matching would falsify the central claim.
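
The check described above could be scripted roughly as follows. This is a sketch under assumptions: `violation_rate` is a hypothetical helper, samples are plain tuples of floats, and the Gaussian draws in the usage lines are placeholders for whatever a constrained sampler actually produces.

```python
import math
import random


def violation_rate(samples, radius, tol=0.0):
    """Fraction of samples whose Euclidean norm exceeds radius + tol."""
    if not samples:
        raise ValueError("need at least one sample")
    violations = 0
    for x in samples:
        norm = math.sqrt(sum(c * c for c in x))
        if norm > radius + tol:
            violations += 1
    return violations / len(samples)


if __name__ == "__main__":
    # Placeholder data: 2-D Gaussian draws standing in for generated samples.
    rng = random.Random(0)
    fake_samples = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(1000)]
    print("violation rate:", violation_rate(fake_samples, radius=1.0))
```

A small `tol` can absorb floating-point rounding at the boundary. Reporting the rate next to a quality metric for the same samples covers both halves of the falsification test.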

Figures

Figures reproduced from arXiv: 2511.08425 by Kaveh Alim, Navid Azizan, Zeyang Li.

Figure 1: Roadmap of problem transformations. Double arrows indicate equivalence; single arrows indicate approximation.
Figure 3: Visualization of HardFlow in the robotic manipulation …
Figure 2: The robotic manipulation task. We evaluate all methods over 50 trials. The simulation starts at the same initial position and terminates when the end-effector reaches the target area or collides with any obstacle. A visualization of our algorithm is shown in …
Figure 4: Visualization of HardFlow in the maze navigation …
Figure 5: Heatmaps of controlled state u and predicted control f produced by HardFlow at one trial in the PDE control task. The grid uses m = 10 time steps and n = 128 spatial points; the axis ticks reflect this discretization.
Figure 6: Slices of controlled state u at three time instants, produced by HardFlow. The state constraints are satisfied.
D. Text-Guided Image Editing. In this task, leveraging a flow-matching model pretrained on the CelebA-HQ celebrity face dataset [56], we edit an input image according to a text prompt so that the edited image aligns with the prompt requirements. We follow the setup of [25]. The pretrained flow-mat…
Figure 7: Box-and-whisker plots of CLIP (left; higher is better) and LPIPS (right; below the dashed line is required) for five …
Figure 8: Qualitative comparison of different methods on the text-guided image editing task. The three reference images are …

Abstract

Diffusion and flow-matching have emerged as powerful methodologies for generative modeling, with remarkable success in capturing complex data distributions and enabling flexible guidance at inference time. Many downstream applications, however, demand enforcing hard constraints on generated samples (for example, robot trajectories must avoid obstacles), a requirement that goes beyond simple guidance. Prevailing projection-based approaches constrain the entire sampling path to the constraint manifold, which is overly restrictive and degrades sample quality. In this paper, we introduce a novel framework that reformulates hard-constrained sampling as a trajectory optimization problem. Our key insight is to leverage numerical optimal control to steer the sampling trajectory so that constraints are satisfied precisely at the terminal time. By exploiting the underlying structure of flow-matching models and adopting techniques from model predictive control, we transform this otherwise complex constrained optimization problem into a tractable surrogate that can be solved efficiently and effectively. Furthermore, this trajectory optimization perspective offers significant flexibility beyond mere constraint satisfaction, allowing for the inclusion of integral costs to minimize distribution shift and terminal objectives to further enhance sample quality, all within a unified framework. We provide a control-theoretic analysis of our method, establishing bounds on the approximation error between our tractable surrogate and the ideal formulation. Extensive experiments across diverse domains, including robotics (planning), partial differential equations (boundary control), and vision (text-guided image editing), demonstrate that our algorithm, which we name $\textit{HardFlow}$, substantially outperforms existing methods in both constraint satisfaction and sample quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces HardFlow, a method that reformulates hard-constrained sampling for flow-matching models as a finite-horizon trajectory optimization problem. The central idea is to apply numerical optimal control and model predictive control (MPC) approximations to steer the sampling trajectory such that hard constraints are met exactly at the terminal time, while using integral costs to limit distribution shift and allowing additional terminal objectives. A control-theoretic analysis supplies bounds on the approximation error between the tractable MPC surrogate and the ideal constrained formulation. Experiments across robotics planning, PDE boundary control, and text-guided image editing demonstrate improved constraint satisfaction and sample quality relative to projection-based baselines.

Significance. If the approximation error bounds hold for the nonlinear dynamics of learned flow-matching models and the MPC surrogate reliably enforces terminal constraints without excessive distribution shift, the work provides a flexible alternative to path-restrictive projection methods. The unified optimal-control framing that incorporates both hard terminal constraints and soft integral costs is a useful conceptual advance for constrained generation tasks. The explicit control-theoretic analysis with error bounds is a strength that goes beyond purely heuristic approaches.

major comments (1)
  1. Control-theoretic analysis (error bound derivation): the a-priori bound on the approximation error between the MPC surrogate and the ideal terminal-constrained problem implicitly assumes the flow vector field is locally Lipschitz with a uniform constant over the full sampling interval. For learned velocity fields this constant can become large and non-uniform near data manifolds or during early denoising steps, which risks the propagated terminal error exceeding the stated bound and producing residual constraint violations or unintended distribution shift. This assumption is load-bearing for the claim that the surrogate tractably approximates hard-constrained sampling.
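
Why the Lipschitz constant is load-bearing can be seen from a standard Grönwall-type estimate. The following is a textbook argument under assumed notation, not the bound derived in the paper: $\hat{v}$ denotes whatever approximate dynamics a surrogate uses for prediction, $\varepsilon$ bounds its pointwise mismatch with $v_\theta$, and both trajectories share the same control $u_t$ and initial state.

```latex
\begin{aligned}
\dot{x}_t &= v_\theta(x_t, t) + u_t, \qquad
\dot{\hat{x}}_t = \hat{v}(\hat{x}_t, t) + u_t, \qquad x_0 = \hat{x}_0, \\
\|v_\theta(x, t) - v_\theta(y, t)\| &\le L\,\|x - y\|, \qquad
\|v_\theta(x, t) - \hat{v}(x, t)\| \le \varepsilon, \\
\frac{d}{dt}\|x_t - \hat{x}_t\| &\le L\,\|x_t - \hat{x}_t\| + \varepsilon
\quad\Longrightarrow\quad
\|x_T - \hat{x}_T\| \le \frac{\varepsilon}{L}\left(e^{LT} - 1\right).
\end{aligned}
```

The terminal gap grows exponentially in $LT$, so a bound of this type stays informative only when $L$ is moderate over the whole interval, which is the referee's concern for learned velocity fields.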
minor comments (2)
  1. Notation section: the distinction between the ideal terminal-constrained problem over the full sampling interval and the finite-horizon MPC surrogate could be made more explicit in the problem formulation to avoid reader confusion.
  2. Experimental results: reporting the computational overhead (wall-clock time or number of ODE solves) of the MPC solver relative to baselines would strengthen the practicality claims.
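
The overhead report requested in the second minor comment could be collected with a thin wrapper like the one below. It is a sketch with hypothetical names (`CountingField`, `euler_sample`, `profile`), and it counts velocity-field evaluations as a proxy for solver cost; the paper's solver may be better measured in other units.

```python
import time


class CountingField:
    """Wraps a velocity field and counts how often it is evaluated."""

    def __init__(self, field):
        self.field = field
        self.calls = 0

    def __call__(self, x, t):
        self.calls += 1
        return self.field(x, t)


def euler_sample(field, x0, n_steps):
    """Plain Euler integration of dx/dt = field(x, t) on [0, 1]."""
    dt = 1.0 / n_steps
    x = x0
    for k in range(n_steps):
        x = x + dt * field(x, k * dt)
    return x


def profile(sampler, field, *args):
    """Run sampler(field, *args); return (result, evaluations, seconds)."""
    counted = CountingField(field)
    start = time.perf_counter()
    result = sampler(counted, *args)
    elapsed = time.perf_counter() - start
    return result, counted.calls, elapsed


if __name__ == "__main__":
    result, calls, seconds = profile(euler_sample, lambda x, t: -x, 1.0, 20)
    print(f"result={result:.4f} evaluations={calls} seconds={seconds:.6f}")
```

Running every baseline through the same `profile` call keeps the comparison on equal footing, since each method is charged for every velocity evaluation it triggers.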

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and for highlighting an important subtlety in our control-theoretic analysis. We address the major comment below and have revised the manuscript to clarify the relevant assumptions and their practical implications for learned models.

Point-by-point responses
  1. Referee: Control-theoretic analysis (error bound derivation): the a-priori bound on the approximation error between the MPC surrogate and the ideal terminal-constrained problem implicitly assumes the flow vector field is locally Lipschitz with a uniform constant over the full sampling interval. For learned velocity fields this constant can become large and non-uniform near data manifolds or during early denoising steps, which risks the propagated terminal error exceeding the stated bound and producing residual constraint violations or unintended distribution shift. This assumption is load-bearing for the claim that the surrogate tractably approximates hard-constrained sampling.

    Authors: We thank the referee for this observation. Our error bound derivation in Section 4 does rely on the flow vector field being locally Lipschitz with a uniform constant over [0, T], which is a standard technical assumption that enables a global a-priori guarantee. We acknowledge that learned velocity fields can exhibit large and non-uniform Lipschitz constants, especially near data manifolds or during early denoising steps, and that this could in principle cause the propagated terminal error to exceed the stated bound. In the revised manuscript we have added a new subsection (Section 4.3) that explicitly discusses this limitation, notes its potential consequences for residual constraint violations, and provides a local (trajectory-dependent) error estimate that replaces the uniform constant with the maximum Lipschitz value observed along sampled paths. We also report additional diagnostic experiments that compute effective Lipschitz constants on the trajectories generated by our method; these show that the realized approximation errors remain small enough to preserve high constraint satisfaction in the evaluated domains. While the uniform-bound claim is therefore qualified, the core practical claim, that the MPC surrogate tractably approximates hard-constrained sampling, continues to be supported by both the refined analysis and the empirical results.

    revision: yes
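
A trajectory-dependent Lipschitz diagnostic of the kind mentioned in the rebuttal might look like the sketch below. It assumes a scalar state and a hypothetical `velocity(x, t)` callable, and it uses a central finite difference as the local estimate; a vector-valued model would need a Jacobian norm instead, and none of this is taken from the paper's experiments.

```python
def local_lipschitz(velocity, x, t, eps=1e-4):
    """Central-difference estimate of |d velocity / d x| at (x, t), scalar state."""
    return abs(velocity(x + eps, t) - velocity(x - eps, t)) / (2.0 * eps)


def max_lipschitz_along_path(velocity, x0, n_steps=100):
    """Largest local estimate seen along one Euler trajectory on [0, 1]."""
    dt = 1.0 / n_steps
    x = x0
    worst = 0.0
    for k in range(n_steps):
        t = k * dt
        worst = max(worst, local_lipschitz(velocity, x, t))
        x = x + dt * velocity(x, t)
    return worst


if __name__ == "__main__":
    # Toy linear field whose exact Lipschitz constant in x is 3.
    print(max_lipschitz_along_path(lambda x, t: -3.0 * x + t, x0=1.0))
```

Taking the maximum over many sampled paths gives the observed constant that a local error estimate would use in place of the uniform one.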

Circularity Check

0 steps flagged

No significant circularity; derivation builds on external optimal control and flow-matching structures

full rationale

The paper reformulates hard-constrained sampling as a terminal-time trajectory optimization problem, then applies MPC-style surrogates and derives approximation-error bounds via standard control-theoretic arguments (local Lipschitz assumptions on the flow vector field). No quoted step reduces a claimed prediction or bound to a fitted parameter, self-definition, or self-citation chain that makes the central result equivalent to its inputs by construction. The analysis treats the flow ODE as given from prior flow-matching literature and invokes generic MPC error bounds without importing uniqueness theorems or ansatzes from the authors' own prior work as load-bearing premises. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only view yields no explicit free parameters, axioms, or invented entities; the method relies on standard numerical optimal control and flow-matching assumptions not detailed here.

pith-pipeline@v0.9.0 · 5572 in / 956 out tokens · 23505 ms · 2026-05-17T23:32:58.947755+00:00 · methodology

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 3 internal anchors

  1. [1]

    Deep unsupervised learning using nonequilibrium thermodynamics,

    J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, “Deep unsupervised learning using nonequilibrium thermodynamics,” in International conference on machine learning. PMLR, 2015, pp. 2256–2265

  2. [2]

    Denoising diffusion probabilistic models,

    J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in neural information processing systems, vol. 33, pp. 6840–6851, 2020

  3. [3]

    Flow matching for generative modeling,

    Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le, “Flow matching for generative modeling,” in The Eleventh International Conference on Learning Representations, 2023

  4. [4]

    High-resolution image synthesis with latent diffusion models,

    R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 10 684–10 695

  5. [5]

    Scaling rectified flow transformers for high-resolution image synthesis,

    P. Esser, S. Kulal, A. Blattmann, R. Entezari, J. Müller, H. Saini, Y. Levi, D. Lorenz, A. Sauer, F. Boesel et al., “Scaling rectified flow transformers for high-resolution image synthesis,” in Forty-first international conference on machine learning, 2024

  6. [6]

    Video diffusion models,

    J. Ho, T. Salimans, A. Gritsenko, W. Chan, M. Norouzi, and D. J. Fleet, “Video diffusion models,” Advances in neural information processing systems, vol. 35, pp. 8633–8646, 2022

  7. [7]

    Pyramidal flow matching for efficient video generative modeling,

    Y. Jin, Z. Sun, N. Li, K. Xu, K. Xu, H. Jiang, N. Zhuang, Q. Huang, Y. Song, Y. MU, and Z. Lin, “Pyramidal flow matching for efficient video generative modeling,” in The Thirteenth International Conference on Learning Representations, 2025

  8. [8]

    De novo design of protein structure and function with rfdiffusion,

    J. L. Watson, D. Juergens, N. R. Bennett, B. L. Trippe, J. Yim, H. E. Eisenach, W. Ahern, A. J. Borst, R. J. Ragotte, L. F. Milles et al., “De novo design of protein structure and function with rfdiffusion,” Nature, vol. 620, no. 7976, pp. 1089–1100, 2023

  9. [9]

    Diffusion policy: Visuomotor policy learning via action diffusion,

    C. Chi, Z. Xu, S. Feng, E. Cousineau, Y. Du, B. Burchfiel, R. Tedrake, and S. Song, “Diffusion policy: Visuomotor policy learning via action diffusion,” The International Journal of Robotics Research, p. 02783649241273668, 2023

  10. [10]

    Fast and robust visuomotor riemannian flow matching policy,

    H. Ding, N. Jaquier, J. Peters, and L. Rozo, “Fast and robust visuomotor riemannian flow matching policy,” IEEE Transactions on robotics, 2025

  11. [11]

    Diffusion models beat gans on image synthesis,

    P. Dhariwal and A. Nichol, “Diffusion models beat gans on image synthesis,” Advances in neural information processing systems, vol. 34, pp. 8780–8794, 2021

  12. [12]

    Classifier-Free Diffusion Guidance

    J. Ho and T. Salimans, “Classifier-free diffusion guidance,” arXiv preprint arXiv:2207.12598, 2022

  13. [13]

    Aligning text-to-image diffusion models with constrained reinforcement learning,

    Z. Zhang, S. Zhang, L. Shen, Y. Zhan, Y. Luo, H. Hu, B. Du, Y. Wen, and D. Tao, “Aligning text-to-image diffusion models with constrained reinforcement learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

  14. [14]

    Sampling constrained trajectories using composable diffusion models,

    T. Power, R. Soltani-Zarrin, S. Iba, and D. Berenson, “Sampling constrained trajectories using composable diffusion models,” in IROS 2023 Workshop on Differentiable Probabilistic Robotics: Emerging Perspectives on Robot Learning, 2023

  15. [15]

    Constrained synthesis with projected diffusion models,

    J. K. Christopher, S. Baek, and N. Fioretto, “Constrained synthesis with projected diffusion models,” Advances in Neural Information Processing Systems, vol. 37, pp. 89 307–89 333, 2024

  16. [16]

    Diffusion predictive control with constraints,

    R. Römer, A. v. Rohr, and A. Schoellig, “Diffusion predictive control with constraints,” in Proceedings of the 7th Annual Learning for Dynamics & Control Conference, ser. Proceedings of Machine Learning Research, N. Ozay, L. Balzano, D. Panagou, and A. Abate, Eds., vol

  17. [17]

    PMLR, 04–06 Jun 2025, pp. 791–803

  18. [18]

    Physdiff: Physics-guided human motion diffusion model,

    Y. Yuan, J. Song, U. Iqbal, A. Vahdat, and J. Kautz, “Physdiff: Physics-guided human motion diffusion model,” in Proceedings of the IEEE/CVF international conference on computer vision, 2023, pp. 16 010–16 021

  19. [19]

    Safediffuser: Safe planning with diffusion probabilistic models,

    W. Xiao, T.-H. Wang, C. Gan, R. Hasani, M. Lechner, and D. Rus, “Safediffuser: Safe planning with diffusion probabilistic models,” in The Thirteenth International Conference on Learning Representations, 2025

  20. [20]

    Constrained diffusers for safe planning and control,

    J. Zhang, L. Zhao, A. Papachristodoulou, and J. Umenberger, “Constrained diffusers for safe planning and control,” in The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

  21. [21]

    Projected Coupled Diffusion for Test-Time Constrained Joint Generation

    H. Luan, Y. X. Goh, S.-K. Ng, and C. K. Ling, “Projected coupled diffusion for test-time constrained joint generation,” arXiv preprint arXiv:2508.10531, 2025

  22. [22]

    Training-free constrained generation with stable diffusion models,

    S. Zampini, J. K. Christopher, L. Oneto, D. Anguita, and F. Fioretto, “Training-free constrained generation with stable diffusion models,” arXiv preprint arXiv:2502.05625, 2025

  23. [23]

    DDAT: Diffusion Policies Enforcing Dynamically Admissible Robot Trajectories,

    J.-B. Bouvier, K. Ryu, Q. Liao, K. Sreenath, and N. Mehr, “DDAT: Diffusion Policies Enforcing Dynamically Admissible Robot Trajectories,” in Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2025

  24. [24]

    Denoising diffusion implicit models,

    J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” in The Ninth International Conference on Learning Representations, 2021

  25. [25]

    Rb-modulation: Training-free stylization using reference-based modulation,

    L. Rout, Y. Chen, N. Ruiz, A. Kumar, C. Caramanis, S. Shakkottai, and W.-S. Chu, “Rb-modulation: Training-free stylization using reference-based modulation,” in The Thirteenth International Conference on Learning Representations, 2025

  26. [26]

    Training free guided flow-matching with optimal control,

    L. Wang, C. Cheng, Y. Liao, Y. Qu, and G. Liu, “Training free guided flow-matching with optimal control,” in The Thirteenth International Conference on Learning Representations, 2025

  27. [27]

    D. E. Kirk, Optimal Control Theory: An Introduction. Courier Corporation, 2004

  28. [28]

    Direct and indirect methods for trajectory optimization,

    O. Von Stryk and R. Bulirsch, “Direct and indirect methods for trajectory optimization,” Annals of operations research, vol. 37, no. 1, pp. 357–373, 1992

  29. [29]

    A survey of methods available for the numerical optimization of continuous dynamic systems,

    B. A. Conway, “A survey of methods available for the numerical optimization of continuous dynamic systems,” Journal of Optimization Theory and Applications, vol. 152, no. 2, pp. 271–306, 2012

  30. [30]

    Model predictive control: past, present and future,

    M. Morari and J. H. Lee, “Model predictive control: past, present and future,” Computers & chemical engineering, vol. 23, no. 4-5, pp. 667–682, 1999

  31. [31]

    Cobl-diffusion: Diffusion-based conditional robot planning in dynamic environments using control barrier and lyapunov functions,

    K. Mizuta and K. Leung, “Cobl-diffusion: Diffusion-based conditional robot planning in dynamic environments using control barrier and lyapunov functions,” in 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2024, pp. 13 801–13 808

  32. [32]

    Motion planning diffusion: Learning and adapting robot motion planning with diffusion models,

    J. Carvalho, A. T. Le, P. Kicki, D. Koert, and J. Peters, “Motion planning diffusion: Learning and adapting robot motion planning with diffusion models,” IEEE Transactions on Robotics, 2025

  33. [33]

    D. J. Bell and D. H. Jacobson, Singular Optimal Control Problems. Elsevier, 1975, vol. 117

  34. [34]

    W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control. Springer Science & Business Media, 2012, vol. 1

  35. [35]

    Theoretical guarantees for sampling and inference in generative models with latent diffusions,

    B. Tzen and M. Raginsky, “Theoretical guarantees for sampling and inference in generative models with latent diffusions,” in Conference on Learning Theory. PMLR, 2019, pp. 3084–3114

  36. [36]

    Path integral sampler: A stochastic control approach for sampling,

    Q. Zhang and Y. Chen, “Path integral sampler: A stochastic control approach for sampling,” in The Tenth International Conference on Learning Representations, 2022

  37. [37]

    Denoising diffusion samplers,

    F. Vargas, W. S. Grathwohl, and A. Doucet, “Denoising diffusion samplers,” in The Eleventh International Conference on Learning Representations, 2023

  38. [38]

    An optimal control perspective on diffusion-based generative modeling,

    J. Berner, L. Richter, and K. Ullrich, “An optimal control perspective on diffusion-based generative modeling,” Transactions on Machine Learning Research, 2024

  39. [39]

    Adjoint sampling: Highly scalable diffusion samplers via adjoint matching,

    A. J. Havens, B. K. Miller, B. Yan, C. Domingo-Enrich, A. Sriram, D. S. Levine, B. M. Wood, B. Hu, B. Amos, B. Karrer, X. Fu, G.-H. Liu, and R. T. Q. Chen, “Adjoint sampling: Highly scalable diffusion samplers via adjoint matching,” in Forty-second International Conference on Machine Learning, 2025

  40. [40]

    Adjoint matching: Fine-tuning flow and diffusion generative models with memoryless stochastic optimal control,

    C. Domingo-Enrich, M. Drozdzal, B. Karrer, and R. T. Chen, “Adjoint matching: Fine-tuning flow and diffusion generative models with memoryless stochastic optimal control,” in The Thirteenth International Conference on Learning Representations, 2025

  41. [41]

    Score as action: Fine tuning diffusion generative models by continuous-time reinforcement learning,

    H. Zhao, H. Chen, J. Zhang, D. Yao, and W. Tang, “Score as action: Fine tuning diffusion generative models by continuous-time reinforcement learning,” in Forty-second International Conference on Machine Learning, 2025

  42. [42]

    Diffusion posterior sampling for general noisy inverse problems,

    H. Chung, J. Kim, M. T. Mccann, M. L. Klasky, and J. C. Ye, “Diffusion posterior sampling for general noisy inverse problems,” in The Eleventh International Conference on Learning Representations, 2023

  43. [43]

    Universal guidance for diffusion models,

    A. Bansal, H.-M. Chu, A. Schwarzschild, S. Sengupta, M. Goldblum, J. Geiping, and T. Goldstein, “Universal guidance for diffusion models,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 843–852

  44. [44]

    Gradient guidance for diffusion models: An optimization perspective,

    Y. Guo, H. Yuan, Y. Yang, M. Chen, and M. Wang, “Gradient guidance for diffusion models: An optimization perspective,” Advances in Neural Information Processing Systems, vol. 37, pp. 90 736–90 770, 2024

  45. [45]

    Variational control for guidance in diffusion models,

    K. Pandey, F. M. Sofian, F. Draxler, T. Karaletsos, and S. Mandt, “Variational control for guidance in diffusion models,” in Forty-second International Conference on Machine Learning, 2025

  46. [46]

    On the guidance of flow matching,

    R. Feng, C. Yu, W. Deng, P. Hu, and T. Wu, “On the guidance of flow matching,” in Forty-second International Conference on Machine Learning, 2025

  47. [47]

    Symbolic music generation with non-differentiable rule guided diffusion,

    Y. Huang, A. Ghatare, Y. Liu, Z. Hu, Q. Zhang, C. S. Sastry, S. Gururani, S. Oore, and Y. Yue, “Symbolic music generation with non-differentiable rule guided diffusion,” in International Conference on Machine Learning. PMLR, 2024, pp. 19 772–19 797

  48. [48]

    Diffusion tree sampling: Scalable inference-time alignment of diffusion models,

    V. Jain, K. Sareen, M. Pedramfar, and S. Ravanbakhsh, “Diffusion tree sampling: Scalable inference-time alignment of diffusion models,” arXiv preprint arXiv:2506.20701, 2025

  49. [49]

    Training-free guidance beyond differentiability: Scalable path steering with tree search in diffusion and flow models,

    Y. Guo, Y. Yang, H. Yuan, and M. Wang, “Training-free guidance beyond differentiability: Scalable path steering with tree search in diffusion and flow models,” arXiv preprint arXiv:2502.11420, 2025

  50. [50]

    Zero-shot image restoration using denoising diffusion null-space model,

    Y. Wang, J. Yu, and J. Zhang, “Zero-shot image restoration using denoising diffusion null-space model,” in The Eleventh International Conference on Learning Representations, 2023

  51. [51]

    Dynamic Programming and Optimal Control: Volume I

    D. Bertsekas, Dynamic Programming and Optimal Control: Volume I. Athena Scientific, 2012, vol. 4

  52. [52]

    Towards diverse behaviors: A benchmark for imitation learning with human demonstrations,

    X. Jia, D. Blessing, X. Jiang, M. Reuss, A. Donat, R. Lioutikov, and G. Neumann, “Towards diverse behaviors: A benchmark for imitation learning with human demonstrations,” in The Twelfth International Conference on Learning Representations, 2024

  53. [53]

    On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,

    A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical programming, vol. 106, no. 1, pp. 25–57, 2006

  54. [54]

    Planning with diffusion for flexible behavior synthesis,

    M. Janner, Y. Du, J. Tenenbaum, and S. Levine, “Planning with diffusion for flexible behavior synthesis,” in Proceedings of the 39th International Conference on Machine Learning, vol. 162. PMLR, 2022, pp. 9902–9915

  55. [55]

    D4RL: Datasets for Deep Data-Driven Reinforcement Learning

    J. Fu, A. Kumar, O. Nachum, G. Tucker, and S. Levine, “D4rl: Datasets for deep data-driven reinforcement learning,” arXiv preprint arXiv:2004.07219, 2020

  56. [56]

    From uncertain to safe: Conformal adaptation of diffusion models for safe PDE control,

    P. Hu, X. Qian, W. Deng, R. Wang, H. Feng, R. Feng, T. Zhang, L. Wei, Y. Wang, Z.-M. Ma, and T. Wu, “From uncertain to safe: Conformal adaptation of diffusion models for safe PDE control,” in Forty-second International Conference on Machine Learning, 2025

  57. [57]

    Progressive growing of GANs for improved quality, stability, and variation,

    T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive growing of GANs for improved quality, stability, and variation,” in The Sixth International Conference on Learning Representations, 2018

  58. [58]

    Flow straight and fast: Learning to generate and transfer data with rectified flow,

    X. Liu, C. Gong, and Q. Liu, “Flow straight and fast: Learning to generate and transfer data with rectified flow,” in The Eleventh International Conference on Learning Representations, 2023

  59. [59]

    Learning transferable visual models from natural language supervision,

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in International conference on machine learning. PMLR, 2021, pp. 8748–8763

  60. [60]

    The unreasonable effectiveness of deep features as a perceptual metric,

    R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 586–595