Constraint-Aware Flow Matching via Randomized Exploration
Pith reviewed 2026-05-18 22:14 UTC · model grok-4.3
The pith
Randomized exploration during flow matching training produces a mean flow that satisfies constraints known only through a membership oracle.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that randomized exploration lets a flow matching model learn a mean flow with high likelihood of constraint satisfaction when the constraint set is accessible solely through a membership oracle, and that a two-stage training procedure which approximates the original flow while probing constraints only in the second stage is more computationally efficient than a single-stage alternative.
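The efficiency part of this claim can be made concrete with a rough cost count. The sketch below is illustrative only and is not taken from the paper: it assumes cost is measured in velocity-network evaluations, that a plain flow matching step needs one evaluation per sample, and that a randomized step rolls out a full Euler trajectory. The function names and all numbers are hypothetical.

```python
def one_stage_cost(iters, batch, euler_steps):
    """Network evaluations when every iteration rolls out full trajectories."""
    return iters * batch * euler_steps


def two_stage_cost(stage1_iters, stage2_iters, batch, euler_steps):
    """Stage 1 uses a single-evaluation FM loss; only stage 2 rolls out trajectories."""
    return stage1_iters * batch + stage2_iters * batch * euler_steps


def oracle_queries(randomized_iters, batch):
    """One membership query per sampled terminal point (assumed)."""
    return randomized_iters * batch


# Hypothetical budget: 1000 iterations in total, batch of 64, 20 Euler steps.
single = one_stage_cost(1000, 64, 20)
staged = two_stage_cost(800, 200, 64, 20)
print(single, staged, oracle_queries(1000, 64), oracle_queries(200, 64))
```

Under these assumptions the saving comes entirely from how many iterations can be moved into the cheap first stage. Whether the second stage can really be that short is an empirical question the paper would have to answer.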
What carries the argument
Randomized exploration applied while learning the flow matching vector field, which averages trajectories over constraint probes to produce a mean flow whose samples respect the oracle-defined set.
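To illustrate how randomization can steer a flow using only yes/no oracle answers, here is a minimal one-dimensional sketch. It is not the paper's algorithm: it assumes a constant velocity parameter `theta`, Gaussian perturbations added to the velocity at each Euler step, a hypothetical oracle `x >= 0.5`, and a score-function (REINFORCE-style) gradient estimate. All of these are stand-ins chosen for brevity.

```python
import random


def oracle(x):
    """Hypothetical membership oracle: returns only True/False, no gradient."""
    return x >= 0.5


def rollout(theta, sigma, steps, rng):
    """Euler rollout of the randomized velocity theta + sigma * W from x0 = 0.

    Returns the terminal point and the sum of the noise draws.
    """
    x, dt, noise_sum = 0.0, 1.0 / steps, 0.0
    for _ in range(steps):
        w = rng.gauss(0.0, 1.0)
        x += (theta + sigma * w) * dt
        noise_sum += w
    return x, noise_sum


def train(theta=0.0, sigma=1.0, steps=10, batch=128, iters=150, lr=0.1, seed=0):
    rng = random.Random(seed)
    for _ in range(iters):
        samples = [rollout(theta, sigma, steps, rng) for _ in range(batch)]
        rewards = [1.0 if oracle(x) else 0.0 for x, _ in samples]
        baseline = sum(rewards) / batch  # variance-reduction baseline
        # Score-function estimate of d/dtheta P(terminal point is feasible):
        # each step contributes d/dtheta log N(a; theta, sigma^2) = w / sigma.
        grad = sum(
            (r - baseline) * s / sigma for r, (_, s) in zip(rewards, samples)
        ) / batch
        theta += lr * grad
    return theta


theta = train()
# The mean flow is the same rollout with the perturbation switched off.
mean_flow_endpoint, _ = rollout(theta, sigma=0.0, steps=10, rng=random.Random(1))
print(theta, oracle(mean_flow_endpoint))
```

In this toy setting the perturbed rollouts that happen to land in the feasible set pull `theta` toward it, and the unperturbed flow then ends inside the set. The sketch says nothing about thin or disconnected sets, which is the case the referee report questions.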
If this is right
- When a differentiable distance function to the constraint set is given, adding a penalty term to the flow matching objective reduces the rate of constraint violations.
- The randomized mean flow approach works for non-convex constraint sets without requiring a barrier function or convexity assumptions.
- Separating training into two stages, with randomization used only in the second stage, lowers computational cost while still approximating the constrained flow.
- The method supports training generators that produce adversarial examples using only query access to a hard-label black-box classifier.
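The first bullet above (a distance penalty added to the flow matching objective) can be sketched as follows. This is a hedged illustration rather than the paper's objective: it assumes a linear interpolation path, a squared-error velocity target, an Euler rollout to obtain the terminal point, and a hypothetical constraint set `[-1, 1]` with weight `lam`.

```python
import random


def distance_to_interval(x, lo=-1.0, hi=1.0):
    """Distance from x to the hypothetical constraint set [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0


def rollout(velocity, x0, steps=20):
    """Euler-integrate dx/dt = velocity(x, t) from t = 0 to t = 1."""
    x, dt = x0, 1.0 / steps
    for k in range(steps):
        x += velocity(x, k * dt) * dt
    return x


def penalized_fm_loss(velocity, pairs, lam, rng):
    """Average of (FM regression error + lam * distance of the generated sample)."""
    total = 0.0
    for x0, x1 in pairs:
        t = rng.random()
        xt = (1.0 - t) * x0 + t * x1                 # point on the linear path
        fm = (velocity(xt, t) - (x1 - x0)) ** 2      # conditional FM target
        penalty = distance_to_interval(rollout(velocity, x0))
        total += fm + lam * penalty
    return total / len(pairs)


pairs = [(0.0, 2.0)]  # a target point outside the constraint set
print(penalized_fm_loss(lambda x, t: 2.0, pairs, lam=5.0, rng=random.Random(0)))
```

A velocity that matches the target exactly still pays the penalty here because the target lies outside the set. This is the tension between constraint satisfaction and distributional match that a larger `lam` sharpens.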
Where Pith is reading between the lines
- The same randomization idea could be tried inside other continuous-time generative models that learn vector fields.
- Staged training with selective randomization might reduce cost in other generative tasks that involve expensive-to-check constraints.
- One could test whether post-training refinement of the mean flow further increases the rate of valid samples without changing the base distribution match.
Load-bearing premise
Randomization during training will cause the learned mean flow to generate samples that satisfy the constraints with high probability when only a membership oracle is available.
What would settle it
Apply the randomized training to an oracle-only constraint whose valid region consists of thin disconnected components distant from typical data paths and check whether the fraction of valid generated samples drops close to the level expected from an unconstrained model.
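A minimal harness for this kind of check might look like the sketch below. Everything in it is hypothetical: the thin-band oracle, its centers and width, and the standard normal sampler that stands in for an unconstrained model. The same `satisfaction_rate` function would be called on samples from the randomized-training model for comparison.

```python
import random


def thin_band_oracle(x, centers=(-3.0, 3.0), half_width=0.01):
    """Hypothetical oracle: feasible only inside thin, disconnected bands."""
    return any(abs(x - c) < half_width for c in centers)


def satisfaction_rate(sampler, oracle, n):
    """Monte Carlo estimate of the fraction of generated samples that are valid."""
    return sum(1 for _ in range(n) if oracle(sampler())) / n


rng = random.Random(0)
# Stand-in for an unconstrained generator: draws from N(0, 1).
baseline_rate = satisfaction_rate(lambda: rng.gauss(0.0, 1.0), thin_band_oracle, 20000)
print(baseline_rate)
```

If the constrained model's rate on such an oracle is statistically indistinguishable from `baseline_rate`, that would count against the load-bearing premise.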
Original abstract
We consider the problem of designing constraint-aware flow matching (FM) models that address the issue of constraint violations commonly observed in vanilla generative models. We consider two scenarios, viz.: (a) when a differentiable distance function to the constraint set is given, and (b) when the constraint set is only available via queries to a membership oracle. For case (a), we propose a simple adaptation of the FM objective with an additional term that penalizes the distance between the constraint set and the generated samples. For case (b), we propose to employ randomization and learn a mean flow that is numerically shown to have a high likelihood of satisfying the constraints. This approach deviates significantly from existing works that require simple convex constraints, knowledge of a barrier function, or a reflection mechanism to constrain the probability flow. Furthermore, in the proposed setting we show that a two-stage approach, where both stages approximate the same original flow but with only the second stage probing the constraints via randomization, is more computationally efficient than the corresponding one-stage approach. Through several synthetic cases of constrained generation, we numerically show that the proposed approaches achieve significant gains in terms of constraint satisfaction while matching the target distributions. As a showcase for a practical oracle-based constraint, we show how our approach can be used for training an adversarial example generator, using queries to a hard-label black-box classifier. We conclude with several future research directions. Our code is available at https://github.com/ZhengyanHuan/FM-RE.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces constraint-aware adaptations of flow matching (FM) for generative models to reduce constraint violations. For the case with a differentiable distance function to the constraint set, an additional penalty term is incorporated into the FM objective. For the membership-oracle case, randomization is introduced during training to learn a mean flow that is numerically shown to satisfy constraints with high probability; this is contrasted with prior methods requiring convex sets, barrier functions, or reflections. A two-stage training procedure is proposed as more efficient than one-stage for the oracle setting. The approaches are evaluated on multiple synthetic constrained generation tasks and demonstrated on a practical task of training an adversarial example generator via queries to a hard-label black-box classifier.
Significance. If the numerical results hold under broader conditions, the work offers a practical route to incorporating non-convex, oracle-defined constraints into flow-based generative models without the restrictions of existing techniques. The two-stage efficiency claim and the adversarial-generation showcase add applied value. Releasing code supports reproducibility and is a clear strength.
major comments (2)
- [§3] §3 (Oracle-based randomization method): The central claim that randomization during training yields a mean flow with high likelihood of satisfying arbitrary membership-oracle constraints (without convexity, barriers, or reflections) rests entirely on numerical evidence from synthetic cases and one adversarial task. No derivation or analysis is provided to explain why stochastic perturbations bias the learned vector field toward the feasible region in a manner that survives averaging, particularly when the constraint set has small measure or complicated non-convex geometry. If the randomization only inflates variance without shifting mass onto the feasible set, the reported gains in constraint satisfaction could be artifacts of the chosen synthetic distributions rather than a general property.
- [Experiments] Experimental section (synthetic cases and adversarial example): The manuscript reports significant gains in constraint satisfaction while matching target distributions, but the provided description does not specify the number of independent runs, error bars, or exact controls for the randomization variance. This makes it difficult to assess whether the improvements are robust or sensitive to hyperparameter choices in the oracle setting.
minor comments (2)
- [Abstract] The abstract and introduction could more explicitly contrast the two-stage approach with the one-stage baseline in terms of computational cost and convergence behavior.
- [Method] Notation for the randomized flow and the mean flow could be clarified with an explicit equation relating the stochastic perturbations to the final averaged vector field.
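One way the relation requested in the last comment could be written is shown below. It uses the randomized velocity quoted from the paper elsewhere on this page and adds one assumption that the page does not confirm: the perturbation W has zero mean and is independent of X_t.

```latex
\tilde{U}_{\theta,\sigma_t}(X_t, t) = u_\theta(X_t, t) + \sigma_t W,
\qquad \mathbb{E}[W] = 0,
\\
\mathbb{E}_W\!\left[\tilde{U}_{\theta,\sigma_t}(X_t, t) \,\middle|\, X_t\right]
  = u_\theta(X_t, t) + \sigma_t\,\mathbb{E}[W]
  = u_\theta(X_t, t).
```

Under that assumption the averaged vector field is u_θ itself, so the perturbation would affect which u_θ is learned during training but not the field used at generation time.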
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, providing clarifications and indicating revisions where appropriate.
- Referee: [§3] §3 (Oracle-based randomization method): The central claim that randomization during training yields a mean flow with high likelihood of satisfying arbitrary membership-oracle constraints (without convexity, barriers, or reflections) rests entirely on numerical evidence from synthetic cases and one adversarial task. No derivation or analysis is provided to explain why stochastic perturbations bias the learned vector field toward the feasible region in a manner that survives averaging, particularly when the constraint set has small measure or complicated non-convex geometry. If the randomization only inflates variance without shifting mass onto the feasible set, the reported gains in constraint satisfaction could be artifacts of the chosen synthetic distributions rather than a general property.
  Authors: We acknowledge that the central claims for the oracle-based method are supported by empirical results rather than a formal derivation. Section 3 provides intuition that stochastic perturbations during training encourage the learned mean flow to favor feasible trajectories upon averaging. The numerical evidence spans multiple synthetic distributions with varying non-convex geometries as well as the practical adversarial task, and the method consistently outperforms baselines without randomization. We agree that a rigorous analysis of the bias mechanism for arbitrary sets would be valuable; the revised manuscript adds an expanded limitations paragraph in Section 3 and lists a theoretical characterization as future work. We do not believe the gains are artifacts, given the diversity of tested constraints, but we accept that stronger theory would increase confidence.
  Revision: partial
- Referee: [Experiments] Experimental section (synthetic cases and adversarial example): The manuscript reports significant gains in constraint satisfaction while matching target distributions, but the provided description does not specify the number of independent runs, error bars, or exact controls for the randomization variance. This makes it difficult to assess whether the improvements are robust or sensitive to hyperparameter choices in the oracle setting.
  Authors: We thank the referee for highlighting this omission. The revised experimental section now explicitly states that all quantitative results are averaged over 10 independent runs using different random seeds, with error bars showing one standard deviation. We have also added details on the randomization variance schedule employed in the oracle setting and included a brief sensitivity study for the key variance hyperparameter. These updates should facilitate assessment of robustness.
  Revision: yes
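The reporting protocol described in this response (10 seeds, mean with one standard deviation) can be sketched as follows. The experiment itself is a stand-in, not the paper's: `satisfaction_rate` simply measures how often standard normal draws fall in `[-1, 1]`, and both function names are hypothetical.

```python
import random
import statistics


def satisfaction_rate(seed, n_samples=2000):
    """Stand-in experiment: fraction of N(0, 1) draws that land in [-1, 1]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if abs(rng.gauss(0.0, 1.0)) <= 1.0)
    return hits / n_samples


def summarize(rates):
    """Mean and sample standard deviation across independent runs."""
    return statistics.mean(rates), statistics.stdev(rates)


rates = [satisfaction_rate(seed) for seed in range(10)]  # one run per seed
mean_rate, std_rate = summarize(rates)
print(f"{mean_rate:.3f} +/- {std_rate:.3f}")
```

The sketch uses the sample standard deviation (`statistics.stdev`). The response does not say whether the authors use the sample or population form, so that choice is an assumption.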
Circularity Check
No significant circularity in derivation chain
The paper introduces two constraint-aware adaptations to standard flow matching: an additive penalty term on distance to the constraint set when a differentiable function is available, and randomization during training to produce a mean flow for membership-oracle constraints. These mechanisms are defined directly from the problem setup and validated through numerical experiments on synthetic distributions plus one adversarial-example task. No equation reduces a claimed prediction to a fitted parameter by construction, no uniqueness theorem is imported from prior author work, and no ansatz is smuggled via self-citation. The central claims rest on empirical demonstration rather than tautological re-labeling of inputs, making the derivation self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard flow matching objective and probability flow assumptions hold.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Passage: we propose to employ randomization and learn a mean flow that is numerically shown to have a high likelihood of satisfying the constraints... $\tilde{U}_{\theta,\sigma_t} = u_\theta(X_t, t) + \sigma_t W$
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, LogicNat recovery (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Passage: two-stage approach... only the second stage probing the constraints via randomization
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021.
- [2] Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le. Flow matching for generative modeling. In The Eleventh International Conference on Learning Representations, 2023.
- [3] Michael S Albergo, Nicholas M Boffi, and Eric Vanden-Eijnden. Stochastic interpolants: A unifying framework for flows and diffusions. arXiv preprint arXiv:2303.08797, 2023.
- [4] Xingchao Liu, Chengyue Gong, et al. Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Representations, 2023.
- [5] Stefano Peluchetti. Diffusion bridge mixture transports, Schrödinger bridge problems and generative modeling. Journal of Machine Learning Research, 24(374):1–51, 2023.
- [6] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 6840–6851. Curran Associates, Inc., 2020.
- [7] Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou, and Molei Tao. Mirror diffusion models for constrained and watermarked generation. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 42898–42917. Curran Associates, Inc., 2023.
- [8] Berthy T. Feng, Ricardo Baptista, and Katherine L. Bouman. Neural approximate mirror maps for constrained diffusion models. In The Thirteenth International Conference on Learning Representations, 2025.
- [9] Aaron Lou and Stefano Ermon. Reflected diffusion models. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 22675–22701. PMLR, 23–29 Jul 2023.
- [10] Tianyu Xie, Yu Zhu, Longlin Yu, Tong Yang, Ziheng Cheng, Shiyue Zhang, Xiangyu Zhang, and Cheng Zhang. Reflected flow matching. In ICML, 2024.
[11]
Diffusion models for constrained domains.Transactions on Machine Learning Research, 2023
Nic Fishman, Leo Klarner, Valentin De Bortoli, Emile Mathieu, and Michael John Hutchinson. Diffusion models for constrained domains.Transactions on Machine Learning Research, 2023. Expert Certification
work page 2023
-
[12]
Constrained synthesis with projected diffusion models
Jacob Christopher, Stephen Baek, and Ferdinando Fioretto. Constrained synthesis with projected diffusion models. In Neural Information Processing Systems, 2024
work page 2024
-
[13]
Constrained diffusion models via dual training
Shervin Khalafi, Dongsheng Ding, and Alejandro Ribeiro. Constrained diffusion models via dual training. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Systems, volume 37, pages 26543–26576. Curran Associates, Inc., 2024
work page 2024
-
[14]
Building normalizing flows with stochastic interpolants
Michael Samuel Albergo and Eric Vanden-Eijnden. Building normalizing flows with stochastic interpolants. In The Eleventh International Conference on Learning Representations, 2023. 12
work page 2023
-
[15]
Improving and generalizing flow-based generative models with minibatch optimal transport
Alexander Tong, Kilian FATRAS, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, and Yoshua Bengio. Improving and generalizing flow-based generative models with minibatch optimal transport. Transactions on Machine Learning Research, 2024
work page 2024
-
[16]
Richard S. Sutton and Andrew G. Barto.Introduction to reinforcement learning. MIT Press, 2 edition,
-
[17]
Uri M. Ascher and Linda R. Petzold.Computer Methods for Ordinary Differential Equations and Differential- Algebraic Equations. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1998
work page 1998
-
[18]
Nonparametric score estimators
Yuhao Zhou, Jiaxin Shi, and Jun Zhu. Nonparametric score estimators. InInternational Conference on Machine Learning, pages 11513–11522. PMLR, 2020
work page 2020
-
[19]
George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, and Balaji Lakshmi- narayanan. Normalizing flows for probabilistic modeling and inference.Journal of Machine Learning Research, 22(57):1–64, 2021
work page 2021
-
[20]
Nicolas Bonneel, Julien Rabin, Gabriel Peyré, and Hanspeter Pfister. Sliced and radon wasserstein barycenters of measures.Journal of Mathematical Imaging and Vision, 51, 04 2014
work page 2014
-
[21]
Wasserstein barycenter and its application to texture mixing
Julien Rabin, Gabriel Peyré, Julie Delon, and Marc Bernot. Wasserstein barycenter and its application to texture mixing. In Alfred M. Bruckstein, Bart M. ter Haar Romeny, Alexander M. Bronstein, and Michael M. Bronstein, editors,Scale Space and Variational Methods in Computer Vision, pages 435–446, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg
work page 2012
-
[22]
Gans trained by a two time-scale update rule converge to a local nash equilibrium
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. InProceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, page 6629–6640, Red Hook, NY, USA, 2017. Curran Associates Inc
work page 2017
- [23]
- [24] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- [25] Minhao Cheng, Simranjit Singh, Patrick H. Chen, Pin-Yu Chen, Sijia Liu, and Cho-Jui Hsieh. Sign-opt: A query-efficient hard-label adversarial attack. In International Conference on Learning Representations, 2020.
- [26] Jeonghwan Park, Paul Miller, and Niall McLaughlin. Hard-label based small query black-box adversarial attack. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3986–3995, January 2024.
- [27] Wieland Brendel, Jonas Rauber, and Matthias Bethge. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018. Appendices A Proof of Proposition ...
[28]
In both algorithms, larger values of λ generally lead to higher constraint satisfaction rates
FM-DD and FM-RE cannot give guarantees on constraint violation. In both algorithms, larger values of λ generally lead to higher constraint satisfaction rates. However, explicit constraint satisfaction rate cannot be guaranteed, and excessively large values ofλ may adversely affect distributional match
-
[29]
The training of FM-DD and FM-RE requires sampling complete trajectories to obtain the terminal point. This results in higher computational cost compared to FM and other constrained generation methods, such as MDM and reflection-based approaches, as their loss functions are based on single-step evaluation rather than full trajectory computations. D Experim...
-
[30]
This might not be effective against FM-RE
A common strategy for identifying potential adversarial example queries is by checking repeated queries for similar images. This might not be effective against FM-RE. FM-RE’s training requires diverse queries, provided that the selectedx1 are sufficiently different
-
[31]
FM-RE requires no query access when generating images, leading to fast generation. Also, one can generate an infinite number of potential adversarial examples for a single image by repeatedly sampling x0. The first advantage of FM-RE provides additional insight into the protection of image classification models. Although a common and effective defense is ...