Input-to-State Stability of Gradient Flows in Distributional Space
Pith reviewed 2026-05-14 21:10 UTC · model grok-4.3
The pith
Gradient flows of l-smooth and lambda-convex functionals on probability measures satisfy distributional input-to-state stability under the Wasserstein metric.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We establish dISS for gradient flows defined by a class of l-smooth and lambda-convex functionals subject to bounded disturbances, such as those induced by entropy in optimal transport. The dISS notion relies on the Wasserstein metric and unifies ISS and NSS over compact domains for particle dynamics while extending the classical notions to sets of probability distributions. The same stability property holds for the large-scale algorithms obtained via kernel and sample-based approximations, which produces a characterization of the approximation error.
What carries the argument
The distributional Input-to-State Stability (dISS) property for Wasserstein gradient flows of l-smooth and lambda-convex functionals on the space of probability measures.
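For orientation, an ISS-type estimate in Wasserstein space might take the shape sketched below. This is modeled on the classical ISS definition and is not the paper's stated definition. The symbols $\beta$, $\gamma$, $\mu^\ast$, and $d$ are assumptions introduced here. The second part works out the Euclidean analogue for a $\lambda$-strongly convex potential, which shows how convexity turns a bounded disturbance into a bounded offset.

```latex
% Hypothetical form of a dISS estimate, modeled on classical ISS.
% \beta is a class-KL function, \gamma a class-K gain, \mu^\ast an equilibrium,
% and d the disturbance; none of these symbols is taken from the paper.
W_2(\mu_t, \mu^\ast) \;\le\; \beta\big(W_2(\mu_0, \mu^\ast),\, t\big)
  \;+\; \gamma\Big(\sup_{0 \le s \le t} \|d_s\|\Big), \qquad t \ge 0.

% Euclidean analogue: \dot{x} = -\nabla V(x) + d(t), with V \lambda-strongly convex
% and unconstrained minimizer x^\ast. Strong convexity gives
% \langle \nabla V(x), x - x^\ast \rangle \ge \lambda \|x - x^\ast\|^2, hence
\frac{d}{dt}\,\tfrac12 \|x - x^\ast\|^2
  = -\langle \nabla V(x),\, x - x^\ast\rangle + \langle d,\, x - x^\ast\rangle
  \le -\lambda \|x - x^\ast\|^2 + \|d\|\,\|x - x^\ast\|,
% and a comparison (Gronwall) argument yields
\|x(t) - x^\ast\| \;\le\; e^{-\lambda t}\,\|x(0) - x^\ast\|
  + \frac{1}{\lambda}\,\sup_{0 \le s \le t}\|d(s)\|.
```

In this toy setting the gain is linear, $\gamma(r) = r/\lambda$. The gain in the paper may carry additional constants depending on how the disturbance enters the flow.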
If this is right
- dISS unifies ISS and NSS over compact domains for particle dynamics.
- The stability notion extends classical robustness guarantees from individual trajectories to entire sets of probability distributions.
- Wasserstein gradient flows of l-smooth and lambda-convex functionals remain robust to bounded disturbances such as entropy perturbations.
- Kernel and sample-based approximations of the flows inherit dISS with an explicit error bound that depends on the number of agents.
- The error characterization guides selection of swarm size to meet a mean-field objective with prescribed accuracy and stability.
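The robustness claim in the list above can be illustrated with a toy particle system. The sketch below is a minimal example under stated assumptions, not the paper's algorithm: the potential V(x) = (lam/2) x^2, the explicit Euler discretization, the uniform random disturbance, and the names `rms`, `simulate`, and `iss_bound` are all hypothetical choices made here. The equilibrium is the Dirac mass at 0, so the Wasserstein-2 distance from the empirical measure to it is the root-mean-square of the particle positions.

```python
import math
import random


def rms(xs):
    """W_2 distance between the empirical measure of xs and the Dirac mass at 0."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))


def simulate(n_particles=200, lam=2.0, step=0.01, n_steps=2000, d_max=0.5, seed=0):
    """Explicit-Euler particles for dx/dt = -lam * x + d(t) with |d(t)| <= d_max.

    Hypothetical setup: V(x) = (lam / 2) * x**2 is lam-convex and lam-smooth.
    Requires 0 < step * lam < 1 so that each step is a contraction.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n_particles)]
    history = [rms(xs)]
    for _ in range(n_steps):
        # Gradient step plus an independent bounded disturbance per particle.
        xs = [x - step * lam * x + step * rng.uniform(-d_max, d_max) for x in xs]
        history.append(rms(xs))
    return history


def iss_bound(w0, k, lam, step, d_max):
    """Discrete-time ISS-type bound: geometric decay of w0 plus the offset d_max / lam."""
    return (1.0 - step * lam) ** k * w0 + d_max / lam
```

Comparing `simulate()` against `iss_bound` step by step should show the distance to equilibrium staying below a decaying transient plus the constant offset `d_max / lam`, which is the qualitative shape of an input-to-state stability estimate.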
Where Pith is reading between the lines
- The dISS bounds could be used to certify stability margins when designing distributed control inputs for robotic swarms whose collective behavior is modeled by a mean-field limit.
- The same proof structure might extend to other optimal-transport-type metrics if the functional satisfies analogous smoothness and convexity conditions.
- Finite-particle error bounds derived from dISS could be combined with concentration inequalities to obtain high-probability guarantees on the deviation between empirical and mean-field trajectories.
Load-bearing premise
The driving functional must be both l-smooth and lambda-convex on the space of probability measures equipped with the Wasserstein metric over a compact domain.
What would settle it
A concrete trajectory of a Wasserstein gradient flow for which the distance to the unperturbed equilibrium grows unbounded under a bounded disturbance when the driving functional violates lambda-convexity.
Original abstract
This paper proposes a new notion of distributional Input-to-State Stability (dISS) for dynamic systems evolving in probability spaces over a domain. Unlike other norm-based ISS concepts, we rely on the Wasserstein metric, which captures more precisely the effects of the disturbances on atomic and non-atomic measures. We show how dISS unifies both ISS and Noise-to-State Stability (NSS) over compact domains for particle dynamics, while extending the classical notions to sets of probability distributions. We then apply the dISS framework to study the robustness of various Wasserstein gradient flows with respect to perturbations. In particular, we establish dISS for gradient flows defined by a class of $l$-smooth and $\lambda$-convex functionals subject to bounded disturbances, such as those induced by entropy in optimal transport. Further, we study the dISS robustness of the large-scale algorithms when using kernel and sample-based approximations. This results in a characterization of the error incurred when using a finite number of agents, which can guide the selection of the swarm size to achieve a mean-field objective with prescribed accuracy and stability guarantees.
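To make the link between swarm size and approximation error concrete, the sketch below computes the exact Wasserstein-2 distance between N agents on a line and the uniform distribution on [0, 1]. It is a one-dimensional toy under stated assumptions, not the paper's error bound: the target distribution, the midpoint placement, and the names `w2_to_uniform`, `midpoint_agents`, and `agents_needed` are hypothetical, and the paper's rate may depend on dimension and on the kernel used.

```python
import math


def w2_to_uniform(points):
    """Exact W_2 between the empirical measure of `points` and Uniform[0, 1].

    In one dimension the optimal coupling is monotone, so the i-th sorted
    point receives the mass of the interval [i / N, (i + 1) / N].
    """
    xs = sorted(points)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        a, b = i / n, (i + 1) / n
        # Closed form of the integral of (x - u)^2 over u in [a, b].
        total += ((x - a) ** 3 - (x - b) ** 3) / 3.0
    return math.sqrt(total)


def midpoint_agents(n):
    """Place n agents at the midpoints of n equal cells of [0, 1]."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def agents_needed(tolerance):
    """Smallest n with W_2 <= tolerance, using W_2 = 1 / (n * sqrt(12)) for midpoints."""
    return math.ceil(1.0 / (tolerance * math.sqrt(12.0)))
```

For midpoint placement the error decays like 1/N, so halving the tolerance roughly doubles the required number of agents in this toy setting.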
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a new notion of distributional Input-to-State Stability (dISS) for dynamical systems evolving in the space of probability measures, using the Wasserstein metric to capture disturbance effects on both atomic and non-atomic measures. It establishes dISS for Wasserstein gradient flows of l-smooth and λ-convex functionals under bounded disturbances (with entropy in optimal transport as a highlighted example), unifies classical ISS and Noise-to-State Stability (NSS) for particle systems on compact domains, and derives error characterizations for kernel and finite-sample approximations to guide swarm-size selection for mean-field objectives.
Significance. If the central claims hold, the work supplies a unified stability framework for mean-field and optimal-transport dynamics that directly informs practical algorithm design, including quantitative guidance on approximation errors. The unification of ISS/NSS and the explicit error bounds for large-scale implementations are concrete strengths that could influence robustness analysis in swarm robotics and sampling-based methods.
major comments (2)
- [Abstract / gradient-flow dISS theorem] Abstract and the section establishing dISS for gradient flows: the claim that dISS holds for entropy-induced flows requires the relative entropy to be l-smooth (i.e., its Wasserstein gradient to be Lipschitz) on the relevant domain. This Lipschitz bound is used to absorb the disturbance into the ISS gain; without it the comparison argument does not close. Standard references show that ∇_W J(μ) is typically unbounded for measures with varying densities, and compactness of Ω alone does not restore global l-smoothness. Please identify the precise theorem or proposition that verifies l-smoothness for this example (or the additional density bounds needed) and state the resulting ISS gain explicitly.
- [Unification section / particle-dynamics corollary] The unification statement for particle ISS/NSS: the reduction to classical notions on compact domains is asserted, but the proof sketch must show how the Wasserstein dISS estimate specializes to the Euclidean ISS/NSS estimates when measures are empirical. If the l-smoothness hypothesis is relaxed for particles, the constants in the reduction must be tracked; otherwise the unification claim is only formal.
minor comments (2)
- The definition of dISS should be stated as an explicit inequality (with the precise form of the gain function and the Wasserstein distance) before any theorems are proved.
- Notation for the disturbance class (bounded in which norm?) and the precise meaning of “λ-convex” in the Wasserstein space should be recalled at the start of the main results section to avoid ambiguity.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments highlight important points on the assumptions underlying the dISS results and the details of the unification with classical ISS/NSS. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and expansions.
-
Referee: [Abstract / gradient-flow dISS theorem] Abstract and the section establishing dISS for gradient flows: the claim that dISS holds for entropy-induced flows requires the relative entropy to be l-smooth (i.e., its Wasserstein gradient to be Lipschitz) on the relevant domain. This Lipschitz bound is used to absorb the disturbance into the ISS gain; without it the comparison argument does not close. Standard references show that ∇_W J(μ) is typically unbounded for measures with varying densities, and compactness of Ω alone does not restore global l-smoothness. Please identify the precise theorem or proposition that verifies l-smoothness for this example (or the additional density bounds needed) and state the resulting ISS gain explicitly.
Authors: We agree that global l-smoothness of the relative entropy is not automatic and requires additional conditions. The manuscript states the main dISS theorem under the standing hypothesis that the driving functional is l-smooth and λ-convex; the entropy is listed as a motivating example that satisfies these hypotheses on suitable restricted classes of measures. In the revision we will add an explicit remark (and, if space permits, a short proposition) stating that, on a compact domain, relative entropy is l-smooth when restricted to the set of measures whose densities lie between positive constants m and M. Under this restriction the Wasserstein gradient is Lipschitz with constant depending on l, λ, m and M. The resulting ISS gain will be stated explicitly as γ(r) = (l/λ)r, obtained directly from the comparison lemma used in the proof of the main theorem. revision: yes
-
Referee: [Unification section / particle-dynamics corollary] The unification statement for particle ISS/NSS: the reduction to classical notions on compact domains is asserted, but the proof sketch must show how the Wasserstein dISS estimate specializes to the Euclidean ISS/NSS estimates when measures are empirical. If the l-smoothness hypothesis is relaxed for particles, the constants in the reduction must be tracked; otherwise the unification claim is only formal.
Authors: We accept that the current sketch is too brief. In the revised manuscript we will expand the unification section with a detailed specialization argument. For empirical measures μ = (1/N)∑δ_{x_i} and ν = (1/N)∑δ_{y_i}, the squared Wasserstein distance is W_2²(μ,ν) = min_σ (1/N)∑_i ‖x_i − y_{σ(i)}‖², where σ ranges over permutations of the particle labels. It is bounded above by (1/N)‖x−y‖², where x,y ∈ R^{dN} are the stacked particle vectors, and equals it when the given labeling is an optimal matching. Consequently, under such a matching, the dISS inequality in Wasserstein space directly yields the classical Euclidean ISS/NSS estimate with gain scaled by 1/√N. Because the particle system evolves on a compact domain, the l-smoothness assumption can be relaxed to local Lipschitz continuity; we will explicitly track the dependence of the ISS gain on the local Lipschitz constant and on N, thereby making the reduction fully rigorous rather than formal. revision: yes
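The relation between the Wasserstein distance of two empirical measures and the Euclidean distance of their stacked particle vectors can be checked numerically in one dimension. The sketch below is illustrative only; the function names `labeled_distance` and `w2_empirical_1d` are hypothetical. It relies on the standard fact that, for equally weighted points on a line, the optimal assignment pairs the sorted points.

```python
import math


def labeled_distance(xs, ys):
    """sqrt((1/N) * sum_i (x_i - y_i)^2): stacked Euclidean distance scaled by 1/sqrt(N)."""
    n = len(xs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / n)


def w2_empirical_1d(xs, ys):
    """W_2 between two equally weighted empirical measures on the real line."""
    # In 1D the optimal matching pairs the i-th smallest x with the i-th smallest y.
    return labeled_distance(sorted(xs), sorted(ys))


# Same point set with swapped labels: the measures coincide, the labeled distance does not vanish.
swapped = (w2_empirical_1d([0.0, 1.0], [1.0, 0.0]), labeled_distance([0.0, 1.0], [1.0, 0.0]))

# Labels already matched optimally: both quantities agree.
matched = (w2_empirical_1d([0.0, 1.0], [0.5, 1.5]), labeled_distance([0.0, 1.0], [0.5, 1.5]))
```

Here `swapped` evaluates to `(0.0, 1.0)` and `matched` to `(0.5, 0.5)`, so the Wasserstein distance never exceeds the labeled distance and the two agree only when the labeling is an optimal matching.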
Circularity Check
No significant circularity in dISS derivation for Wasserstein gradient flows
full rationale
The paper introduces the dISS notion independently via the Wasserstein metric and derives stability results directly from the standard l-smoothness and λ-convexity assumptions on the driving functional. These assumptions are external inputs to the proofs rather than quantities fitted or defined inside the paper. The unification of particle ISS/NSS with distributional stability follows from the metric properties and compactness without reducing to self-referential constructions or self-citation chains. No load-bearing step equates a prediction to its own fitted input or renames a known result as a new derivation. The entropy example is presented as an application under the stated hypotheses, not as a self-defining case.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: The domain is compact and the Wasserstein metric induces a complete separable metric space on probability measures.
- domain assumption: The driving functional is l-smooth and λ-convex with respect to the Wasserstein metric.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: establish dISS for gradient flows defined by a class of l-smooth and λ-convex functionals subject to bounded disturbances, such as those induced by entropy in optimal transport
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Wasserstein gradient flows ... λ-convex ... quadratic growth and gradient dominance
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.