arxiv: 2603.28910 · v2 · submitted 2026-03-30 · 📡 eess.SY · cs.SY

Input-to-State Stability of Gradient Flows in Distributional Space

Pith reviewed 2026-05-14 21:10 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords distributional input-to-state stability · Wasserstein gradient flows · probability measures · optimal transport · robustness analysis · mean-field approximation · smooth convex functionals · particle swarm stability

The pith

Gradient flows of l-smooth and lambda-convex functionals on probability measures satisfy distributional input-to-state stability under the Wasserstein metric.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines distributional Input-to-State Stability (dISS) for systems whose states are probability measures, using the Wasserstein metric to track how disturbances shift both atomic and non-atomic distributions. It proves that Wasserstein gradient flows driven by l-smooth and lambda-convex functionals remain dISS-stable when inputs are bounded, such as entropy perturbations common in optimal transport. This unifies classical input-to-state and noise-to-state stability for individual particles while extending the guarantees to entire sets of distributions on compact domains. The framework also quantifies how kernel and finite-sample approximations affect stability, yielding explicit error bounds that relate swarm size to mean-field accuracy.

Core claim

We establish dISS for gradient flows defined by a class of l-smooth and lambda-convex functionals subject to bounded disturbances, such as those induced by entropy in optimal transport. The dISS notion relies on the Wasserstein metric and unifies ISS and NSS over compact domains for particle dynamics while extending the classical notions to sets of probability distributions. The same stability property holds for the large-scale algorithms obtained via kernel and sample-based approximations, which produces a characterization of the approximation error.

What carries the argument

The distributional Input-to-State Stability (dISS) property for Wasserstein gradient flows of l-smooth and lambda-convex functionals on the space of probability measures.
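
This page does not reproduce the paper's definition of dISS in full, so the following is only a sketch of the generic shape an ISS-type estimate takes when the state distance is measured in W_2. The comparison functions β (class KL) and γ (class K), the disturbance signal u, and the use of a sup norm are assumptions borrowed from classical ISS, not the paper's exact statement.

```latex
% Generic ISS-type estimate in Wasserstein distance (illustrative form only).
% \mathcal{P}^* : set of stationary points of the unperturbed flow
% \beta \in \mathcal{KL}, \ \gamma \in \mathcal{K} : assumed comparison functions
W_2(\rho_t, \mathcal{P}^*)
  \;\le\; \beta\bigl(W_2(\rho_0, \mathcal{P}^*),\, t\bigr)
  \;+\; \gamma\Bigl(\sup_{0 \le s \le t} \|u(s)\|\Bigr),
\qquad t \ge 0,
% with the distance to a set taken as
\qquad W_2(\rho, \mathcal{P}^*) = \inf_{\rho^* \in \mathcal{P}^*} W_2(\rho, \rho^*).
```

Read this way, the first term describes how the influence of the initial distribution fades, and the second caps the residual offset caused by the disturbance.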

If this is right

  • dISS unifies ISS and NSS over compact domains for particle dynamics.
  • The stability notion extends classical robustness guarantees from individual trajectories to entire sets of probability distributions.
  • Wasserstein gradient flows of l-smooth and lambda-convex functionals remain robust to bounded disturbances such as entropy perturbations.
  • Kernel and sample-based approximations of the flows inherit dISS with an explicit error bound that depends on the number of agents.
  • The error characterization guides selection of swarm size to meet a mean-field objective with prescribed accuracy and stability.
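
To make the link between swarm size and accuracy concrete, here is a minimal 1D sketch. It is not the paper's error bound: it assumes a uniform target on [0, 1], evenly spaced agents, and a fine grid standing in for the continuous target (all hypothetical choices). In this toy case the W_2 error decays like 1/(N√12), and the helper `agents_needed` simply inverts that rate.

```python
import math

def w2_sorted(xs, ys):
    """W2 between two 1D empirical measures with equally many, equally weighted atoms."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))

def w2_agents_to_uniform(n_agents, refine=100):
    """Approximate W2 between n_agents evenly spaced agents and Uniform[0, 1].

    Assumption: the uniform target is replaced by a fine grid of
    n_agents * refine atoms so both measures have the same number of atoms.
    """
    m = n_agents * refine
    target = [(j + 0.5) / m for j in range(m)]
    # Each agent carries mass 1/n_agents, i.e. `refine` fine-grid atoms.
    agents = [(i + 0.5) / n_agents for i in range(n_agents) for _ in range(refine)]
    return w2_sorted(agents, target)

def agents_needed(tolerance):
    """Smallest N with 1 / (N * sqrt(12)) <= tolerance (1D uniform toy case only)."""
    return math.ceil(1.0 / (tolerance * math.sqrt(12.0)))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Columns: agent count, measured W2, predicted 1D rate.
        print(n, w2_agents_to_uniform(n), 1.0 / (n * math.sqrt(12.0)))
    print("agents for W2 <= 0.01:", agents_needed(0.01))
```

In higher dimensions the decay with N is slower, so the inversion in `agents_needed` should be read as an illustration of the workflow rather than a usable design rule.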

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The dISS bounds could be used to certify stability margins when designing distributed control inputs for robotic swarms whose collective behavior is modeled by a mean-field limit.
  • The same proof structure might extend to other optimal-transport-type metrics if the functional satisfies analogous smoothness and convexity conditions.
  • Finite-particle error bounds derived from dISS could be combined with concentration inequalities to obtain high-probability guarantees on the deviation between empirical and mean-field trajectories.

Load-bearing premise

The driving functional must be both l-smooth and lambda-convex on the space of probability measures equipped with the Wasserstein metric over a compact domain.
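
A short comparison argument shows why a convexity constant λ > 0 is what turns a bounded disturbance into a bounded offset. This is a generic Grönwall-type sketch, not the paper's proof: the scalar w(t), the disturbance bound ū, and the constant c (which one would expect to be tied to the smoothness constant l) are assumed placeholders.

```latex
% Assumed differential inequality for w(t) = W_2(\rho_t, \rho^*), with \lambda > 0:
\frac{d}{dt} w(t) \;\le\; -\lambda\, w(t) + c\,\bar u,
\qquad \bar u = \sup_{s \ge 0} \|u(s)\| .
% Multiply by e^{\lambda t} and integrate from 0 to t:
w(t) \;\le\; e^{-\lambda t}\, w(0) + \frac{c\,\bar u}{\lambda}\bigl(1 - e^{-\lambda t}\bigr)
      \;\le\; e^{-\lambda t}\, w(0) + \frac{c}{\lambda}\,\bar u .
```

If λ ≤ 0, the same computation only gives a bound that grows with t, which is why the premise is load-bearing. Under these assumptions, taking c = l would match the linear gain (l/λ)r quoted in the simulated rebuttal.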

What would settle it

A concrete trajectory of a Wasserstein gradient flow for which the distance to the unperturbed equilibrium grows unbounded under a bounded disturbance when the driving functional violates lambda-convexity.
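
A single-particle toy model gives a feel for what such a test looks like. The sketch below is a simplification under stated assumptions: one particle on the real line (no compact-domain constraint), the quadratic potential V(x) = λx²/2, a hypothetical cosine disturbance, and forward-Euler integration. With λ > 0 the distance to the minimizer stays within e^{-λt}|x0| + ū/λ; with λ < 0 it grows without bound.

```python
import math

def simulate(lam, x0, u_bar, horizon=10.0, step=1e-3):
    """Forward-Euler for x' = -lam * x + u(t), with |u(t)| <= u_bar.

    V(x) = lam * x**2 / 2 is lam-convex; for lam > 0 its minimizer is x = 0.
    Returns the list of distances |x(t_k)| to that point, t_k = k * step.
    """
    x, dists = x0, [abs(x0)]
    for k in range(int(round(horizon / step))):
        u = u_bar * math.cos(0.7 * k * step)  # hypothetical bounded disturbance
        x = x + step * (-lam * x + u)
        dists.append(abs(x))
    return dists

if __name__ == "__main__":
    stable = simulate(lam=1.0, x0=2.0, u_bar=0.5)
    unstable = simulate(lam=-1.0, x0=0.1, u_bar=0.5)
    print("lam = +1, final distance:", stable[-1])
    print("lam = -1, final distance:", unstable[-1])
```

The distributional analogue would replace |x| with a W_2 distance between measures; the scalar case only shows the mechanism.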

Figures

Figures reproduced from arXiv: 2603.28910 by Guillem Pascual, Sonia Martínez.

Figure 1. The swarm relaxes from an initial skewed, high-concentration state $\rho_{t_1}$ (left), through a transitional state $\rho_{t_2}$ (middle), moving towards the target distribution $\rho^*$ (right). In this case $W_2(\rho_{t_2}, \rho^*) < W_2(\rho_{t_1}, \rho^*)$; however, $\|\rho_{t_1} - \rho^*\|_{L^2} = \|\rho_{t_2} - \rho^*\|_{L^2}$, due to non-overlapping supports.
Figure 2. Two step functions ($\rho_1$ and $\rho_2$) compared against a constant target distribution ($\rho^*$). The crosses on the horizontal axis represent finite agent samples drawn from each step distribution, illustrating the severe spatial displacement that occurs despite both being supported in $M$ and having constant $L^2$ error.
Figure 3. Comparison of Euclidean ($L^2$) and Wasserstein ($W_2$) interpolations of two overlapping Gaussian distributions ($\rho_0$ and $\rho_1$). The $L^2$ interpolation ($\rho_{L^2}$) linearly joins the densities. In contrast, the $W_2$ interpolation ($\rho_{W_2}$) horizontally transports the mass, preserving the Gaussian structure.
Figure 4. (a), (b) Final agent positions for the regularized KDE flow with $N = 1000$ agents and $u = 1 \cdot 10^{-3}$, $u = 0.025$, respectively. (c), (d) Final $W_2(\rho^{h,N}, \rho^*)$ distance with varying regularizing parameter $u$ and varying number of agents $N$. (e), (f) Evolution of the $W_2(\rho^{h,N}, \rho^*)$ distance for three different signals $u(t)$.
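
The point made in the Figure 1 caption, that the L2 norm cannot rank states with non-overlapping supports while W_2 can, is easy to reproduce numerically. The sketch below is a toy 1D stand-in, not the paper's experiment: the three atom sets, the unit-width histogram bins, and the helper names are all hypothetical.

```python
import math

def w2_sorted(xs, ys):
    """W2 between two 1D empirical measures with equally many, equally weighted atoms."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))

def l2_hist(atoms_a, atoms_b, n_bins=10, width=1.0):
    """L2 distance between the piecewise-constant densities built from two atom sets."""
    def density(atoms):
        h = [0.0] * n_bins
        for a in atoms:
            h[int(a // width)] += 1.0 / (len(atoms) * width)
        return h
    da, db = density(atoms_a), density(atoms_b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(da, db)) * width)

# Three blocks of mass with pairwise disjoint supports on [0, 10).
rho_1 = [0.5, 1.5]    # far from the target
rho_2 = [4.5, 5.5]    # halfway to the target
target = [8.5, 9.5]

if __name__ == "__main__":
    print("L2:", l2_hist(rho_1, target), l2_hist(rho_2, target))
    print("W2:", w2_sorted(rho_1, target), w2_sorted(rho_2, target))
```

Both L2 distances coincide because the supports never overlap, whereas the W_2 distance shrinks as the mass moves toward the target.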
read the original abstract

This paper proposes a new notion of distributional Input-to-State Stability (dISS) for dynamic systems evolving in probability spaces over a domain. Unlike other norm-based ISS concepts, we rely on the Wasserstein metric, which captures more precisely the effects of the disturbances on atomic and non-atomic measures. We show how dISS unifies both ISS and Noise to State Stability (NSS) over compact domains for particle dynamics, while extending the classical notions to sets of probability distributions. We then apply the dISS framework to study the robustness of various Wasserstein gradient flows with respect to perturbations. In particular, we establish dISS for gradient flows defined by a class of $l$-smooth and $\lambda$-convex functionals subject to bounded disturbances, such as those induced by entropy in optimal transport. Further, we study the dISS robustness of the large-scale algorithms when using kernel and sample-based approximations. This results in a characterization of the error incurred when using a finite number of agents, which can guide the selection of the swarm size to achieve a mean-field objective with prescribed accuracy and stability guarantees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a new notion of distributional Input-to-State Stability (dISS) for dynamical systems evolving in the space of probability measures, using the Wasserstein metric to capture disturbance effects on both atomic and non-atomic measures. It establishes dISS for Wasserstein gradient flows of l-smooth and λ-convex functionals under bounded disturbances (with entropy in optimal transport as a highlighted example), unifies classical ISS and Noise-to-State Stability (NSS) for particle systems on compact domains, and derives error characterizations for kernel and finite-sample approximations to guide swarm-size selection for mean-field objectives.

Significance. If the central claims hold, the work supplies a unified stability framework for mean-field and optimal-transport dynamics that directly informs practical algorithm design, including quantitative guidance on approximation errors. The unification of ISS/NSS and the explicit error bounds for large-scale implementations are concrete strengths that could influence robustness analysis in swarm robotics and sampling-based methods.

major comments (2)
  1. [Abstract / gradient-flow dISS theorem] Abstract and the section establishing dISS for gradient flows: the claim that dISS holds for entropy-induced flows requires the relative entropy to be l-smooth (i.e., its Wasserstein gradient to be Lipschitz) on the relevant domain. This Lipschitz bound is used to absorb the disturbance into the ISS gain; without it the comparison argument does not close. Standard references show that ∇_W J(μ) is typically unbounded for measures with varying densities, and compactness of Ω alone does not restore global l-smoothness. Please identify the precise theorem or proposition that verifies l-smoothness for this example (or the additional density bounds needed) and state the resulting ISS gain explicitly.
  2. [Unification section / particle-dynamics corollary] The unification statement for particle ISS/NSS: the reduction to classical notions on compact domains is asserted, but the proof sketch must show how the Wasserstein dISS estimate specializes to the Euclidean ISS/NSS estimates when measures are empirical. If the l-smoothness hypothesis is relaxed for particles, the constants in the reduction must be tracked; otherwise the unification claim is only formal.
minor comments (2)
  1. The definition of dISS should be stated as an explicit inequality (with the precise form of the gain function and the Wasserstein distance) before any theorems are proved.
  2. Notation for the disturbance class (bounded in which norm?) and the precise meaning of “λ-convex” in the Wasserstein space should be recalled at the start of the main results section to avoid ambiguity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments highlight important points on the assumptions underlying the dISS results and the details of the unification with classical ISS/NSS. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and expansions.

read point-by-point responses
  1. Referee: [Abstract / gradient-flow dISS theorem] Abstract and the section establishing dISS for gradient flows: the claim that dISS holds for entropy-induced flows requires the relative entropy to be l-smooth (i.e., its Wasserstein gradient to be Lipschitz) on the relevant domain. This Lipschitz bound is used to absorb the disturbance into the ISS gain; without it the comparison argument does not close. Standard references show that ∇_W J(μ) is typically unbounded for measures with varying densities, and compactness of Ω alone does not restore global l-smoothness. Please identify the precise theorem or proposition that verifies l-smoothness for this example (or the additional density bounds needed) and state the resulting ISS gain explicitly.

    Authors: We agree that global l-smoothness of the relative entropy is not automatic and requires additional conditions. The manuscript states the main dISS theorem under the standing hypothesis that the driving functional is l-smooth and λ-convex; the entropy is listed as a motivating example that satisfies these hypotheses on suitable restricted classes of measures. In the revision we will add an explicit remark (and, if space permits, a short proposition) stating that, on a compact domain, relative entropy is l-smooth when restricted to the set of measures whose densities lie between positive constants m and M. Under this restriction the Wasserstein gradient is Lipschitz with constant depending on l, λ, m and M. The resulting ISS gain will be stated explicitly as γ(r) = (l/λ)r, obtained directly from the comparison lemma used in the proof of the main theorem. revision: yes

  2. Referee: [Unification section / particle-dynamics corollary] The unification statement for particle ISS/NSS: the reduction to classical notions on compact domains is asserted, but the proof sketch must show how the Wasserstein dISS estimate specializes to the Euclidean ISS/NSS estimates when measures are empirical. If the l-smoothness hypothesis is relaxed for particles, the constants in the reduction must be tracked; otherwise the unification claim is only formal.

    Authors: We accept that the current sketch is too brief. In the revised manuscript we will expand the unification section with a detailed specialization argument: for empirical measures μ = (1/N)∑δ_{x_i} and ν = (1/N)∑δ_{y_i}, the squared Wasserstein distance W_2²(μ,ν) equals (1/N)‖x−y‖² once the particles are optimally paired, and is bounded above by it for any other labeling, where x,y ∈ R^{dN} are the stacked particle vectors. Consequently the dISS inequality in Wasserstein space directly yields the classical Euclidean ISS/NSS estimate with gain scaled by 1/√N. Because the particle system evolves on a compact domain, the l-smoothness assumption can be relaxed to local Lipschitz continuity; we will explicitly track the dependence of the ISS gain on the local Lipschitz constant and on N, thereby making the reduction fully rigorous rather than formal. revision: yes
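
One caveat when specializing to empirical measures is that the labeled quantity (1/N)‖x−y‖² coincides with W_2² only for the optimal pairing of particles; for any other labeling it is an upper bound. The brute-force sketch below illustrates this on three hypothetical planar particles. It enumerates all permutations, so it is meant for tiny N only, and the function names are illustrative.

```python
import math
from itertools import permutations

def labeled_cost(xs, ys):
    """(1/N) * ||x - y||^2 for stacked particle vectors with a fixed labeling."""
    return sum(math.dist(x, y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def w2_squared(xs, ys):
    """Squared W2 between two N-atom empirical measures via the best relabeling."""
    n = len(ys)
    return min(labeled_cost(xs, [ys[i] for i in p]) for p in permutations(range(n)))

# Hypothetical particle clouds: y is x relabeled and nudged by 0.1.
x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = [(1.0, 0.1), (0.0, 1.1), (0.1, 0.0)]

if __name__ == "__main__":
    print("labeled (1/N)||x - y||^2:", labeled_cost(x, y))
    print("W2^2 (optimal pairing):  ", w2_squared(x, y))
```

For particle systems whose labels are fixed by the dynamics, the labeled cost is the natural Euclidean quantity, and the inequality W_2² ≤ (1/N)‖x−y‖² is the direction needed to pass from a Euclidean estimate to a Wasserstein one.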

Circularity Check

0 steps flagged

No significant circularity in dISS derivation for Wasserstein gradient flows

full rationale

The paper introduces the dISS notion independently via the Wasserstein metric and derives stability results directly from the standard l-smoothness and λ-convexity assumptions on the driving functional. These assumptions are external inputs to the proofs rather than quantities fitted or defined inside the paper. The unification of particle ISS/NSS with distributional stability follows from the metric properties and compactness without reducing to self-referential constructions or self-citation chains. No load-bearing step equates a prediction to its own fitted input or renames a known result as a new derivation. The entropy example is presented as an application under the stated hypotheses, not as a self-defining case.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on the standard properties of the Wasserstein metric on compact domains and on the l-smoothness plus λ-convexity of the driving functional; no free parameters are introduced and no new entities are postulated beyond the definition of dISS itself.

axioms (2)
  • standard math: The domain is compact and the Wasserstein metric induces a complete separable metric space on probability measures.
    Invoked to ensure well-posedness of the gradient flows and to apply standard stability arguments from optimal transport.
  • domain assumption: The driving functional is l-smooth and λ-convex with respect to the Wasserstein metric.
    This is the key hypothesis under which dISS is proved; it is stated explicitly in the abstract as the class of functionals considered.



