pith. sign in

arxiv: 2604.13377 · v1 · submitted 2026-04-15 · 📡 eess.SY · cs.SY

On the Optimality of Uncertain MDP Abstractions

Pith reviewed 2026-05-10 13:45 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords uncertain MDPabstraction refinementvanishing ambiguityasymptotic optimalityset-valued abstractionsnonlinear stochastic systemstemporal logic specificationsrobust dynamic programming
0
0 comments X

The pith

A vanishing ambiguity condition on uncertain MDP abstractions guarantees asymptotic optimality and algorithm completeness for nonlinear stochastic systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether iterative refinement of uncertain MDP abstractions for nonlinear stochastic systems with general vector fields and temporal logic specifications will converge to optimal controllers with zero error. It proposes an algorithm that constructs a UMDP abstraction, synthesizes controllers using robust dynamic programming to provide performance guarantees, and refines the abstraction until a near-optimality criterion holds. The central result identifies vanishing ambiguity as a sufficient condition for the abstraction process to become asymptotically optimal and for the algorithm to produce near-optimal controllers in finite time. Set-valued MDP abstractions satisfy this condition, ensuring the error bounds go to zero with refinement, whereas interval MDP abstractions do not. This matters for control synthesis because it clarifies when abstraction-based methods can reliably approach true optimality without exhaustive computation.

Core claim

The paper shows that a sufficient condition denoted vanishing ambiguity guarantees asymptotic optimality of the UMDP abstraction-refinement process and completeness of the algorithm. Under this condition, iterative refinement produces controllers whose performance approaches the optimal value with error bounds tending to zero. Set-valued MDP abstractions of nonlinear stochastic systems with general vector fields meet the vanishing ambiguity criterion, while interval MDP abstractions do not, even with arbitrary refinement.

What carries the argument

The vanishing ambiguity condition, which requires that uncertainty sets in the MDP abstraction shrink to the true system behavior under refinement, enabling robust dynamic programming to yield controllers with tightening performance guarantees.

If this is right

  • Iterative refinement of set-valued MDP abstractions will produce controllers with arbitrarily small suboptimality for temporal logic tasks.
  • The synthesis algorithm terminates after finitely many steps with a controller whose performance is within any prescribed tolerance of optimal.
  • Interval MDP abstractions cannot be guaranteed to achieve optimality even after infinite refinement.
  • The result applies specifically to systems with general vector fields where set-valued representations capture the uncertainty tightly.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners can select set-valued abstractions over interval ones to obtain theoretical convergence guarantees in uncertain control problems.
  • The vanishing ambiguity lens may extend to hybrid or switched systems if their uncertainty models permit similar shrinking of ambiguity sets.
  • If a system's uncertainty prevents vanishing ambiguity, designers may need alternative abstraction types or direct verification methods instead of refinement loops.

Load-bearing premise

The nonlinear stochastic system dynamics and uncertainty structure must allow set-valued MDP abstractions to make ambiguity vanish with refinement.

What would settle it

A concrete nonlinear stochastic system where repeated refinement of a set-valued MDP abstraction leaves a positive performance gap to the optimum, or where an interval MDP abstraction achieves vanishing ambiguity and zero error.

Figures

Figures reproduced from arXiv: 2604.13377 by Ibon Gracia, Morteza Lahijanian.

Figure 1
Figure 1. Figure 1: illustrates the partition process and the set S η . The partition Bη defines a refinement function ρ η : Xabs → S η that maps each continuous state x ∈ Xabs to the repre￾sentative point s ∈ S η of the region b ∈ Bη that contains x. For the sake of conciseness we also write ρ −η (x) := b for each x ∈ Xabs, with b ∈ Bη being the region x belongs to. We also partition the disturbance set W into the finite set… view at source ↗
Figure 2
Figure 2. Figure 2: Graphical illustration of Proposition 1 and Def [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Temperature regulation benchmark: (a) Lower and [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: 2D case study. Lower bound in the probability of satisfying φ2 within T = 60 time steps. The color indicates the value of p η T (x0). In green, trajectories that satisfy φ2. In red, those that do not. APPENDIX A. Proofs Proof of Lemma 1. Pick x, y ∈ X, a ∈ A and a coupling π ∈ arg minπ∈Π(T (·|x,a),T (·|y,a)) R X×X ∥x ′ − y ′∥dπ(x ′ , y′ ). Then, it holds that | R X vdT (· | a, x) − R X vdT (· | a, y)| = | … view at source ↗
Figure 6
Figure 6. Figure 6: IMDP fine 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Water Charge Carpet Obs Obs [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: SMDP coarse 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Water Charge Carpet Obs Obs [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗
read the original abstract

We study the asymptotic optimality of abstraction-based control synthesis algorithms. Specifically, we consider uncertain MDP (UMDP) abstraction, and investigate whether refinement leads to optimal results, i.e., an optimal controller and zero error bound. Additionally, we study completeness of abstraction-refinement algorithms, i.e., that the algorithm produces near-optimal results in finite time. The focus is on nonlinear stochastic systems with general vector fields and temporal logic specifications. We present an algorithm that abstracts the system into a UMDP and synthesizes a controller with performance guarantees via robust dynamic programming. Then, the algorithm iteratively refines the abstraction until a near-optimality criterion is met. A thorough theoretical analysis reveals a sufficient condition, which we denote vanishing ambiguity, guaranteeing asymptotic optimality of the abstraction process and completeness of the algorithm. We show that set-valued MDP abstractions satisfy this criterion, whereas interval MDP abstractions lack such a guarantee.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies asymptotic optimality and completeness of abstraction-refinement algorithms for control synthesis of nonlinear stochastic systems with general vector fields and temporal logic specifications. It abstracts the system to an uncertain MDP (UMDP), synthesizes a controller via robust dynamic programming, and refines the abstraction iteratively. A sufficient condition termed 'vanishing ambiguity' is introduced to guarantee that the process yields an asymptotically optimal controller with zero error bound and that the algorithm terminates with near-optimal results in finite time. The authors claim that set-valued MDP abstractions satisfy vanishing ambiguity while interval MDP abstractions do not.

Significance. If the central results hold, the work provides a useful theoretical criterion for selecting abstraction types that ensure convergence to optimality in uncertain control synthesis. The explicit distinction between set-valued and interval MDPs, grounded in robust dynamic programming, offers guidance for algorithm design in systems with general dynamics. The algorithm with performance guarantees and the focus on completeness are positive contributions, though their practical impact hinges on the verifiability of the supporting derivations.

major comments (2)
  1. [§4] §4 (theoretical analysis of vanishing ambiguity): the proof that vanishing ambiguity is sufficient for asymptotic optimality and algorithm completeness must explicitly link the condition to the contraction properties of the robust value iteration operator; without this link the claim that refinement yields zero error bound remains incomplete.
  2. [§5] §5 (comparison of abstractions): the statement that interval MDP abstractions fail to satisfy vanishing ambiguity is load-bearing for the paper's main distinction, yet it is supported only by a general argument rather than a concrete counterexample system (with explicit vector field and uncertainty set) where ambiguity remains bounded away from zero.
minor comments (2)
  1. [Notation and preliminaries] The notation for UMDP transition sets and ambiguity sets should be standardized with the robust dynamic programming references cited in the introduction to improve readability.
  2. [Algorithm description] Figure 1 (abstraction-refinement loop) would benefit from explicit annotation of the vanishing-ambiguity check to clarify its placement in the algorithm.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address the major comments point by point below and will make the indicated revisions to strengthen the presentation.

read point-by-point responses
  1. Referee: [§4] §4 (theoretical analysis of vanishing ambiguity): the proof that vanishing ambiguity is sufficient for asymptotic optimality and algorithm completeness must explicitly link the condition to the contraction properties of the robust value iteration operator; without this link the claim that refinement yields zero error bound remains incomplete.

    Authors: The proof of Theorem 4.1 in §4 shows sufficiency by bounding the difference between abstract and concrete value functions using the ambiguity measure and invoking convergence of robust value iteration. The robust value iteration operator is a contraction mapping (with modulus γ < 1) in the sup-norm, which ensures a unique fixed point and that the iteration error is controlled by the ambiguity term. Vanishing ambiguity then forces the overall error bound to zero. To address the referee's point, we will add an explicit remark in the proof stating: 'Because the robust Bellman operator T satisfies ||TV − TW||∞ ≤ γ||V − W||∞ and the ambiguity bounds |T V − T_concrete V|, the condition lim ambiguity = 0 implies lim ||V_abstract − V_concrete|| = 0.' This makes the link to contraction properties and the zero-error claim fully explicit. revision: yes

  2. Referee: [§5] §5 (comparison of abstractions): the statement that interval MDP abstractions fail to satisfy vanishing ambiguity is load-bearing for the paper's main distinction, yet it is supported only by a general argument rather than a concrete counterexample system (with explicit vector field and uncertainty set) where ambiguity remains bounded away from zero.

    Authors: The general argument in §5 establishes that interval abstractions over-approximate transition sets in a manner that prevents the ambiguity measure from vanishing for nonlinear dynamics, unlike set-valued abstractions. We agree that an explicit counterexample would make this distinction more concrete and will add one to §5. Consider the scalar system ẋ = −x³ + w with w ∈ [−0.1, 0.1] and a uniform partition of the state space. We will show that successive refinements keep the interval width (and thus the ambiguity) bounded below by a positive constant due to the cubic nonlinearity, while the same system with set-valued abstraction yields vanishing ambiguity. This example will be fully specified with the vector field, uncertainty set, and computed ambiguity values. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper develops a theoretical framework for asymptotic optimality of UMDP abstractions for nonlinear stochastic systems under temporal logic specs. It defines an algorithm using robust dynamic programming, introduces the sufficient condition 'vanishing ambiguity' via direct mathematical analysis, and proves it holds for set-valued abstractions but fails for interval MDPs. All steps rely on internal definitions, proofs, and standard external results in robust DP without any reduction to fitted parameters, self-citation chains, or ansatzes smuggled from prior author work. The central claims are established by explicit construction and verification rather than by construction from inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claim rests on domain assumptions about nonlinear stochastic dynamics and the ability to refine abstractions arbitrarily; vanishing ambiguity is introduced as a new sufficient condition without independent evidence beyond the proof sketch.

axioms (2)
  • domain assumption Nonlinear stochastic systems with general vector fields admit UMDP abstractions via set-valued or interval methods
    Stated as the focus of the study in the abstract
  • standard math Robust dynamic programming yields controllers with performance guarantees on UMDPs
    Invoked as the synthesis step in the algorithm description
invented entities (1)
  • vanishing ambiguity no independent evidence
    purpose: Sufficient condition for asymptotic optimality and completeness of refinement
    Newly denoted concept whose satisfaction is proven for set-valued but not interval abstractions

pith-pipeline@v0.9.0 · 5449 in / 1371 out tokens · 20068 ms · 2026-05-10T13:45:00.402582+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages

  1. [1]

    Temporal logic control for nonlinear stochastic systems under unknown disturbances,

    I. Gracia, L. Laurenti, M. Mazo Jr., A. Abate, and M. Lahijanian, “Temporal logic control for nonlinear stochastic systems under unknown disturbances,” inLearning for Dynamics and Control Conference (L4DC). PMLR, Jun. 2025. [Online]. Available: https://sites.google.com/umich.edu/l4dc2025/

  2. [2]

    Approximate model checking of stochastic hybrid systems,

    A. Abate, J.-P. Katoen, J. Lygeros, and M. Prandini, “Approximate model checking of stochastic hybrid systems,”European Journal of Control, vol. 16, no. 6, pp. 624–641, 2010

  3. [3]

    Formal verification and synthesis for discrete-time stochastic systems,

    M. Lahijanian, S. B. Andersson, and C. Belta, “Formal verification and synthesis for discrete-time stochastic systems,”IEEE Transactions on Automatic Control, vol. 60, no. 8, pp. 2031–2045, 2015

  4. [4]

    Efficiency through uncertainty: Scalable formal synthesis for stochastic hybrid systems,

    N. Cauchi, L. Laurenti, M. Lahijanian, A. Abate, M. Kwiatkowska, and L. Cardelli, “Efficiency through uncertainty: Scalable formal synthesis for stochastic hybrid systems,” inProceedings of the 22nd ACM international conference on hybrid systems: computation and control, 2019, pp. 240–251. 0 0.2 0.4 0.6 0.8 10 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Water C...

  5. [5]

    Formal control synthesis for stochastic neural network dynamic models,

    S. Adams, M. Lahijanian, and L. Laurenti, “Formal control synthesis for stochastic neural network dynamic models,”IEEE Control Systems Letters, vol. 6, pp. 2858–2863, 2022

  6. [6]

    Abstraction-based synthesis for stochastic systems with omega-regular objectives,

    M. Dutreix, J. Huh, and S. Coogan, “Abstraction-based synthesis for stochastic systems with omega-regular objectives,”Nonlinear Analy- sis: Hybrid Systems, vol. 45, p. 101204, 2022

  7. [7]

    Safe learning for uncertainty-aware planning via interval mdp abstraction,

    J. Jiang, Y . Zhao, and S. Coogan, “Safe learning for uncertainty-aware planning via interval mdp abstraction,”IEEE Control Systems Letters, vol. 6, pp. 2641–2646, 2022

  8. [8]

    Temporal logic control of nonlinear stochastic systems using a piecewise-affine abstraction,

    B. C. van Huijgevoort, S. Weiland, and S. Haesaert, “Temporal logic control of nonlinear stochastic systems using a piecewise-affine abstraction,”IEEE Control Systems Letters, vol. 7, 2022

  9. [9]

    Data-driven strategy synthesis for stochastic systems with unknown nonlinear disturbances,

    I. Gracia, D. Boskos, L. Laurenti, and M. Lahijanian, “Data-driven strategy synthesis for stochastic systems with unknown nonlinear disturbances,” in6th Annual Learning for Dynamics & Control Con- ference. PMLR, 2024, pp. 1633–1645

  10. [10]

    Data-driven abstraction and synthesis for stochastic systems with unknown dynamics,

    M. Nazeri, T. Badings, A.-K. Schmuck, S. Soudjani, and A. Abate, “Data-driven abstraction and synthesis for stochastic systems with unknown dynamics,”arXiv preprint arXiv:2508.15543, 2025

  11. [11]

    Formal abstraction of general stochastic systems via noise partitioning,

    J. Skovbekk, L. Laurenti, E. Frew, and M. Lahijanian, “Formal abstraction of general stochastic systems via noise partitioning,”IEEE Control Systems Letters, vol. 7, pp. 3711–3716, 2023

  12. [12]

    Gracia and M

    I. Gracia and M. Lahijanian, “Beyond interval mdps: Tight and efficient abstractions of stochastic systems,”arXiv preprint arXiv:2507.02213, 2025

  13. [13]

    Efficient strategy synthesis for switched stochastic systems with dis- tributional uncertainty,

    I. Gracia, D. Boskos, M. Lahijanian, L. Laurenti, and M. Mazo Jr, “Efficient strategy synthesis for switched stochastic systems with dis- tributional uncertainty,”Nonlinear Analysis: Hybrid Systems, vol. 55, p. 101554, 2025

  14. [14]

    Enhanc- ing data-driven stochastic control via bundled interval mdp,

    R. Coppola, A. Peruffo, L. Romao, A. Abate, and M. Mazo, “Enhanc- ing data-driven stochastic control via bundled interval mdp,”IEEE Control Systems Letters, vol. 8, pp. 2069–2074, 2024

  15. [15]

    Adaptive and sequential gridding procedures for the abstraction and verification of stochastic processes,

    S. Esmaeil Zadeh Soudjani and A. Abate, “Adaptive and sequential gridding procedures for the abstraction and verification of stochastic processes,”SIAM Journal on Applied Dynamical Systems, vol. 12, no. 2, pp. 921–956, 2013

  16. [16]

    Linear temporal logic and linear dynamic logic on finite traces,

    G. De Giacomo and M. Y . Vardi, “Linear temporal logic and linear dynamic logic on finite traces,” inInt. joint conf. on Artificial Intelli- gence. Association for Computing Machinery, 2013, pp. 854–860

  17. [17]

    Robust dynamic programming,

    G. N. Iyengar, “Robust dynamic programming,”Mathematics of Op- erations Research, vol. 30, no. 2, pp. 257–280, 2005

  18. [18]

    Linear temporal logic and linear dynamic logic on finite traces,

    G. De Giacomo and M. Y . Vardi, “Linear temporal logic and linear dynamic logic on finite traces,” 2013

  19. [19]

    Bertsekas and S

    D. Bertsekas and S. E. Shreve,Stochastic optimal control: the discrete- time case. Athena Scientific, 1996, vol. 5

  20. [20]

    Bounded-parameter markov deci- sion processes,

    R. Givan, S. Leach, and T. Dean, “Bounded-parameter markov deci- sion processes,”Artificial Intelligence, vol. 122, no. 1-2, 2000

  21. [21]

    Robust control of uncertain markov decision processes with temporal logic specifications,

    E. M. Wolff, U. Topcu, and R. M. Murray, “Robust control of uncertain markov decision processes with temporal logic specifications,” in2012 IEEE 51st IEEE Conference on decision and control (CDC). IEEE, 2012, pp. 3372–3379

  22. [22]

    Formal verification of unknown dynamical systems via gaussian process regression,

    J. Skovbekk, L. Laurenti, E. Frew, and M. Lahijanian, “Formal verification of unknown dynamical systems via gaussian process regression,”IEEE Transactions on Automatic Control, 2025. [Online]. Available: https://arxiv.org/abs/2201.00655

  23. [23]

    Data-driven control via conditional mean embeddings: Formal guarantees via uncertain mdp abstraction,

    I. Gracia and M. Lahijanian, “Data-driven control via conditional mean embeddings: Formal guarantees via uncertain mdp abstraction,”arXiv preprint arXiv:2512.13940, 2025

  24. [24]

    Learning task specifications from demonstrations,

    M. Vazquez-Chanlatte, S. Jha, A. Tiwari, M. K. Ho, and S. Seshia, “Learning task specifications from demonstrations,”Advances in neu- ral information processing systems, vol. 31, 2018

  25. [25]

    Villaniet al.,Optimal transport: old and new

    C. Villaniet al.,Optimal transport: old and new. Springer, 2009, vol. 338

  26. [26]

    Extension of range of functions,

    E. J. McShane, “Extension of range of functions,” 1934

  27. [27]

    Strategy synthe- sis for partially-known switched stochastic systems,

    J. Jackson, L. Laurenti, E. Frew, and M. Lahijanian, “Strategy synthe- sis for partially-known switched stochastic systems,” inProceedings of the 24th International Conference on Hybrid Systems: Computation and Control, 2021, pp. 1–11