pith. sign in

arxiv: 2606.10717 · v1 · pith:Q6GUQE4Nnew · submitted 2026-06-09 · 💻 cs.PL · math.OC

Max-Policy Iteration, Revisited

Pith reviewed 2026-06-27 11:03 UTC · model grok-4.3

classification 💻 cs.PL math.OC
keywords max-policy iterationvalue iterationprogram invariantsbound analysisstatic analysisnumeric invariantsmin-policy iterationpolicy iteration
0
0 comments X

The pith

Mathematical optimization in max-policy iteration can be replaced by terminating value iteration for integer and floating-point systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that max-policy iteration, used to compute precise numeric program invariants, does not require expensive mathematical optimization steps when the equations are over integers or floating-point numbers. Instead, plain value iteration can be used and is still guaranteed to terminate. This approach yields precise bounds on variables without relying on widening operators. For the rational case, the authors introduce min-policy iteration as a substitute for linear programming and prove it finds the least solution in bounded systems while extending it to unbounded cases with soundness and optimality certificates.

Core claim

Max-policy iteration on systems of equations over integers as well as over floating point numbers allows mathematical optimization to be replaced by plain value iteration, which is still guaranteed to terminate. As an application, a precise bound analysis for integer or floating point variables is obtained, avoiding widening operators altogether. For max-policy iteration over the rational numbers where right-hand sides are maxima of minima of affine combinations, min-policy iteration is guaranteed to return the least solution for bounded systems, with extensions to unbounded systems and certificates of soundness and optimality.

What carries the argument

Max-policy iteration, which successively resolves maximum operators in equation systems and reduces to optimization or now value iteration.

If this is right

  • Precise bound analysis for integer or floating point variables without widening operators.
  • Value iteration terminates for the equation systems in max-policy iteration over integers and floats.
  • Min-policy iteration returns the least solution for bounded systems over rationals.
  • Certificates of soundness and optimality can be constructed for the results.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This method could reduce the computational cost of static analysis tools by avoiding external optimization solvers.
  • Similar replacements might be possible in other policy iteration variants if termination properties can be established.
  • The approach may extend to hybrid numeric domains combining integers and floats.

Load-bearing premise

Value iteration is guaranteed to terminate specifically for the systems of equations that arise from max-policy iteration over integers and floating-point numbers.

What would settle it

Finding a system of equations over integers or floats arising in max-policy iteration where value iteration either fails to terminate or produces a different result from the optimization-based version.

Figures

Figures reproduced from arXiv: 2606.10717 by David Monniaux (VERIMAG - IMAG, Helmut Seidl (TUM), INS2I-CNRS).

Figure 1
Figure 1. Figure 1: The bounds analysis for the example program. Assuming that N ≥ 0, the least solution of this system is given by the table ⟨1, i−⟩ 0 ⟨1, i+⟩ N ⟨2, i−⟩ 0 ⟨2, i+⟩ N − 1 ⟨3, i−⟩ −1 ⟨3, i+⟩ N ⟨4, i−⟩ −N ⟨4, i+⟩ N More generally, we consider a linearly ordered value domain D, which could be, e.g., Z (the integers) F (the floating point numbers), Q (the rationals) or R (the reals). For Z, Q and R we assume them t… view at source ↗
Figure 2
Figure 2. Figure 2: max -policy iteration algorithm using solve to determine solutions of Σ π . – It starts with the policy π0 = {x 7→ 0 | x ∈ X} and ρ0 = {x 7→ −∞ | x ∈ X}. – For each t ≥ 0, it determines a mapping ρ ′ t+1 with ρ ′ t+1(x) = [[ex]] ρt. It checks whether ρt is already a solution of Σ, i.e., whether ρt = ρ ′ t+1. – If not, a new policy πt+1 is determined reluctantly from πt. This means, that πt+1(x) = πt(x) whe… view at source ↗
Figure 3
Figure 3. Figure 3: The min-policy iteration algorithm for a new max -policy π ′ where ρ ′ (x) = [[ex]] ρ = [[ex,π′(x)]] ρ for the previous solution ρ. (9). The outer max -policy iteration thus improves policy π2 to π3 = {x 7→ 3} — for which ultimately the solution {x 7→ 8.0} found, and both the inner min- and the outer max -policy iteration terminate. Theorem 2 still leaves open how the encountered systems (12) without max￾i… view at source ↗
Figure 4
Figure 4. Figure 4: Comparison of timings (seconds) for 3800 bounded examples with [200, 3999] unknowns for the (x axis) Max+Min (left), or Max+Val (right), and Max+LP using Gurobi (y axis). Roughly, Max+Min is 2× slower than Max+LP (which may be explained by a naïve implementation), but Max+Val is 31× faster than Max+LP. a max - and min- policy to solve an affine system of equations. We solve these sys￾tems by Gaussian elimi… view at source ↗
read the original abstract

Max-policy iteration is an approach to computing precise numeric program invariants by successive attempts at resolving maximum operators and reduction to mathematical optimization. Mathematical optimization, though, may be expensive. Here, we show, for max-policy iteration on systems of equations over integers as well as over floating point numbers, that mathematical optimization can be replaced by plain value iteration -- which is still guaranteed to terminate. As an application, a precise bound analysis for integer or floating point variables is obtained, avoiding widening operators altogether. We also consider max-policy iteration over the rational numbers, where the right-hand sides are maxima of minima of affine combinations of unknowns. We propose min-policy iteration as an alternative to linear programming for solving the optimization problems posed by max-policy iteration. We prove that max-min policy iteration is guaranteed to return the least solution for bounded systems. We also show how to extend this algorithm to unbounded systems, and how to construct certificates of soundness as well as of optimality of the computed results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that max-policy iteration for precise numeric program invariants can replace mathematical optimization with plain value iteration for equation systems over integers and floating-point numbers while preserving termination guarantees. For systems over the rationals (with right-hand sides as maxima of minima of affine combinations), it proposes min-policy iteration as an alternative to linear programming, proves that max-min policy iteration returns the least solution for bounded systems, extends the algorithm to unbounded systems, and shows how to construct certificates of soundness and optimality. The approach is applied to bound analysis for integer or floating-point variables without widening.

Significance. If the termination and correctness results hold, the work offers a practical simplification for invariant computation in program analysis by substituting value iteration for optimization solvers, while avoiding widening and providing certificates. The explicit treatment of bounded versus unbounded cases for rationals and the focus on termination for integers and floats are potentially valuable contributions to the field of numeric abstract interpretation.

major comments (2)
  1. [Abstract and integer/FP section] Abstract and integer/FP termination claim: the assertion that value iteration is guaranteed to terminate for the integer equation systems arising in max-policy iteration is load-bearing for the central claim, yet the abstract (and apparently the rational section's contrast) does not state the precise conditions (e.g., absence of positive cycles, monotonicity bounding ascent, or finite least solution) that prevent divergence over unbounded domains; this needs an explicit lemma or theorem reference.
  2. [Rational numbers section] Rational numbers section on bounded systems: the proof that max-min policy iteration returns the least solution relies on the structure of the min-max affine systems, but without a concrete derivation or counterexample check for the bounded case, it is unclear whether the guarantee extends directly from standard policy iteration results or requires additional assumptions on the equation systems.
minor comments (2)
  1. The manuscript would benefit from an early running example illustrating the transition from max-policy iteration to value iteration on a small integer system.
  2. Notation for the min-policy iteration updates could be made more uniform with the max-policy case to aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below.

read point-by-point responses
  1. Referee: [Abstract and integer/FP section] Abstract and integer/FP termination claim: the assertion that value iteration is guaranteed to terminate for the integer equation systems arising in max-policy iteration is load-bearing for the central claim, yet the abstract (and apparently the rational section's contrast) does not state the precise conditions (e.g., absence of positive cycles, monotonicity bounding ascent, or finite least solution) that prevent divergence over unbounded domains; this needs an explicit lemma or theorem reference.

    Authors: We agree the abstract should reference the precise conditions. Termination for integer and floating-point systems is established in Theorem 3.5, which assumes a finite least solution and absence of positive cycles in the dependency graph. We will revise the abstract to include an explicit reference to this theorem. revision: yes

  2. Referee: [Rational numbers section] Rational numbers section on bounded systems: the proof that max-min policy iteration returns the least solution relies on the structure of the min-max affine systems, but without a concrete derivation or counterexample check for the bounded case, it is unclear whether the guarantee extends directly from standard policy iteration results or requires additional assumptions on the equation systems.

    Authors: The proof in Section 4.1 is tailored to the max-min affine structure and proceeds via monotonic improvement to the least fixed point; it requires the specific form of the right-hand sides and does not follow directly from classical policy iteration without adaptation. We will add a short derivation sketch and an illustrative example in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation relies on independent proofs and new algorithmic constructions.

full rationale

The paper introduces replacements of mathematical optimization by value iteration for max-policy iteration over integers and floats, with stated termination guarantees, and proposes min-policy iteration for rationals with proofs of least-solution return for bounded systems plus extensions to unbounded cases and certificates. No self-definitional mappings, fitted inputs renamed as predictions, or load-bearing self-citations appear; the central results are presented as new proofs and constructions that do not reduce to their own inputs by definition or prior author results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard mathematical properties of value iteration and policy iteration methods; no free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption Value iteration on max-policy systems over integers and floating-point numbers is guaranteed to terminate.
    Invoked to replace mathematical optimization while preserving correctness.
  • domain assumption Min-policy iteration returns the least solution for bounded max-min systems over the rationals.
    Central to the claim that the method is an alternative to linear programming.

pith-pipeline@v0.9.1-grok · 5702 in / 1193 out tokens · 48655 ms · 2026-06-27T11:03:21.626904+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

34 extracted references · 19 canonical work pages

  1. [1]

    Adjé, A., Gaubert, S., Goubault, E.: Coupling policy iteration with semi- definite relaxation to compute accurate numerical invariants in static anal- ysis. Log. Methods Comput. Sci.8(1) (2012). https://doi.org/10.2168/ LMCS-8(1:1)2012

  2. [2]

    Al Taleb, G

    Adjé, A., Gaubert, S., Goubault, E.: Computing the smallest fixed point of order-preserving nonexpansive mappings arising in positive stochastic games and static analysis of programs. Journal of Mathematical Analysis and Applications410, 227–240 (Feb 2014).https://doi.org/10.1016/j. jmaa.2013.07.076

  3. [3]

    In: Logozzo, F., Peled, D.A., Zuck, L.D

    Bagnara, R., Hill, P.M., Zaffanella, E.: An improved tight closure algorithm for integer octagonal constraints. In: Logozzo, F., Peled, D.A., Zuck, L.D. (eds.) Verification, Model Checking, and Abstract Interpretation, 9th In- ternational Conference, VMCAI 2008, San Francisco, USA, January 7-9, 2008, Proceedings. Lecture Notes in Computer Science, vol. 49...

  4. [4]

    Boyle, M.: Basic perron frobenius theory and inverse spectral prob- lems (2013), https://www-fourier.ujf-grenoble.fr/sites/default/ files/files/fichiers/lecturenotes_boyle1.pdf, lecture notes

  5. [5]

    Cambridge University Press (2010)

    Cook, S., Nguyen, P.: Logical foundations of proof complexity. Cambridge University Press (2010)

  6. [6]

    In: Etessami, K., Rajamani, S.K

    Costan, A., Gaubert, S., Goubault, E., Martel, M., Putot, S.: A policy iterationalgorithmforcomputingfixedpointsinstaticanalysisofprograms. In: Etessami, K., Rajamani, S.K. (eds.) CAV. LNCS, vol. 3576, pp. 462–475. Springer (2005). https://doi.org/10.1007/11513988_46

  7. [7]

    In: Conference Record of the 4th ACM Symposium on Principles of Programming Languages

    Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fix- points. In: Conference Record of the 4th ACM Symposium on Principles of Programming Languages. pp. 238–252. Los Angeles, CA (Jan 1977). https://doi.org/10.1145/512950.5129

  8. [8]

    Journal of Logic and Computation2(4), 511–547 (1992).https://doi.org/10.1093/ LOGCOM/2.4.511

    Cousot, P., Cousot, R.: Abstract interpretation frameworks. Journal of Logic and Computation2(4), 511–547 (1992).https://doi.org/10.1093/ LOGCOM/2.4.511

  9. [9]

    Nu- merische Mathematik40(1), 137–141 (1982).https://doi.org/10.1007/ BF01459082, http://eudml.org/doc/132821

    Dixon, J.D.: Exact solution of linear equations usingp-adic expansions. Nu- merische Mathematik40(1), 137–141 (1982).https://doi.org/10.1007/ BF01459082, http://eudml.org/doc/132821

  10. [10]

    Research Report LIp RR-2002-15, Laboratoire de l’informatique du parallélisme (Mar 2002), https://hal.science/ hal-02102080

    Dumas, J.G., Gautier, T., Giesbrecht, M., Giorgi, P., Hovinen, B., Kaltofen, E., Saunders, B., Turner, W., Villard, G.: LinBox: A generic library for exact linear algebra. Research Report LIp RR-2002-15, Laboratoire de l’informatique du parallélisme (Mar 2002), https://hal.science/ hal-02102080

  11. [11]

    In: Nicola, R.D

    Gaubert, S., Goubault, E., Taly, A., Zennou, S.: Static analysis by policy iteration on relational domains. In: Nicola, R.D. (ed.) Programming Lan- Max-Policy Iteration, Revisited 29 guages and Systems, 16th European Symposium on Programming, ESOP 2007, Held as Part of the Joint European Conferences on Theory and Prac- tics of Software, ETAPS 2007, Braga,...

  12. [12]

    Logical Methods in Com- puter Science8(3) (2012)

    Gawlitza, T.M., Monniaux, D.: Invariant generation through strategy itera- tion in succinctly represented control flow graphs. Logical Methods in Com- puter Science8(3) (2012). https://doi.org/10.2168/LMCS-8(3:29)2012

  13. [13]

    In: Programming Languages and Systems

    Gawlitza, T.M., Seidl, H.: Precise fixpoint computation through strategy iteration. In: Programming Languages and Systems. 16th European Sym- posium on Programming (ESOP). pp. 300–315. LNCS 4421, Springer (2007)

  14. [14]

    In: Duparc, J., Henzinger, T.A

    Gawlitza, T.M., Seidl, H.: Precise relational invariants through strategy it- eration. In: Duparc, J., Henzinger, T.A. (eds.) Computer Science Logic, 21st International Workshop, CSL 2007, 16th Annual Conference of the EACSL, Lausanne, Switzerland, September 11-15, 2007, Proceedings. Lec- ture Notes in Computer Science, vol. 4646, pp. 23–40. Springer (200...

  15. [15]

    In: Bouajjani, A., Maler, O

    Gawlitza, T.M., Seidl, H.: Games through nested fixpoints. In: Bouajjani, A., Maler, O. (eds.) Computer Aided Verification, 21st International Con- ference, CAV 2009, Grenoble, France, June 26 - July 2, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5643, pp. 291–305. Springer (2009). https://doi.org/10.1007/978-3-642-02658-4_24

  16. [16]

    ACM Trans

    Gawlitza, T.M., Seidl, H.: Solving systems of rational equations through strategy iteration. ACM Trans. Program. Lang. Syst. 33(3), 11:1–11:48 (2011). https://doi.org/10.1145/1961204.1961207

  17. [17]

    In: Workshop on Invariant Generation

    Gawlitza,T.M.,Seidl,H.:Abstractinterpretationoverzoneswithoutwiden- ing. In: Workshop on Invariant Generation. p. 12–43. EasyChair, volume 1 of EPiC (2012)

  18. [18]

    Formal Methods in System Design44, 101–148 (Sep 2013)

    Gawlitza, T.M., Seidl, H.: Numerical invariants through convex relaxation and max-strategy iteration. Formal Methods in System Design44, 101–148 (Sep 2013). https://doi.org/10.1007/s10703-013-0190-8

  19. [19]

    Gawlitza, T.M., Seidl, H., Adjé, A., Gaubert, S., Goubault, E.: Abstract interpretation meets convex optimization. J. Symb. Comput.47(12), 1416– 1446 (2012). https://doi.org/10.1016/J.JSC.2011.12.048

  20. [20]

    In: Nielson, H.R., Filé, G

    Gopan, D., Reps, T.W.: Guided static analysis. In: Nielson, H.R., Filé, G. (eds.) Static Analysis, 14th International Symposium, SAS 2007, Kon- gens Lyngby, Denmark, August 22-24, 2007, Proceedings. Lecture Notes in Computer Science, vol. 4634, pp. 349–365. Springer (2007). https: //doi.org/10.1007/978-3-540-74061-2_22

  21. [21]

    In: High-order workshop on au- tomated runtime verification and debugging

    Kwiatkowska, M., Parker, D., Qu, H., Ujma, M.: On incremental quantita- tive verification for probabilistic systems. In: High-order workshop on au- tomated runtime verification and debugging. Manchester, United Kingdom (Dec 2011)

  22. [22]

    Miné, A.: The octagon abstract domain. High. Order Symb. Comput.19(1), 31–100 (2006).https://doi.org/10.1007/S10990-006-8609-1 30 D. Monniaux and H. Seidl

  23. [23]

    Miné, A.: Weakly Relational Numerical Abstract Domains. Ph.D. thesis, École polytechnique (Dec 2004),https://hal.science/tel-00136630

  24. [24]

    In: Nielson, H.R., Filé, G

    Monniaux, D.: Optimal abstraction on real-valued programs. In: Nielson, H.R., Filé, G. (eds.) Static Analysis, 14th International Symposium, SAS 2007, Kongens Lyngby, Denmark, August 22-24, 2007, Proceedings. Lecture Notes in Computer Science, vol. 4634, pp. 104–120. Springer (2007).https: //doi.org/10.1007/978-3-540-74061-2_7

  25. [25]

    In: Shao, Z., Pierce, B.C

    Monniaux, D.: Automatic modular abstractions for linear constraints. In: Shao, Z., Pierce, B.C. (eds.) Proceedings of the 36th ACM SIGPLAN- SIGACT Symposium on Principles of Programming Languages, POPL 2009, Savannah, GA, USA, January 21-23, 2009. pp. 140–151. ACM (2009). https://doi.org/10.1145/1480881.1480899

  26. [26]

    In: Müller-Olm, M., Seidl, H

    Monniaux, D., Schrammel, P.: Speeding up logico-numerical strategy itera- tion. In: Müller-Olm, M., Seidl, H. (eds.) Static Analysis - 21st International Symposium, SAS 2014, Munich, Germany, September 11-13, 2014. Proceed- ings. Lecture Notes in Computer Science, vol. 8723, pp. 253–267. Springer (2014). https://doi.org/10.1007/978-3-319-10936-7_16

  27. [27]

    Wiley–Interscience (1994)

    Puterman, M.L.: Markov Decision Processes: Discrete stochastic dynamic programming. Wiley–Interscience (1994)

  28. [28]

    Roux, P.: Analyse statique de systèmes de contrôle commande : synthèse d’invariants non linéaires. Ph.D. thesis, Université de Toulouse (Dec 2013), https://theses.fr/api/v1/document/2013ESAE0046, main text in En- glish

  29. [29]

    Formal Methods Syst

    Roux, P., Voronin, Y., Sankaranarayanan, S.: Validating numerical semidefinite programming solvers for polynomial invariants. Formal Methods Syst. Des. 53(2), 286–312 (2018).https://doi.org/10.1007/ S10703-017-0302-Y, https://hal.science/hal-01358703/

  30. [30]

    In: Emerson, E.A., Namjoshi, K.S

    Sankaranarayanan,S.,Colón,M.,Sipma,H.B.,Manna,Z.:Efficientstrongly relational polyhedral analysis. In: Emerson, E.A., Namjoshi, K.S. (eds.) Verification, Model Checking, and Abstract Interpretation, 7th Interna- tional Conference, VMCAI 2006, Charleston, SC, USA, January 8-10, 2006, Proceedings. Lecture Notes in Computer Science, vol. 3855, pp. 111–125. S...

  31. [31]

    In: Hermenegildo, M.V., Morales, J.F

    Schwarz, M., Seidl, H.: Octagons revisited - elegant proofs and simplified algorithms. In: Hermenegildo, M.V., Morales, J.F. (eds.) Static Analysis - 30thInternationalSymposium,SAS2023,Cascais,Portugal,October22-24, 2023, Proceedings. Lecture Notes in Computer Science, vol. 14284, pp. 485–

  32. [32]

    Springer (2023).https://doi.org/10.1007/978-3-031-44245-2_21

  33. [33]

    In: Kutsia, T., Voronkov, A

    Seidl, H., Gawlitza, T.M., Schwarz, M.D.: Parametric strategy iteration. In: Kutsia, T., Voronkov, A. (eds.) 6th International Symposium on Sym- bolic Computation in Software Science, SCSS 2014, Gammarth, La Marsa, Tunisia, December 7-8, 2014. EPiC Series in Computing, vol. 30, pp. 62–76. EasyChair (2014). https://doi.org/10.29007/C4KG

  34. [34]

    The FLINT team: FLINT: Fast Library for Number Theory (2025), version 3.2.1, https://flintlib.org