Recognition: 2 theorem links
· Lean TheoremRandomise Alone, Reach as a Team
Pith reviewed 2026-05-15 15:46 UTC · model grok-4.3
The pith
Memoryless strategies suffice for teams to exceed winning probability thresholds in games with private randomness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Memoryless strategies are sufficient for the threshold problem in concurrent graph games with distributed randomisation. The result embeds the decision problem in the existential theory of the reals and supports value-iteration algorithms. The threshold problem is NP-hard; almost-sure reachability is NP-complete. Individually Randomised Alternating-time Temporal Logic supplies the semantics for coalitions that lack a shared randomness source.
What carries the argument
Memoryless strategies under the distributed-randomisation model, which reduce the threshold decision to an existential formula over the reals.
If this is right
- Value-iteration algorithms can compute winning probabilities for the threshold problem.
- The threshold problem is NP-hard, so polynomial-time algorithms are unlikely unless P equals NP.
- Almost-sure reachability admits an NP decision procedure and is therefore complete for that class.
- IRATL provides a logical language for specifying probability thresholds when randomness is private to each player.
Where Pith is reading between the lines
- If players could observe one another's random outcomes, memory might again become necessary and the complexity could rise.
- Verification tools for multi-agent systems can therefore prioritise memoryless implementations when randomness sources are known to be private.
- The same reduction technique may apply to related objectives such as safety or parity when the underlying graph remains finite.
Load-bearing premise
Each player's random source is completely private and stays hidden from the opponent and from all other team members throughout the game.
What would settle it
A finite graph game in which the optimal probability above a chosen threshold can be achieved only by a strategy that remembers previous random outcomes or positions would falsify the memoryless sufficiency claim.
Figures
read the original abstract
We study concurrent graph games where n players cooperate against an opponent to reach a set of target states. Unlike traditional settings, we study distributed randomisation: team players do not share a source of randomness, and their private random sources are hidden from the opponent and from each other. We show that memoryless strategies are sufficient for the threshold problem (deciding whether there is a strategy for the team that ensures winning with probability that exceeds a threshold), a result that not only places the problem in the Existential Theory of the Reals (\exists\mathbb{R}) but also enables the construction of value iteration algorithms. We additionally show that the threshold problem is NP-hard. For the almost-sure reachability problem, we prove NP-completeness. We introduce Individually Randomised Alternating-time Temporal Logic (IRATL). This logic extends the standard ATL framework to reason about probability thresholds, with semantics explicitly designed for coalitions that lack a shared source of randomness. On the practical side, we implement and evaluate a solver for the threshold and almost-sure problem based on the algorithms that we develop.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies concurrent graph games in which a coalition of n players cooperates against an opponent to reach a target set, using distributed randomization in which each player has a private random source hidden from the opponent and from the other players. The central claims are that memoryless strategies suffice for the threshold reachability problem (placing it in ∃R and enabling value iteration), that the threshold problem is NP-hard, and that almost-sure reachability is NP-complete. The authors introduce Individually Randomised Alternating-time Temporal Logic (IRATL) whose semantics are defined for coalitions without shared randomness and present an implemented solver for the decision problems.
Significance. If the memoryless-sufficiency result holds, the work supplies a complexity classification (∃R membership together with NP-hardness) and a practical algorithmic route (value iteration) for reachability under private, hidden randomness; this extends the standard shared-randomness model of stochastic games and ATL. The NP-completeness result for almost-sure reachability gives a tight bound, and the implemented solver provides an immediate practical contribution for verification of multi-agent systems with individual randomization.
major comments (2)
- [Abstract] Abstract: the claim that memoryless strategies suffice for the threshold problem rests on an encoding of per-state action distributions as existential real variables whose induced reachability probabilities satisfy a system of polynomial inequalities; no explicit system of inequalities, no proof sketch, and no verification that the encoding preserves the private-randomness semantics are supplied, rendering the ∃R membership and value-iteration claims unverifiable from the given text.
- [Abstract] Abstract: the NP-hardness reduction for the threshold problem and the NP-completeness proof for almost-sure reachability are asserted without describing the source problem, the gadget construction, or how the reduction respects the finite turn-based concurrent graph and hidden-private-randomness model; these details are load-bearing for the hardness claims.
minor comments (1)
- The syntax and semantics of IRATL are introduced but not displayed; a short formal definition or example formula would improve readability for readers outside the ATL community.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the significance of the memoryless-sufficiency result and the complexity classifications. We address each major comment below with clarifications drawn from the full manuscript and indicate where revisions will strengthen the presentation.
read point-by-point responses
-
Referee: [Abstract] Abstract: the claim that memoryless strategies suffice for the threshold problem rests on an encoding of per-state action distributions as existential real variables whose induced reachability probabilities satisfy a system of polynomial inequalities; no explicit system of inequalities, no proof sketch, and no verification that the encoding preserves the private-randomness semantics are supplied, rendering the ∃R membership and value-iteration claims unverifiable from the given text.
Authors: The explicit system of polynomial inequalities (encoding independent per-player action distributions at each state together with the resulting reachability probabilities) and the proof that this encoding preserves private-randomness semantics appear in Section 3.2 of the full manuscript; the ∃R membership follows immediately from existential quantification over the real variables, and value iteration is justified by the memoryless property established in Theorem 3. We agree the abstract is too terse and will revise it to include a concise sketch of the encoding and its semantic preservation. revision: partial
-
Referee: [Abstract] Abstract: the NP-hardness reduction for the threshold problem and the NP-completeness proof for almost-sure reachability are asserted without describing the source problem, the gadget construction, or how the reduction respects the finite turn-based concurrent graph and hidden-private-randomness model; these details are load-bearing for the hardness claims.
Authors: The NP-hardness reduction for threshold reachability (from 3-SAT) and the NP-completeness proof for almost-sure reachability (from SAT) are fully detailed in Sections 4.1 and 4.2, respectively, with explicit gadget constructions that enforce the concurrent turn-based structure and independent private randomness. We acknowledge that the abstract omits even a high-level description of the source problems and key gadget properties; we will add a brief overview of both reductions to the revised abstract. revision: partial
Circularity Check
No significant circularity in derivation chain
full rationale
The central claim that memoryless strategies suffice for the threshold problem in finite turn-based concurrent graph games with private hidden randomness is presented as a direct result of encoding per-state action distributions as existential real variables satisfying polynomial inequalities. No equations, self-citations, or reductions in the abstract or provided text equate the claimed sufficiency to a fitted parameter, renamed input, or load-bearing self-citation chain. The placement in ∃R and NP-hardness follow from standard complexity reductions in game theory without circular self-definition. The model assumptions (private randomness sources, no shared/observable randomness) are stated explicitly and do not presuppose the memoryless result.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Finite turn-based concurrent graph games with reachability objectives
- domain assumption Private random sources are independent and hidden from all other agents
invented entities (1)
-
IRATL logic
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We show that memoryless strategies are sufficient for the threshold problem... places the problem in the Existential Theory of the Reals (∃R) but also enables the construction of value iteration algorithms.
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We introduce Individually Randomised Alternating-time Temporal Logic (IRATL)... semantics explicitly designed for coalitions that lack a shared source of randomness.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
In: CONCUR 2001 — Concurrency Theory
de Alfaro, L., Henzinger, T.A., Jhala, R.: Compositional methods for probabilistic systems. In: CONCUR 2001 — Concurrency Theory. pp. 351–365. Springer Berlin Heidelberg, Berlin, Heidelberg (2001)
work page 2001
-
[2]
Alur, R., Henzinger, T.A., Kupferman, O.: Alternating-time temporal logic. J. ACM49(5), 672–713 (2002).https://doi.org/10.1145/585265.585270,https: //doi.org/10.1145/585265.585270
-
[3]
Attoui, A.: Multi-agent based method for reactive systems formal specification and validation. IFAC Proceedings Volumes29(2), 1–6 (1996).https://doi.org/ https://doi.org/10.1016/S1474-6670(17)43768-2, 5th IFAC/IFIP/GI/GMA Workshop on Experience With the Management of Software Projects 1995 (MSP ’95)
-
[4]
CoRRabs/2510.02984(2025).https://doi.org/10
Bertrand, N., Bouyer, P., Lapointe, L., Mascle, C.: Reach together: How popula- tions win repeated games. CoRRabs/2510.02984(2025).https://doi.org/10. 48550/ARXIV.2510.02984,https://doi.org/10.48550/arXiv.2510.02984
-
[5]
Bertrand, N., Bouyer, P., Majumdar, A.: Concurrent parameterized games. In: FSTTCS 2019. LIPIcs, vol. 150, pp. 31:1–31:15. Schloss Dagstuhl - Leibniz- Zentrum für Informatik (2019).https://doi.org/10.4230/LIPICS.FSTTCS.2019. 31
-
[6]
Bertrand, N., Bouyer, P., Majumdar, A.: Synthesizing safe coalition strategies. In: FSTTCS 2020. LIPIcs, vol. 182, pp. 39:1–39:17. Schloss Dagstuhl - Leibniz- Zentrum für Informatik (2020).https://doi.org/10.4230/LIPICS.FSTTCS.2020. 39
-
[7]
The Annals of Mathematical Statis- tics33(2), 719–726 (1962)
Blackwell, D.: Discrete dynamic programming. The Annals of Mathematical Statis- tics33(2), 719–726 (1962)
work page 1962
-
[8]
Brice, L., Henzinger, T.A., Thejaswini, K.S.: Dicey games: Shared sources of ran- domness in distributed systems (2026),https://arxiv.org/abs/2601.18303
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[9]
In: Pro- ceedings of the Twentieth Annual ACM Symposium on Theory of Computing STOC
Canny, J.: Some algebraic and geometric computations in PSPACE. In: Pro- ceedings of the Twentieth Annual ACM Symposium on Theory of Computing STOC. p. 460–467. STOC ’88, Association for Computing Machinery (1988). https://doi.org/10.1145/62212.62257
-
[10]
Strategy Improvement for Concurrent Reachability and Safety Games
Chatterjee, K., de Alfaro, L., Henzinger, T.A.: Strategy improvement for con- current reachability and safety games. CoRRabs/1201.2834(2012),https: //arxiv.org/abs/1201.2834, preprint version of the combined QEST/SODA re- sults
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[11]
Chen, T., Forejt, V., Kwiatkowska, M.Z., Parker, D., Simaitis, A.: Prism-games: A model checker for stochastic multi-player games. In: TACAS 2013. Lecture Notes in Computer Science, vol. 7795, pp. 185–191. Springer (2013).https://doi.org/ 10.1007/978-3-642-36742-7_13
-
[12]
Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: Automatic verifi- cation of competitive stochastic systems. In: Proceedings of the 18th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). Lecture Notes in Computer Science, vol. 7214, pp. 315–330. Springer (2012).https://doi.org/10.1007/97...
-
[13]
Chen, T., Lu, J.: Probabilistic alternating-time temporal logic and model checking algorithm. In: Lei, J. (ed.) FSKD 2007. pp. 35–39. IEEE Computer Society (2007)
work page 2007
-
[14]
de Alfaro, L., Henzinger, T.A., Kupferman, O.: Concurrent reachability games. Theoretical Computer Science386(3), 188–217 (2007).https://doi.org/https: //doi.org/10.1016/j.tcs.2007.07.008, expressiveness in Concurrency 20 L. Brice, T. A. Henzinger, A. Montaseri, A. Shafiee, and K. S. Thejaswini
-
[15]
In: Nonlinear model predictive control: towards new challenging applications, pp
Diehl, M., Ferreau, H.J., Haverbeke, N.: Efficient numerical methods for nonlin- ear mpc and moving horizon estimation. In: Nonlinear model predictive control: towards new challenging applications, pp. 391–417. Springer (2009)
work page 2009
-
[16]
Etessami, K., Yannakakis, M.: Recursive markov decision processes and recursive stochastic games. J. ACM62(2) (May 2015).https://doi.org/10.1145/2699431, https://doi.org/10.1145/2699431
-
[17]
Forejt, V., Kwiatkowska, M., Norman, G., Parker, D.: Automated Verifica- tion Techniques for Probabilistic Systems, pp. 53–113. Springer Berlin Heidel- berg, Berlin, Heidelberg (2011).https://doi.org/10.1007/978-3-642-21455-4_ 3,https://doi.org/10.1007/978-3-642-21455-4_3
-
[18]
In: Advances in Neural Information Processing Systems (NIPS)
Guestrin, C., Koller, D., Parr, R.: Multiagent planning with factored MDPs. In: Advances in Neural Information Processing Systems (NIPS). vol. 14, pp. 1523–
-
[19]
Gutierrez, J., Najib, M., Perelli, G., Wooldridge, M.J.: EVE: A tool for temporal equilibrium analysis. In: Lahiri, S.K., Wang, C. (eds.) Automated Technology for Verification and Analysis - 16th International Symposium, ATVA 2018, Los An- geles, CA, USA, October 7-10, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11138, pp. 551–557. Springer...
-
[20]
In: Computer Science – Theory and Applications
Hansen, K.A., Ibsen-Jensen, R., Miltersen, P.B.: The complexity of solving reach- ability games using value and strategy iteration. In: Computer Science – Theory and Applications. Springer (2011)
work page 2011
-
[21]
Hensel, C., Junges, S., Katoen, J.P., Quatmann, T., Volk, M.: The probabilis- tic model checker storm. Int. J. Softw. Tools Technol. Transf.24(4), 589–610 (Aug 2022).https://doi.org/10.1007/s10009-021-00633-z,https://doi.org/ 10.1007/s10009-021-00633-z
-
[22]
Immerman, N.: Number of quantifiers is better than number of tape cells. J. Comput. Syst. Sci.22, 384–406 (1981),https://api.semanticscholar.org/ CorpusID:40753086
work page 1981
-
[23]
Journal of Machine Learning Research (JMLR)7(65), 1789–1828 (2006)
Kok, J.R., Vlassis, N.: Collaborative multiagent reinforcement learning by pay- off propagation. Journal of Machine Learning Research (JMLR)7(65), 1789–1828 (2006)
work page 2006
-
[24]
Kraft, D.: A software package for sequential quadratic programming. Tech. Rep. Forschungsbericht FB 88-28, Deutsche Forschungs- und Versuchsanstalt für Luft- und Raumfahrt (DFVLR), Köln, Germany (1988)
work page 1988
-
[25]
Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: Prism-games 3.0: Stochastic game verification with concurrency, equilibria and time. In: Computer Aided Ver- ification - 32nd International Conference, CAV 2020, Los Angeles, CA, USA, July 21-24, 2020, Proceedings, Part II. Lecture Notes in Computer Science, vol. 12225, pp. 475–487. Springer (2020).http...
-
[26]
Formal Methods in System Design58(1), 188–250 (2021)
Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: Automatic verification of concurrent stochastic systems. Formal Methods in System Design58(1), 188–250 (2021)
work page 2021
-
[27]
Křetínský,J.,Meggendorfer,T.,Weininger,M.:Stoppingcriteriaforvalueiteration onstochasticgameswithquantitativeobjectives.In:202338thAnnualACM/IEEE Symposium on Logic in Computer Science (LICS). pp. 1–14 (2023).https://doi. org/10.1109/LICS56636.2023.10175771
-
[28]
Lefèvre, C.: Optimal control of a birth and death epidemic process. Operations Research29(5), 971–982 (1981),http://www.jstor.org/stable/170234 Randomise Alone, Reach as a Team 21
work page 1981
-
[29]
Lomuscio, A., Qu, H., Raimondi, F.: MCMAS: an open-source model checker for the verification of multi-agent systems. Int. J. Softw. Tools Technol. Transf.19(1), 9–30 (2017).https://doi.org/10.1007/S10009-015-0378-X,https://doi.org/ 10.1007/s10009-015-0378-x
-
[30]
Procedia Manufacturing51, 1200–1206 (2020).https://doi.org/https://doi.org/10.1016/j.promfg.2020
Lyu, G., Fazlirad, A., Brennan, R.W.: Multi-agent modeling of cyber-physical systems for iec 61499 based distributed automation. Procedia Manufacturing51, 1200–1206 (2020).https://doi.org/https://doi.org/10.1016/j.promfg.2020. 10.168, 30th International Conference on Flexible Automation and Intelligent Manufacturing (FAIM2021)
-
[31]
de Moura, L., Bjørner, N.: Z3: an efficient smt solver. vol. 4963, pp. 337–340 (04 2008).https://doi.org/10.1007/978-3-540-78800-3_24
-
[32]
In: Logics in Artificial Intelligence
Murano, A., Neider, D., Zimmermann, M.: Robust alternating-time temporal logic. In: Logics in Artificial Intelligence. pp. 796–813. Springer Nature Switzerland, Cham (2023)
work page 2023
-
[33]
In: Theory and Applications of Graphs: Proceedings, Michigan May 11–15, 1976, pp
Parsons, T.D.: Pursuit-evasion in a graph. In: Theory and Applications of Graphs: Proceedings, Michigan May 11–15, 1976, pp. 426–441. Springer (2006)
work page 1976
-
[34]
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Pro- gramming. John Wiley & Sons (1994)
work page 1994
-
[35]
McGraw-Hill, New York, 3rd edn
Rudin, W.: Principles of Mathematical Analysis. McGraw-Hill, New York, 3rd edn. (1976)
work page 1976
-
[36]
Schaefer, M., Cardinal, J., Miltzow, T.: The existential theory of the reals as a com- plexity class: A compendium. CoRRabs/2407.18006(2024).https://doi.org/ 10.48550/ARXIV.2407.18006,https://doi.org/10.48550/arXiv.2407.18006
-
[37]
IEEE Transactions on Automatic Control 62(1), 65–80 (2016)
Shi, D., Elliott, R.J., Chen, T.: On finite-state stochastic modeling and secure estimation of cyber-physical systems. IEEE Transactions on Automatic Control 62(1), 65–80 (2016)
work page 2016
-
[38]
Sorensson, N.: Minisat 2.2 and minisat++ 1.1. http://baldur. iti. uka. de/sat-race- 2010/descriptions/solver_25+ 26. pdf (2010)
work page 2010
-
[39]
Nature Methods17, 261–272 (2020).https://doi.org/10.1038/ s41592-019-0686-2
Virtanen, P., Gommers, R., Oliphant, T.E., Haberland, M., Reddy, T., Courna- peau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S.J., Brett, M., Wilson, J., Millman, K.J., Mayorov, N., Nelson, A.R.J., Jones, E., Kern, R., Larson, E., Carey, C.J., Polat, İ., Feng, Y., Moore, E.W., Vander- Plas, J., Laxalde, D., Perktold, J., Cim...
work page 2020
-
[40]
Handbook of reinforcement learning and con- trol pp
Zhang, K., Yang, Z., Başar, T.: Multi-agent reinforcement learning: A selective overview of theories and algorithms. Handbook of reinforcement learning and con- trol pp. 321–384 (2021)
work page 2021
-
[41]
2023 IEEE International Confer- ence on Robotics and Automation (ICRA) pp
Zhu, E.L., Borrelli, F.: A sequential quadratic programming approach to the so- lution of open-loop generalized nash equilibria. 2023 IEEE International Confer- ence on Robotics and Automation (ICRA) pp. 3211–3217 (2022),https://api. semanticscholar.org/CorpusID:247793988 A The existential theory of the reals Definition 3 (Existential theory of the reals)...
work page 2023
-
[42]
LetV λ σ˚ be the value of the game under collective memoryless strategy¯σ˚ Π with discount factorλ
It was shown in [7], that (this intuition is actually correct) asλÑ1´, the expected reward is equal to the actual reachability probability. LetV λ σ˚ be the value of the game under collective memoryless strategy¯σ˚ Π with discount factorλ. We haveV˚ “lim λÑ1´ V λ σ˚ ěV 1. SinceV 1 is strictly greater thant, we haveV ˚ ąt. By the definition of the limit, t...
-
[43]
We show that the team can win with proba- bility greater thantif and only if the graph has a clique of sizek. Lemma 6.If the graphGcontains a clique of sizek, the team has a memoryless strategy to reachJwith probability1. Proof.Assume that the graphGhas a clique of sizek. Lettc 1, c2, . . . , ckube the vertices of this clique. We define a memoryless colle...
-
[44]
The game moves toWk´1 with a positive probability on the next turn
-
[45]
Letηbe the strategy that plays this memoryless strategy at each states
The game remains inWwith probability 1 on the next turn. Letηbe the strategy that plays this memoryless strategy at each states. We show in the following lemma that for each statesPW, strategyηguarantees reachingTwith probability 1. Lemma 9.If the team players play the memoryless collective strategyη, for every statesPW, the probability of reachingTis 1. ...
-
[46]
Furthermore, we force that each state hasat mostone rank
Consistency of winning set and ranks.A statesis in the winning setWif and only if it has a rank. Furthermore, we force that each state hasat mostone rank. These two ensure that every state inWhas exactly one rank. Target states must have rank 0. We also forces0 to be inW. Φrank :“ ľ sPS ˜ ws Ø Nł k“0 rs,k ¸ ľ sPS ľ 0ďjăkďN ␣prs,j ^r s,kq ľ sPT rs,0 ľ ws0
-
[47]
Φsupport :“ ľ sPS ľ PiPΠ ˜ ws Ñ ł aPAi us,i,a ¸
Non-Empty strategy support.If a state is inW, every team player must have at least one valid action in their support. Φsupport :“ ľ sPS ľ PiPΠ ˜ ws Ñ ł aPAi us,i,a ¸
-
[48]
Safety inW.The strategy must guarantee that the game never leaves W. If the team plays actions from their supports, then for any opponent action b, all possible next states must be inW. Φsafe :“ ľ sPS ľ ¯aPś i Avipsq ľ bPAO ¨ ˝ ˜ ľ PiPΠ us,i,ai ¸ Ñ ľ tPSupppδps,¯a,bqq wt ˛ ‚
-
[49]
Progress (rank reduction).FromsPWwith rankką0, the strategy must have a positive probability to move to a state with a lower rank. Formally, for every opponent actionb, there must be a joint action in the team’s support that leads to a statetwith rank strictly less thank.Φprog “ ľ sPS Nľ k“1 ľ bPAO ¨ ˚˝rs,k Ñ ł ¯aPś i Avipsq ¨ ˝ ˜ ľ PiPΠ us,i,ai ¸ ^ ł tPS...
-
[50]
LetWbe the set of almost-sure winning states underη. Setw s “true for allsPW
-
[51]
For each sPW, letkbe the smallest index such thatsPW k
Define the ranks based on the layersWk constructed in Section 4.1. For each sPW, letkbe the smallest index such thatsPW k. Setr s,k “true and all other rank variables forsto false
-
[52]
This assignment satisfies all constraints by construction
Setu s,i,a “true if and only ifηipsqpaq ą0. This assignment satisfies all constraints by construction. The Consistency con- straints hold because every state inWbelongs to exactly one layerW k. The Non-Empty Support holds becauseηis a valid strategy. The Safety constraint is Randomise Alone, Reach as a Team 39 satisfied because an almost-sure winning stra...
-
[53]
Efficient handling of linear constraints.Our basic constraints are that proba- bilities must sum to 1 and be non-negative. These are linear constraints. Unlike penalty-based methods that may explore infeasible regions during intermediate steps, SLSQP exploits the structure of linear constraints to satisfy them exactly at every iteration [15]. Since the ge...
-
[54]
This means the probability for most actions should be exactly zero
Efficiency with sparse strategies.In concurrent games, the best strategies are often simple: players sometimes rely on just a small number of actions rather than mixing many options. This means the probability for most actions should be exactly zero. SLSQP employs anactive-set strategywhich is designed to quickly identify and lock onto these zero values. ...
-
[55]
Proven performance in dynamic games.SQP-based methods are a standard choice for solving games that evolve over time. For instance, Zhu and Borrelli [41] showed that SQP acts much faster than generic solvers when finding solutions to dynamic games. They found that SQP is particularly good at handling the tight interactions between players. This performance...
-
[56]
Sound under-approximation.Since our optimisation problem is non-convex, SLSQP behaves as a local optimiser. This guarantees that the valuev˚ it returns is a validunder-approximationof the true maximum (v˚ ďP repu kqpsq). In the context of formal verification, this is a “safe” error: we might underestimate the team’s winning chance, but we will never false...
work page 2000
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.