Formal Verification of Minimax Algorithms
Pith reviewed 2026-05-18 14:36 UTC · model grok-4.3
The pith
One practical depth-limited minimax implementation satisfies a new witness-based correctness criterion while a second common variant violates it.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For depth-limited negamax with alpha-beta pruning and transposition tables the authors introduce a witness-based correctness criterion that accepts a returned value precisely when it can be justified by an explicit game-tree expansion. Under this criterion one practical variant admits a fully mechanized Dafny proof while a second variant is refuted by an explicit counterexample that demonstrates a violation of the witness property.
What carries the argument
The witness-based correctness criterion, which requires that every value returned by depth-limited search with transposition tables be accompanied by an explicit justifying game-tree expansion.
If this is right
- Implementations of depth-limited search must choose transposition-table update rules carefully to preserve the witness property.
- A mechanized proof becomes feasible once the witness criterion is adopted as the target property.
- Counterexamples can be generated mechanically when the witness condition is violated.
- Verification artifacts for both the proof and the counterexample are executable and publicly available.
Where Pith is reading between the lines
- The same witness technique could be tried on other search enhancements such as iterative deepening or move ordering heuristics.
- Engine authors may now treat the witness criterion as an explicit invariant to maintain during future optimizations.
- The existence of a counterexample suggests that some widely used transposition-table implementations may silently return unjustified values in certain edge cases.
Load-bearing premise
The witness-based correctness criterion accurately captures what it means for a depth-limited search value to be acceptable when transposition tables are present.
What would settle it
Running the counterexample variant on the concrete position and move sequence supplied in the paper and checking whether the value it returns can or cannot be justified by a full tree expansion of the reported depth.
Figures
read the original abstract
Minimax-based search algorithms with alpha-beta pruning and transposition tables are a central component of classical game-playing engines and remain widely used in practice. Despite their widespread use, these algorithms are subtle, highly optimized, and notoriously difficult to reason about, making non-obvious errors hard to detect by testing alone. Using the Dafny verification system, we formally verify a range of minimax search algorithms, including variants with alpha-beta pruning and transposition tables. For depth-limited search with transposition tables, we introduce a witness-based correctness criterion that captures when returned values can be justified by an explicit game-tree expansion. We apply this criterion to two practical variants of depth-limited negamax with alpha-beta pruning and transposition tables: for one variant, we obtain a fully mechanized correctness proof, while for the other we construct a concrete counterexample demonstrating a violation of the proposed correctness notion. All verification artifacts, including Dafny proofs and executable Python implementations, are publicly available.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to formally verify minimax search algorithms with alpha-beta pruning and transposition tables using Dafny. For depth-limited search, it introduces a witness-based correctness criterion (that a returned value is acceptable precisely when justified by explicit game-tree expansion). One practical variant of depth-limited negamax with alpha-beta pruning and transposition tables receives a fully mechanized Dafny correctness proof; the other is shown to violate the criterion via an explicit counterexample. All Dafny proofs and executable Python implementations are released publicly.
Significance. If the results hold, the work is significant for delivering machine-checked Dafny proofs and a concrete counterexample for subtle, widely deployed algorithms in game-playing engines. The release of verification artifacts together with the falsifiable witness-based criterion provides reproducible, high-assurance evidence that strengthens trust in these implementations and raises the standard for formal methods in AI search.
minor comments (2)
- The introduction of the witness-based correctness criterion as a modeling choice (rather than a derived property) is stated clearly but would benefit from an early, self-contained illustrative example before the formal definitions.
- Cross-references between the Dafny proof modules and the corresponding Python implementations could be added to the artifact repository README to ease independent reproduction.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript and for recommending acceptance. The provided summary accurately captures the paper's contributions, including the witness-based correctness criterion, the mechanized Dafny proof for one variant, and the counterexample for the other.
Circularity Check
No significant circularity
full rationale
The paper explicitly introduces a witness-based correctness criterion as a modeling choice in the section describing the criterion, then delivers a mechanized Dafny proof that one practical variant of depth-limited negamax with alpha-beta pruning and transposition tables satisfies the criterion while constructing a concrete counterexample showing violation for the other variant. All results follow directly from the stated definitions, the Dafny semantics, and the explicitly modeled witness notion without any reduction to fitted parameters, self-citations that bear the central load, or equivalences by construction. The formalization is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Game trees are finite and acyclic with perfect information.
- standard math Dafny's verification semantics correctly capture the operational behavior of the encoded algorithms.
invented entities (1)
-
witness-based correctness criterion
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Bird, R.S., Hughes, J.: The alpha-beta algorithm: An exercise in program trans- formation. Inf. Process. Lett.24(1), 53–57 (1987). https://doi.org/10.1016/0020- 0190(87)90198-0
-
[2]
Breuker, D.: Memory versus Search in Games. Ph.d. thesis, Maastricht University, Maastricht (1998). https://doi.org/10.26481/dis.19981016db, supervised by Hen- drik van den Herik
-
[3]
Fishburn, J.P.: An optimization of alpha-beta search. SIGART Newsl.72, 29–31 (1980). https://doi.org/10.1145/1056447.1056450
-
[4]
Fishburn, J.P.: Another optimization of alpha-beta search. SIGART Newsl.84, 37–38 (1983). https://doi.org/10.1145/1056623.1056628
-
[5]
Knuth, D.E., Moore, R.W.: An analysis of alpha-beta pruning. Artif. Intell.6(4), 293–326 (1975). https://doi.org/10.1016/0004-3702(75)90019-3
-
[6]
Leino, R.: Dafny: An automatic program verifier for functional correctness. In: 16th International Conference, LPAR-16, Dakar, Senegal. pp. 348–370. Springer Berlin Heidelberg (April 2010). https://doi.org/10.1007/978-3-642-17511-4_20
-
[7]
Marsland, T.A.: A review of game-tree pruning. ICCA Journal9(1), 3–19 (1986). https://doi.org/10.3233/ICG-1986-9102
-
[8]
ACM Books, self-published (Jun 2024), https://functional-algorithms- verified.org/
Nipkow, T., et al.: Functional Data Structures and Algorithms: A Proof Assistant Approach. ACM Books, self-published (Jun 2024), https://functional-algorithms- verified.org/
work page 2024
-
[9]
Stockman, G.C.: A minimax algorithm better than alpha-beta? Artif. Intell.12(2), 179–196 (1979). https://doi.org/10.1016/0004-3702(79)90016-X
-
[10]
Wesselink, W.: Minimax verification.https://github.com/wiegerw/minimax (2025), accessed: 2025-07-16
work page 2025
-
[11]
https://en.wikipedia.org/wiki/Negamax (2025), [Online; accessed 09-June-2025]
Wikipedia contributors: Negamax — Wikipedia, the free encyclopedia. https://en.wikipedia.org/wiki/Negamax (2025), [Online; accessed 09-June-2025]
work page 2025
-
[12]
Ying, M.: Floyd-Hoare logic for quantum programs. ACM Trans. Program. Lang. Syst.33(6), 19:1–19:49 (2011). https://doi.org/10.1145/2049706.2049708 10 W. Wesselink et al. Algorithm 3Transposition-table-based search from Wikipedia [11]. Input: A game treeu, an alpha-beta window(α, β)and a transposition tableT. Output: A valuevaluesuch thatis-negamax-tt-valu...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.