On the Geometry of Games and their Solvers

David Mguni; Julian Ma; Yaqi Sun

arxiv: 2605.29919 · v1 · pith:QXG3WASHnew · submitted 2026-05-28 · 💻 cs.AI · cs.MA

On the Geometry of Games and their Solvers

Yaqi Sun , Julian Ma , David Mguni This is my paper

Pith reviewed 2026-06-29 07:46 UTC · model grok-4.3

classification 💻 cs.AI cs.MA

keywords game theoryequilibrium computationsolver adaptationgeometry of gamesadaptive algorithmsminimax optimizationGAN trainingstructure recognition

0 comments

The pith

A learned low-dimensional map of games reveals continuous regions where particular solver dynamics succeed or overlap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that equilibrium computation is governed by latent structural properties that place games along a continuous geometry aligned with solver behaviour, rather than isolated discrete classes. A structure recogniser embeds each game into a low-dimensional representation that captures optimisation dynamics, after which a policy selects or mixes primitive solver mechanisms and a residual term corrects locally. This produces both an adaptive solver that adjusts across regimes and a cartography in which games with similar solver needs cluster together. The work shows that fixed primitives systematically mismatch regimes while the learned space organises games according to actual solver performance. If the geometry exists and is learnable, equilibrium finding becomes the joint task of discovering solver primitives and locating games inside the resulting map.

Core claim

Solvability is governed by latent structural properties defining a continuous solver-aligned geometry of games. We formalise this perspective through structure-aware solver synthesis: a learned structure recogniser maps each game to a low-dimensional solver-aligned representation, and a policy maps this representation to effective primitive mechanisms, adapting solver behaviour across regimes. A bounded residual acts as a local corrector and diagnostic signal. The framework yields both an adaptive solver and an analytical lens in which games with similar optimisation dynamics cluster together, revealing continuous regions of algorithmic validity and overlapping solver behaviour.

What carries the argument

The learned structure recogniser that embeds games into a low-dimensional solver-aligned representation used to select or combine primitive mechanisms.

If this is right

Fixed solver primitives exhibit systematic regime mismatch across the game landscape.
Mixtures of primitives become necessary in overlapping regimes rather than any single dominant solver.
The learned representation organises game space into a structured cartography aligned with observed solver behaviour.
A bounded residual term serves as both local corrector and signal for incomplete solver bases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same representation could be used to interpolate solver behaviour for entirely new game families without retraining from scratch.
Training dynamics in GANs or other minimax learners might be diagnosed by projecting the underlying game into the same geometry.
If the geometry is stable, one could search for new primitive mechanisms by optimising directly in the representation space rather than hand-designing them.

Load-bearing premise

That games possess latent structural properties which can be captured in a low-dimensional representation sufficient to predict which solver dynamics will be effective on them.

What would settle it

Measuring whether distances in the learned representation fail to predict relative performance of different solvers on a diverse held-out collection of games.

Figures

Figures reproduced from arXiv: 2605.29919 by David Mguni, Julian Ma, Yaqi Sun.

**Figure 2.** Figure 2: Overview of the proposed structure-aware solver synthesis framework. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Solver behaviour across the game-structure spectrum ( [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Oracle-best primitive by diagnostic bin. Each heat map shows regions where different [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Oracle-best primitive across pairwise diagnostic game features. Each panel bins games by [PITH_FULL_IMAGE:figures/full_fig_p014_5.png] view at source ↗

**Figure 6.** Figure 6: Pure-solver sensitivity in learned zˆ space. Each panel shows one pure primitive over the same PCA projection of zˆ. Point colour denotes the oracle-relative retention score. The spatially coherent performance patterns show that zˆ encodes continuous solver-performance sensitivity. The weight maps are spatially organised. GDA receives broad moderate weight over a large central region; OMD and optimistic up… view at source ↗

**Figure 7.** Figure 7: Learned primitive-weight landscapes in PCA [PITH_FULL_IMAGE:figures/full_fig_p016_7.png] view at source ↗

**Figure 8.** Figure 8: Mapping from game features to learned zˆ and solver clusters. The figure from left to right: raw game features provide the original structural coordinates; the recogniser maps these games into the learned zˆ space; the policy head then assigns solver regions or dominant primitive clusters. games outside the focal primitive are drawn faintly for spatial reference. One structural observations stand out. Win … view at source ↗

**Figure 9.** Figure 9: Per-oracle-primitive win/tie/loss regions in shared PCA( [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗

**Figure 10.** Figure 10: Example transition game where the learned convex mixture outperforms its constituent [PITH_FULL_IMAGE:figures/full_fig_p017_10.png] view at source ↗

**Figure 11.** Figure 11: Residual diagnostic map in zˆ space over N = 2,000 probe games. (a) Base game difficulty without residual correction, measured by validation AUC. Spatial continuity, measured by Moran’s I = 0.71, shows that difficult games form coherent regions in latent space. (b) Residual activation intensity. The residual response is also spatially coherent, with Moran’s I = 0.56, indicating that the residual reacts to… view at source ↗

read the original abstract

A central challenge in game theory and learning systems such as GANs is understanding which algorithms can efficiently compute equilibria across the heterogeneous landscape of games. Equilibrium computation is typically studied solver by solver and game class by game class, yielding strong local guarantees but a fragmented view of solver behaviour. Existing discrete taxonomies often provide an incomplete account of where algorithms succeed. We study this problem through a solver-game map linking games to effective solver dynamics. Classical theory identifies isolated regions of this map but provides limited insight into intermediate or overlapping regimes, suggesting that solvability is governed by latent structural properties defining a continuous solver-aligned geometry of games. We formalise this perspective through structure-aware solver synthesis. A learned structure recogniser maps each game to a low-dimensional solver-aligned representation, and a policy maps this representation to effective primitive mechanisms, adapting solver behaviour across regimes. This reveals regions where particular solver dynamics are effective and where mixtures of primitives are required rather than a single dominant solver. A bounded residual acts as a local corrector and diagnostic signal for incomplete solver bases or representations. The framework yields both an adaptive solver and an analytical lens: games with similar optimisation dynamics cluster together, revealing continuous regions of algorithmic validity and overlapping solver behaviour. Empirically, we show that fixed primitives exhibit systematic regime mismatch, while the learned representation organises game space into a structured cartography aligned with solver behaviour. These results suggest viewing equilibrium computation as the joint problem of learning solver mechanisms and mapping the geometry of solvability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper sketches a learned continuous geometry for mapping games to solver regimes but supplies no math, training details, or results to show the geometry is independent rather than fitted to the signal.

read the letter

The core idea is a data-driven map that puts games into a low-dimensional space where proximity reflects similar solver performance, then uses that to pick or mix primitives. This is a step past the usual discrete taxonomies of game classes and solver guarantees. It treats equilibrium finding as joint structure learning plus mechanism selection, with a residual term for when the base set falls short.

What stands out is the attempt to make the fragmented solver landscape more continuous and adaptive. The structure recogniser plus policy setup could, in principle, let a system handle intermediate regimes where no single classical solver dominates. That framing is cleaner than listing isolated results.

The soft spot is that nothing in the abstract shows the representation is discovering independent latent properties instead of reproducing the training labels on solver success. Without the objective function, the game distribution, or any held-out validation, the clustering could be tautological. The bounded residual is mentioned as a diagnostic but not derived or tested. No equations, no error metrics, no comparison to baselines appear here.

This is for people working on meta-algorithms for games or equilibrium computation who want a new organising lens. It is not yet ready for readers who need reproducible claims or falsifiable predictions.

If the full paper contains concrete derivations, reproducible experiments, and checks against circularity, it deserves referee time. On the abstract alone the contribution is mostly programmatic.

Referee Report

3 major / 0 minor

Summary. The manuscript proposes a framework called structure-aware solver synthesis for equilibrium computation across heterogeneous games. A learned structure recogniser maps each game to a low-dimensional solver-aligned representation; a policy then maps this representation to combinations of primitive solver mechanisms. The approach is claimed to yield both an adaptive solver and an analytical lens that clusters games according to optimisation dynamics, revealing continuous regions of algorithmic validity and overlapping solver regimes. Fixed primitives are said to exhibit systematic regime mismatch, while the learned representation produces a structured cartography aligned with solver behaviour. A bounded residual is introduced as a local corrector and diagnostic signal.

Significance. If the central claims are substantiated with independent validation, the work could provide a valuable shift from discrete taxonomies of games and solvers toward a continuous geometry of solvability. This would be useful for designing adaptive equilibrium-finding methods in AI systems such as GANs. The attempt to formalise latent structural properties that predict solver dynamics is a constructive direction, though the manuscript supplies no derivations, datasets, or metrics to evaluate whether the geometry is discovered rather than constructed.

major comments (3)

[Abstract] Abstract: the claim that 'the learned representation organises game space into a structured cartography aligned with solver behaviour' is presented without any description of the training objective, loss function, datasets, or quantitative metrics used to establish alignment. This makes it impossible to determine whether the alignment reflects independent latent properties or is tautological by construction.
[Abstract] Abstract: the 'solver-aligned representation' is defined via the output of the learned model itself, with no mention of external benchmarks or validation separate from the solver-success signal used in training. This directly undermines the central assumption that latent structural properties exist and are captured independently.
[Abstract] Abstract: the 'bounded residual' is introduced as 'a local corrector and diagnostic signal for incomplete solver bases or representations' but receives no mathematical definition, update rule, or empirical illustration, leaving a load-bearing component of the framework undefined.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed comments, which correctly identify areas where the abstract is insufficiently precise. We agree that the claims require more supporting detail and will revise both the abstract and relevant sections of the manuscript to supply the missing methodological information, definitions, and validation procedures.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that 'the learned representation organises game space into a structured cartography aligned with solver behaviour' is presented without any description of the training objective, loss function, datasets, or quantitative metrics used to establish alignment. This makes it impossible to determine whether the alignment reflects independent latent properties or is tautological by construction.

Authors: We agree that the abstract does not supply these details. The revised abstract will briefly state the training objective (a supervised loss aligning embeddings to solver performance), the dataset construction, and the quantitative alignment metrics (e.g., correlation between embedding geometry and solver success rates). This addition will make clear how the claimed cartography is evaluated. revision: yes
Referee: [Abstract] Abstract: the 'solver-aligned representation' is defined via the output of the learned model itself, with no mention of external benchmarks or validation separate from the solver-success signal used in training. This directly undermines the central assumption that latent structural properties exist and are captured independently.

Authors: We accept the point that the abstract provides no indication of separate validation. The revision will describe the input features to the structure recogniser (payoff-derived structural descriptors) and will add mention of held-out validation on solver types and game classes not seen during training, to demonstrate that the representation captures properties beyond the direct training signal. revision: yes
Referee: [Abstract] Abstract: the 'bounded residual' is introduced as 'a local corrector and diagnostic signal for incomplete solver bases or representations' but receives no mathematical definition, update rule, or empirical illustration, leaving a load-bearing component of the framework undefined.

Authors: We agree that the abstract leaves the bounded residual without definition or illustration. The revised version will include a concise mathematical definition, the update rule, and reference to empirical results demonstrating its corrective and diagnostic role. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained

full rationale

The abstract and description outline a learned structure recogniser that produces a solver-aligned representation used to map to solver primitives, with empirical clustering as output. No equations, training objectives, or self-citations are quoted that would reduce the representation or predictions to the inputs by construction. No load-bearing self-citation chains or fitted parameters renamed as independent predictions appear. The framework is presented as jointly learning mechanisms and geometry, but without explicit reduction to tautology in the given text, the central claims retain independent empirical content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The central claim rests on the existence of learnable latent structural properties that form a continuous solver-aligned geometry; this is an unproven domain assumption introduced to motivate the framework.

axioms (1)

domain assumption Games possess latent structural properties that define a continuous solver-aligned geometry
Invoked when formalising the perspective that solvability is governed by these properties rather than discrete classes.

invented entities (2)

solver-aligned representation no independent evidence
purpose: Low-dimensional embedding of games that predicts effective solver dynamics
Core output of the structure recogniser; no independent evidence provided.
bounded residual no independent evidence
purpose: Local corrector and diagnostic signal for incomplete solver bases
Introduced as part of the framework; no independent evidence provided.

pith-pipeline@v0.9.1-grok · 5792 in / 1345 out tokens · 30260 ms · 2026-06-29T07:46:09.940834+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

28 extracted references · 6 canonical work pages · 1 internal anchor

[1]

Andrychowicz, M

M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. de Freitas. Learning to learn by gradient descent by gradient descent. InAdvances in Neural Information Processing Systems, volume 29, 2016. URL https://proceedings.neurips. cc/paper/2016/hash/fb87582825f9d28a8d42c5e5e5e8b23d-Abstract.html

2016
[2]

Antonakopoulos, E

K. Antonakopoulos, E. V . Belmega, and P. Mertikopoulos. Adaptive extra-gradient methods for min-max optimization and games.arXiv preprint arXiv:2010.12100, 2020

work page arXiv 2010
[3]

Balduzzi, S

D. Balduzzi, S. Racaniere, J. Martens, J. Foerster, K. Tuyls, and T. Graepel. The mechanics of n-player differentiable games. InProceedings of the 35th International Conference on Machine Learning, volume 80 ofProceedings of Machine Learning Research, pages 354–363. PMLR,
[4]

URLhttps://proceedings.mlr.press/v80/balduzzi18a.html
[5]

Beck and M

A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization.Operations Research Letters, 31(3):167–175, 2003

2003
[6]

U. Berger. Brown’s original fictitious play.Journal of Economic Theory, 135(1):572–578, 2007

2007
[7]

IVOA Recommendation: Simple Line Access Protocol Version 1.0

O. Candogan, I. Menache, A. Ozdaglar, and P. A. Parrilo. Flows and decompositions of games: Harmonic and potential games.Mathematics of Operations Research, 36(3):474–503, 2011. doi: 10.1287/moor.1110.0500. URLhttps://doi.org/10.1287/moor.1110.0500

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1287/moor.1110.0500 2011
[8]

Candogan, A

O. Candogan, A. Ozdaglar, and P. A. Parrilo. Near-potential games: Geometry and dynamics. ACM Transactions on Economics and Computation (TEAC), 1(2):1–32, 2013

2013
[9]

X. Chen, X. Deng, et al. Settling the complexity of two-player nash equilibrium. InFOCS, volume 6, pages 261–272, 2006

2006
[10]

Y . Chen, X. Deng, C. Li, D. Mguni, J. Wang, X. Yan, and Y . Yang. On the convergence of fictitious play: A decomposition approach.arXiv preprint arXiv:2205.01469, 2022

work page arXiv 2022
[11]

Daskalakis and I

C. Daskalakis and I. Panageas. The limit points of (optimistic) gradient descent in min-max optimization.Advances in neural information processing systems, 31, 2018

2018
[12]

Daskalakis, A

C. Daskalakis, A. Ilyas, V . Syrgkanis, and H. Zeng. Training GANs with optimism. In International Conference on Learning Representations, 2018. URL https://openreview. net/forum?id=SJJySbbAZ

2018
[13]

X. Deng, N. Li, D. Mguni, J. Wang, and Y . Yang. On the complexity of computing markov perfect equilibrium in general-sum stochastic games.National Science Review, 10(1):nwac256, 2023

2023
[14]

Gidel, H

G. Gidel, H. Berard, G. Vignoud, P. Vincent, and S. Lacoste-Julien. A variational inequality perspective on generative adversarial networks. InInternational Conference on Learning Representations, 2019. URLhttps://openreview.net/forum?id=r1laEnA5Ym. 11

2019
[15]

Krishna and T

V . Krishna and T. Sjöström. On the convergence of fictitious play.Mathematics of Operations Research, 23(2):479–511, 1998

1998
[16]

Q. D. La, Y . H. Chew, and B.-H. Soong. Potential games. InPotential Game Theory: Applica- tions in Radio Resource Allocation, pages 23–69. Springer, 2016

2016
[17]

D. S. Leslie, S. Perkins, and Z. Xu. Best-response dynamics in zero-sum stochastic games. Journal of Economic Theory, 189:105095, 2020

2020
[18]

H. Li, W. Huang, Z. Duan, D. H. Mguni, K. Shao, J. Wang, and X. Deng. A survey on algorithms for nash equilibria in finite normal-form games.Computer Science Review, 51:100613, 2024

2024
[19]

A. Matsui. Best response dynamics and socially stable strategies.Journal of Economic Theory, 57(2):343–362, 1992

1992
[20]

Mokhtari, A

A. Mokhtari, A. Ozdaglar, and S. Pattathil. A unified analysis of extra-gradient and opti- mistic gradient methods for saddle point problems: Proximal point approach. InInternational Conference on Artificial Intelligence and Statistics, pages 1497–1507. PMLR, 2020

2020
[21]

Monderer and L

D. Monderer and L. S. Shapley. Fictitious play property for games with identical interests. Journal of economic theory, 68(1):258–265, 1996

1996
[22]

Monderer and L

D. Monderer and L. S. Shapley. Potential games.Games and Economic Behavior, 14(1): 124–143, 1996. doi: 10.1006/game.1996.0044. URL https://doi.org/10.1006/game. 1996.0044

work page doi:10.1006/game.1996.0044 1996
[23]

Nemirovski

A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004. doi: 10.1137/S1052623403425629. URL https://doi.org/10.1137/S1052623403425629

work page doi:10.1137/s1052623403425629 2004
[24]

C. H. Papadimitriou and T. Roughgarden. Computing equilibria in multi-player games. In SODA, volume 5, pages 82–91, 2005

2005
[25]

Schulman, S

J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz. Trust region policy optimization. InInternational conference on machine learning, pages 1889–1897. PMLR, 2015

2015
[26]

Sparrow, S

C. Sparrow, S. van Strien, and C. Harris. Fictitious play in 3× 3 games: The transition between periodic and chaotic behaviour.Games and Economic Behavior, 63(1):259–291, 2008

2008
[27]

Swenson, R

B. Swenson, R. Murray, and S. Kar. On best-response dynamics in potential games.SIAM Journal on Control and Optimization, 56(4):2734–2767, 2018

2018
[28]

Y . Yang, C. Ma, Z. Ding, S. McAleer, C. Jin, J. Wang, and T. Sandholm. Game-theoretic multiagent reinforcement learning.arXiv preprint arXiv:2011.00583, 2020. 12 Table 3: Summary results on the 10D payoff benchmark. Gap Closure uses the same definition as Table 1. Method Val AUC↓Val Final↓Gap Closure Per-game oracle 0.02512 0.00696 100.0% Learned (soft) ...

work page arXiv 2011

[1] [1]

Andrychowicz, M

M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. de Freitas. Learning to learn by gradient descent by gradient descent. InAdvances in Neural Information Processing Systems, volume 29, 2016. URL https://proceedings.neurips. cc/paper/2016/hash/fb87582825f9d28a8d42c5e5e5e8b23d-Abstract.html

2016

[2] [2]

Antonakopoulos, E

K. Antonakopoulos, E. V . Belmega, and P. Mertikopoulos. Adaptive extra-gradient methods for min-max optimization and games.arXiv preprint arXiv:2010.12100, 2020

work page arXiv 2010

[3] [3]

Balduzzi, S

D. Balduzzi, S. Racaniere, J. Martens, J. Foerster, K. Tuyls, and T. Graepel. The mechanics of n-player differentiable games. InProceedings of the 35th International Conference on Machine Learning, volume 80 ofProceedings of Machine Learning Research, pages 354–363. PMLR,

[4] [4]

URLhttps://proceedings.mlr.press/v80/balduzzi18a.html

[5] [5]

Beck and M

A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization.Operations Research Letters, 31(3):167–175, 2003

2003

[6] [6]

U. Berger. Brown’s original fictitious play.Journal of Economic Theory, 135(1):572–578, 2007

2007

[7] [7]

IVOA Recommendation: Simple Line Access Protocol Version 1.0

O. Candogan, I. Menache, A. Ozdaglar, and P. A. Parrilo. Flows and decompositions of games: Harmonic and potential games.Mathematics of Operations Research, 36(3):474–503, 2011. doi: 10.1287/moor.1110.0500. URLhttps://doi.org/10.1287/moor.1110.0500

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1287/moor.1110.0500 2011

[8] [8]

Candogan, A

O. Candogan, A. Ozdaglar, and P. A. Parrilo. Near-potential games: Geometry and dynamics. ACM Transactions on Economics and Computation (TEAC), 1(2):1–32, 2013

2013

[9] [9]

X. Chen, X. Deng, et al. Settling the complexity of two-player nash equilibrium. InFOCS, volume 6, pages 261–272, 2006

2006

[10] [10]

Y . Chen, X. Deng, C. Li, D. Mguni, J. Wang, X. Yan, and Y . Yang. On the convergence of fictitious play: A decomposition approach.arXiv preprint arXiv:2205.01469, 2022

work page arXiv 2022

[11] [11]

Daskalakis and I

C. Daskalakis and I. Panageas. The limit points of (optimistic) gradient descent in min-max optimization.Advances in neural information processing systems, 31, 2018

2018

[12] [12]

Daskalakis, A

C. Daskalakis, A. Ilyas, V . Syrgkanis, and H. Zeng. Training GANs with optimism. In International Conference on Learning Representations, 2018. URL https://openreview. net/forum?id=SJJySbbAZ

2018

[13] [13]

X. Deng, N. Li, D. Mguni, J. Wang, and Y . Yang. On the complexity of computing markov perfect equilibrium in general-sum stochastic games.National Science Review, 10(1):nwac256, 2023

2023

[14] [14]

Gidel, H

G. Gidel, H. Berard, G. Vignoud, P. Vincent, and S. Lacoste-Julien. A variational inequality perspective on generative adversarial networks. InInternational Conference on Learning Representations, 2019. URLhttps://openreview.net/forum?id=r1laEnA5Ym. 11

2019

[15] [15]

Krishna and T

V . Krishna and T. Sjöström. On the convergence of fictitious play.Mathematics of Operations Research, 23(2):479–511, 1998

1998

[16] [16]

Q. D. La, Y . H. Chew, and B.-H. Soong. Potential games. InPotential Game Theory: Applica- tions in Radio Resource Allocation, pages 23–69. Springer, 2016

2016

[17] [17]

D. S. Leslie, S. Perkins, and Z. Xu. Best-response dynamics in zero-sum stochastic games. Journal of Economic Theory, 189:105095, 2020

2020

[18] [18]

H. Li, W. Huang, Z. Duan, D. H. Mguni, K. Shao, J. Wang, and X. Deng. A survey on algorithms for nash equilibria in finite normal-form games.Computer Science Review, 51:100613, 2024

2024

[19] [19]

A. Matsui. Best response dynamics and socially stable strategies.Journal of Economic Theory, 57(2):343–362, 1992

1992

[20] [20]

Mokhtari, A

A. Mokhtari, A. Ozdaglar, and S. Pattathil. A unified analysis of extra-gradient and opti- mistic gradient methods for saddle point problems: Proximal point approach. InInternational Conference on Artificial Intelligence and Statistics, pages 1497–1507. PMLR, 2020

2020

[21] [21]

Monderer and L

D. Monderer and L. S. Shapley. Fictitious play property for games with identical interests. Journal of economic theory, 68(1):258–265, 1996

1996

[22] [22]

Monderer and L

D. Monderer and L. S. Shapley. Potential games.Games and Economic Behavior, 14(1): 124–143, 1996. doi: 10.1006/game.1996.0044. URL https://doi.org/10.1006/game. 1996.0044

work page doi:10.1006/game.1996.0044 1996

[23] [23]

Nemirovski

A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004. doi: 10.1137/S1052623403425629. URL https://doi.org/10.1137/S1052623403425629

work page doi:10.1137/s1052623403425629 2004

[24] [24]

C. H. Papadimitriou and T. Roughgarden. Computing equilibria in multi-player games. In SODA, volume 5, pages 82–91, 2005

2005

[25] [25]

Schulman, S

J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz. Trust region policy optimization. InInternational conference on machine learning, pages 1889–1897. PMLR, 2015

2015

[26] [26]

Sparrow, S

C. Sparrow, S. van Strien, and C. Harris. Fictitious play in 3× 3 games: The transition between periodic and chaotic behaviour.Games and Economic Behavior, 63(1):259–291, 2008

2008

[27] [27]

Swenson, R

B. Swenson, R. Murray, and S. Kar. On best-response dynamics in potential games.SIAM Journal on Control and Optimization, 56(4):2734–2767, 2018

2018

[28] [28]

Y . Yang, C. Ma, Z. Ding, S. McAleer, C. Jin, J. Wang, and T. Sandholm. Game-theoretic multiagent reinforcement learning.arXiv preprint arXiv:2011.00583, 2020. 12 Table 3: Summary results on the 10D payoff benchmark. Gap Closure uses the same definition as Table 1. Method Val AUC↓Val Final↓Gap Closure Per-game oracle 0.02512 0.00696 100.0% Learned (soft) ...

work page arXiv 2011