
arxiv: 2605.19392 · v1 · pith:OZ2OVOZZnew · submitted 2026-05-19 · 💻 cs.LG

Understanding Dynamics of Adam in Zero-Sum Games: An ODE Approach

Pith reviewed 2026-05-20 07:47 UTC · model grok-4.3

classification 💻 cs.LG
keywords Adam · zero-sum games · ODE analysis · GAN training · momentum parameters · local convergence · implicit regularization

The pith

In zero-sum games the first- and second-order momentum terms of Adam-DA reverse the convergence roles they play in minimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives ordinary differential equations that serve as continuous-time limits of the discrete Adam-DA updates used for zero-sum games. These ODEs make it possible to analyze local convergence and implicit gradient regularization in a tractable way. The central result is that the first-order momentum parameter slows convergence while the second-order momentum parameter accelerates it, exactly opposite to the well-known effects in ordinary minimization. The predictions are checked by running GAN training on several architectures and datasets. If the ODE approximation holds, the same reversed momentum behavior should appear in any zero-sum setting where Adam-DA is applied.

Core claim

By taking the continuous-time limit of the Adam-DA iterates, the authors obtain a system of ODEs whose equilibria and stability properties can be studied directly. Analysis of these ODEs shows that raising the first-order momentum coefficient destabilizes the saddle while raising the second-order coefficient stabilizes it; the signs of these effects are reversed relative to the standard minimization case. The same ODEs also reveal an implicit regularization term whose form depends on the momentum parameters in the opposite manner from gradient descent.

What carries the argument

The system of ordinary differential equations obtained as the continuous-time limit of the Adam-DA discrete updates.
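For orientation, the discrete updates whose limit is being taken can be written out. This is a sketch under the assumption that Adam-DA is standard Adam applied simultaneously, with descent on the minimizing player $x$ and ascent on the maximizing player $y$ of $\min_x \max_y f(x, y)$. The symbols $\eta$ (step size), $\epsilon$, $m$, $v$, and the names $\beta_1, \beta_2$ follow the usual Adam convention and the referee report further down this page; the paper's own notation may differ (its figure captions use $\beta$ and $\rho$). All squares, roots, and divisions are elementwise.

```latex
\begin{aligned}
g_k &= \begin{pmatrix} \nabla_x f(x_k, y_k) \\ -\nabla_y f(x_k, y_k) \end{pmatrix}, \\
m_{k+1} &= \beta_1 m_k + (1 - \beta_1)\, g_k, \\
v_{k+1} &= \beta_2 v_k + (1 - \beta_2)\, g_k^{2}, \\
\begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix}
  &= \begin{pmatrix} x_k \\ y_k \end{pmatrix}
   - \eta\, \frac{m_{k+1} / (1 - \beta_1^{k+1})}{\sqrt{v_{k+1} / (1 - \beta_2^{k+1})} + \epsilon}.
\end{aligned}
```

The sign flip in the $y$-block of $g_k$ is what turns the descent step into an ascent step for the maximizing player. The continuous-time system would then arise as $\eta \to 0$ under a rescaling of the momentum parameters; the specific scaling is the paper's and is not reproduced here.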

If this is right

  • Local convergence of Adam-DA to a saddle can be read off from the eigenvalues of the linearized ODE.
  • The implicit regularization induced by Adam-DA in games takes the opposite functional form from the regularization induced in minimization.
  • Tuning guidelines for Adam-DA in GANs should invert the usual momentum recommendations.
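The first bullet can be made concrete on a toy vector field. The sketch below does not use the paper's Adam-DA ODE, which is not reproduced on this page. It substitutes the plain gradient descent-ascent flow on a hypothetical quadratic game f(x, y) = a·x²/2 + b·x·y − c·y²/2, purely to show what "reading stability off the eigenvalues" means. The function names and the coefficients a, b, c are illustrative assumptions.

```python
import numpy as np


def gda_flow_jacobian(a, b, c):
    """Jacobian of the flow x' = -df/dx, y' = +df/dy for the assumed game
    f(x, y) = a*x**2/2 + b*x*y - c*y**2/2.

    The vector field is linear, so the Jacobian is the same everywhere,
    including at the equilibrium (0, 0).
    """
    return np.array([[-a, -b],
                     [b, -c]], dtype=float)


def is_locally_stable(jacobian, tol=1e-12):
    """Return (stable, eigenvalues).

    The equilibrium of a linearized ODE is asymptotically stable when every
    eigenvalue of the Jacobian has strictly negative real part.
    """
    eigvals = np.linalg.eigvals(jacobian)
    return bool(np.all(eigvals.real < -tol)), eigvals


# Strongly-convex-strongly-concave case: spiral sink at the saddle.
stable_scsc, eig_scsc = is_locally_stable(gda_flow_jacobian(1.0, 2.0, 1.0))

# Purely bilinear case: eigenvalues on the imaginary axis, so no decay.
stable_bilinear, eig_bilinear = is_locally_stable(gda_flow_jacobian(0.0, 1.0, 0.0))
```

For the paper's claim, the same check would be run on the Jacobian of the Adam-DA ODE, with the momentum parameters entering the matrix entries.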

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same ODE construction could be applied to other adaptive optimizers such as RMSProp-DA to check whether the momentum reversal is specific to Adam or generic.
  • If the reversal persists in non-convex zero-sum problems, it may explain why small first-order momentum values are often preferred in practice for GAN training.
  • The ODE view suggests a possible continuous-time schedule for the momentum coefficients that could improve stability without changing the discrete algorithm.

Load-bearing premise

The discrete Adam-DA steps with typical learning rates and momentum values stay close enough to their continuous ODE trajectories that local stability and regularization results carry over.
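Why this premise needs checking can be seen on a much simpler pair than Adam-DA and its ODE. The sketch below is a stand-in, not the paper's setting: it uses plain simultaneous gradient descent-ascent on the bilinear game f(x, y) = x·y and compares it with the exact solution of its flow x' = −y, y' = x. The flow is a rotation and keeps the distance to the saddle constant, while each discrete step multiplies that distance by sqrt(1 + η²). The function names and the starting point are illustrative assumptions.

```python
import numpy as np


def gda_discrete_norms(eta, steps, z0=(1.0, 0.0)):
    """Distance to the saddle after each simultaneous GDA step on f = x*y."""
    x, y = z0
    norms = []
    for _ in range(steps):
        # x descends along df/dx = y, y ascends along df/dy = x.
        x, y = x - eta * y, y + eta * x
        norms.append(np.hypot(x, y))
    return np.array(norms)


def gda_flow_norms(eta, steps, z0=(1.0, 0.0)):
    """Distance to the saddle along the exact flow x' = -y, y' = x,
    sampled at times eta, 2*eta, ..., steps*eta."""
    t = eta * np.arange(1, steps + 1)
    x0, y0 = z0
    x = x0 * np.cos(t) - y0 * np.sin(t)
    y = x0 * np.sin(t) + y0 * np.cos(t)
    return np.hypot(x, y)


# Same time horizon T = 10 at two step sizes.
gap_coarse = gda_discrete_norms(0.1, 100)[-1] - gda_flow_norms(0.1, 100)[-1]
gap_fine = gda_discrete_norms(0.001, 10_000)[-1] - gda_flow_norms(0.001, 10_000)[-1]
```

Over a fixed time horizon the gap shrinks as η decreases, which is the sense in which the flow approximates the iterates; at any fixed η the two still differ qualitatively over long horizons. The premise above asks for the analogous closeness between Adam-DA and its ODE at practical step sizes and momentum values.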

What would settle it

Run Adam-DA on a simple bilinear zero-sum game with a known saddle and measure whether increasing the first-order momentum visibly slows convergence and whether increasing the second-order momentum visibly speeds it up; a reversal of either trend would contradict the claim.
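A minimal sketch of such a test follows. It assumes Adam-DA means standard bias-corrected Adam applied simultaneously to both players of f(x, y) = x·y, with the saddle at the origin. The function name, default step size, starting point, and the β1/β2 naming are illustrative choices, not taken from the paper.

```python
import numpy as np


def adam_da_bilinear(beta1, beta2, lr=1e-3, steps=5000, eps=1e-8, z0=(1.0, 1.0)):
    """Simultaneous Adam descent-ascent on the assumed game f(x, y) = x * y.

    Returns the distance to the saddle (0, 0) after each step.
    """
    z = np.array(z0, dtype=float)
    m = np.zeros(2)  # first-moment (momentum) estimate
    v = np.zeros(2)  # second-moment estimate
    dist = np.empty(steps)
    for k in range(1, steps + 1):
        x, y = z
        # Stacked direction: x descends along df/dx = y,
        # y ascends along df/dy = x, hence the sign flip on the y-block.
        g = np.array([y, -x])
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g ** 2
        m_hat = m / (1.0 - beta1 ** k)  # bias correction
        v_hat = v / (1.0 - beta2 ** k)
        z = z - lr * m_hat / (np.sqrt(v_hat) + eps)
        dist[k - 1] = np.linalg.norm(z)
    return dist


def sweep_beta1(beta1_values, beta2=0.999, **kwargs):
    """Final distance to the saddle for each first-order momentum value."""
    return {b1: adam_da_bilinear(b1, beta2, **kwargs)[-1] for b1 in beta1_values}
```

Calling `sweep_beta1([0.0, 0.5, 0.9])`, and an analogous sweep over `beta2` with `beta1` fixed, gives the two trends the test asks about. No outcome is asserted here; the point of the experiment is to observe which way the final distances move.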

Figures

Figures reproduced from arXiv: 2605.19392 by Weiming Ou, Xiao Wang, Yi Feng.

Figure 1: Trajectories of Adam-DA, Continuous Adam-DA, and SignGDA-flow on three test functions from (Compagnoni et al., 2024b). Continuous Adam-DA closely approximates Adam-DA, especially in 1(b) and 1(c), where SignGDA-flow either diverges or approaches a different equilibrium while the trajectories of the other two methods remain similar. More details are provided in Appendix B.1.
Figure 2: Numerical experiments on quadratic test functions for the local convergence of …
Figure 3: The ℓ1 norm of gradients of Adam-DA with varying β and ρ during GAN training. Datasets: CIFAR-10 and STL-10. Architectures: ResNet and CNN. As shown in 3(a), 3(c), 3(e), and 3(g), smaller β values result in smaller gradient norms. According to 3(b), 3(d), 3(f), and 3(h), larger ρ values also lead to smaller gradient norms. Both findings support the thesis.
Figure 4: Inception Score for the corresponding experimental settings in Figure …
Figure 5: Figures 5(a), 5(b), and 5(c) show the distances between the two continuous-time models and Adam, with results averaged over 30 random initial conditions. In the following …
Figure 6: Additional experiments with different parameters.
Figure 7: Effect of ϵ.
Figure 8: Self-Attention GAN experiments on CelebA, evaluated by FID. In …
Figure 9: In this figure, we reproduce one set of the experimental results from Section 5 of the submission on CNN GANs trained on the CIFAR-10 dataset. We evaluate performance using FID and include a comparison with the optimistic adaptive method. The conclusion is the same as that in …
Figure 10: 2D sweep over (β, ρ) jointly. Each figure shows the final cumulative average gradient norms over 25 GAN training runs. We observe that the upper-left corner of each figure exhibits smaller gradient norms than the lower-right corner, indicating that smaller β and larger ρ guide the optimization trajectories toward flatt…
Figure 11: Sample images generated by the models trained in …
Figure 12: Sample images for different β. Architecture: ResNet. Dataset: CIFAR-10.
Figure 13: Sample images for different ρ. Architecture: ResNet. Dataset: CIFAR-10. (a) β = −0.3, ρ = 0.9 (b) β = −0.2, ρ = 0.9 (c) β = 0.0, ρ = 0.9 (d) β = 0.2, ρ = 0.9 (e) β = 0.3, ρ = 0.9 (f) β = 0.5, ρ = 0.9
Figure 14: Sample images for different β. Architecture: CNN. Dataset: STL-10.
Figure 15: Sample images for different ρ. Architecture: CNN. Dataset: STL-10.
Original abstract

The remarkable success of Adam in training neural networks has naturally led to the widespread use of its descent-ascent counterpart, Adam-DA, for solving zero-sum games. Despite its popularity in practice, a rigorous theoretical understanding of Adam-DA still lags behind. In this paper, we derive ordinary differential equations (ODEs) that serve as continuous-time limits of Adam-DA. These ODEs closely approximate the discrete-time dynamics of Adam-DA, providing a tractable analytical framework for understanding its behavior in zero-sum games. Using this ODE approach, we investigate two fundamental aspects of Adam-DA: local convergence and implicit gradient regularization. Our analysis reveals that the roles of the first- and second-order momentum parameters in zero-sum games are exactly the opposite of their well-documented effects in minimization problems. We validate these predictions through GAN experiments across multiple architectures and datasets, demonstrating the practical implications of this reversed momentum effect.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper derives ordinary differential equation (ODE) limits for the discrete Adam-DA algorithm applied to zero-sum games. These ODEs are used to analyze local convergence and implicit gradient regularization, leading to the claim that the first-order momentum parameter (β1) and second-order momentum parameter (β2) play exactly opposite roles compared to their effects in standard minimization problems. The analysis is validated through qualitative GAN experiments on multiple architectures and datasets.

Significance. If the ODE approximation is faithful at practical step sizes and momentum values, the work supplies a useful continuous-time framework for understanding momentum in non-monotone settings and could inform hyperparameter selection for GAN training. The explicit reversal result, if rigorously supported, distinguishes this contribution from prior ODE analyses of Adam in convex or minimization settings.

major comments (2)
  1. [§3] §3 (ODE derivation): The continuous-time limit is obtained via standard Euler discretization and momentum rescaling, but the manuscript provides no explicit error bounds, timescale-separation conditions, or verification that the approximation remains valid for β2 ≈ 0.999 and learning rates ≈ 10^{-3} when the underlying vector field is non-monotone and oscillatory. This assumption is load-bearing for transferring local convergence and regularization conclusions from the ODE to the discrete Adam-DA updates.
  2. [§5] §5 (Experiments): The GAN results are presented without quantitative metrics (e.g., FID scores, convergence rates), ablation controls on β1/β2, or direct comparisons to Adam in minimization tasks that would demonstrate the claimed reversal. This weakens the empirical support for the central theoretical prediction.
minor comments (2)
  1. [§3] The notation for the rescaled momentum terms in the ODE should be aligned more explicitly with the discrete Adam-DA update equations to improve readability.
  2. [Introduction] Add a brief discussion of how the derived ODEs relate to existing continuous-time analyses of Adam in minimization (e.g., prior works on momentum in convex optimization) for clearer positioning.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and insightful comments on our manuscript. We have carefully considered each point and outline our responses and planned revisions below.

Point-by-point responses
  1. Referee: [§3] §3 (ODE derivation): The continuous-time limit is obtained via standard Euler discretization and momentum rescaling, but the manuscript provides no explicit error bounds, timescale-separation conditions, or verification that the approximation remains valid for β2 ≈ 0.999 and learning rates ≈ 10^{-3} when the underlying vector field is non-monotone and oscillatory. This assumption is load-bearing for transferring local convergence and regularization conclusions from the ODE to the discrete Adam-DA updates.

    Authors: We thank the referee for this observation. The ODE limit is derived using the standard Euler method with appropriate rescaling of the momentum terms, as is common in the literature on continuous-time analyses of adaptive optimizers. We acknowledge that the manuscript does not provide explicit error bounds or detailed timescale separation conditions, particularly for the non-monotone case. Deriving such bounds rigorously for oscillatory dynamics would require substantial additional analysis. In the revised manuscript, we will include a new subsection in §3 discussing the assumptions underlying the approximation and provide numerical verification by comparing discrete trajectories with the ODE solutions for β2 close to 1 and small learning rates in the context of our GAN experiments. This will offer practical evidence for the validity of the limit in the relevant parameter regime. revision: partial

  2. Referee: [§5] §5 (Experiments): The GAN results are presented without quantitative metrics (e.g., FID scores, convergence rates), ablation controls on β1/β2, or direct comparisons to Adam in minimization tasks that would demonstrate the claimed reversal. This weakens the empirical support for the central theoretical prediction.

    Authors: We agree that incorporating quantitative metrics and ablations would strengthen the empirical section. In the revision, we will augment §5 with FID scores and other relevant quantitative measures for the GAN experiments. We will also add ablation studies on the effects of varying β1 and β2, as well as direct comparisons to the behavior of Adam in standard minimization settings. These changes will better substantiate the claimed reversal of roles for the momentum parameters. revision: yes

standing simulated objections not resolved
  • Deriving explicit error bounds and timescale-separation conditions for the ODE approximation in non-monotone and oscillatory settings.

Circularity Check

0 steps flagged

No circularity: standard ODE limit derivation with independent analysis

full rationale

The paper derives ODEs as continuous-time limits of discrete Adam-DA updates via standard Euler discretization and momentum rescaling techniques. Local convergence and implicit regularization properties are then analyzed directly on the resulting ODE system in the zero-sum setting, yielding the reversed-momentum observation as a consequence of the vector field structure. This chain is self-contained, does not reduce any prediction to a fitted input or prior self-citation by construction, and is externally validated via GAN experiments. No load-bearing step equates to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of taking continuous-time limits of the discrete Adam-DA updates and on the transfer of local stability properties from the resulting ODEs to the original algorithm. No new entities are postulated.

axioms (1)
  • Domain assumption: discrete Adam-DA updates admit a continuous-time ODE limit that approximates their trajectory for small learning rates.
    This is the foundational modeling step that converts the discrete optimizer into an analyzable dynamical system.



