Random Reshuffling-Based Distributed Nash Equilibrium Seeking
Pith reviewed 2026-05-13 18:58 UTC · model grok-4.3
The pith
Random reshuffling enables linear convergence to a neighborhood of Nash equilibria with constant steps and exact convergence with diminishing steps in distributed partial-information games.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For the distributed partial-decision-information case the random reshuffling algorithm converges linearly to a neighborhood of the Nash equilibrium under constant parameters and converges exactly to the Nash equilibrium almost surely and in mean square under diminishing parameters, after first establishing a descent-type bound for the full-information case via an intermediate reference trajectory and a shuffling variance that capture the epoch-wise dynamics induced by reshuffling.
What carries the argument
The shuffling variance together with an intermediate reference trajectory that characterize the epoch-wise dynamics of the random reshuffling iterates.
If this is right
- The algorithm achieves higher steady-state accuracy and better long-horizon performance than conventional with-replacement SGD.
- The same convergence statements hold for both the full-information benchmark and the partial-decision-information extension.
- Numerical tests on an EV charging game and a nonquadratic edge resource admission game confirm the accuracy advantage.
- Convergence relies on the standard strong monotonicity and Lipschitz conditions that produce the descent bounds.
Where Pith is reading between the lines
- The without-replacement structure may reduce the total number of communication rounds needed to reach a target accuracy in large networks.
- The same reference-trajectory technique could be tested on other multi-agent problems such as distributed potential games or variational inequalities.
- Asynchronous or time-varying network versions would be a direct extension that preserves the epoch-wise analysis.
- Replacing SGD with random reshuffling inside existing distributed learning pipelines could be checked by measuring variance reduction per epoch.
Load-bearing premise
The game is strongly monotone and Lipschitz continuous and players maintain sufficiently accurate local estimates of the joint action profile.
What would settle it
Simulate the EV charging game with constant parameters and check whether the distance to the known Nash equilibrium decreases at a linear rate or plateaus at a positive error level after many epochs.
Figures
read the original abstract
This paper studies random reshuffling (RR)-based distributed Nash equilibrium seeking for noncooperative games. The game is motivated as a sample-average approximation of an underlying expected-value stochastic game, while the algorithmic focus is placed on the resulting finite-sum equilibrium problem. Unlike existing distributed stochastic Nash equilibrium methods that mainly rely on with-replacement sampling, the proposed approach incorporates without-replacement component updates into equilibrium computation over networks. We first consider a full-information benchmark, for which an intermediate reference trajectory and a shuffling variance are introduced to characterize the epoch-wise dynamics induced by RR. The method is then extended to the more practical partial-decision-information setting, where each player updates its action using local estimates of the joint action profile. For the full-information case, a descent-type bound is established for the RR iterates. For the distributed partial-decision-information case, it is shown that, under constant parameters, the proposed algorithm converges linearly to a neighborhood of the Nash equilibrium, while under diminishing parameters, it converges exactly to the Nash equilibrium almost surely and in mean square. Numerical experiments on an EV charging game and a nonquadratic edge resource admission game demonstrate that RR consistently outperforms the conventional with-replacement SGD baseline in both steady-state accuracy and long-horizon performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes random reshuffling (RR)-based distributed algorithms for Nash equilibrium seeking in noncooperative games formulated as finite-sum problems. For the full-information benchmark, an intermediate reference trajectory and shuffling variance are used to derive an epoch-wise descent bound. The approach is extended to the partial-decision-information setting via a linear consensus protocol on local action estimates; under constant parameters the iterates converge linearly to a neighborhood of the Nash equilibrium, while under diminishing parameters they converge almost surely and in mean square to the exact equilibrium. Numerical results on an EV charging game and a nonquadratic edge resource admission game show RR outperforming with-replacement SGD baselines in accuracy and long-horizon performance.
Significance. If the stated convergence results hold, the work provides a practical improvement over existing with-replacement distributed stochastic Nash methods by exploiting without-replacement sampling, which yields tighter epoch-wise bounds and better empirical steady-state accuracy. The partial-information extension via consensus is relevant for networked settings, and the numerical experiments supply independent empirical grounding separate from the derivations.
major comments (3)
- [§3] §3 (full-information analysis): the descent-type bound relies on the introduced shuffling variance term; an explicit upper bound on this variance in terms of the strong monotonicity constant μ and the Lipschitz constant L of the pseudo-gradient should be stated to make the linear rate transparent.
- [§4.2] §4.2 (partial-info, constant step-size case): the neighborhood radius is claimed to be O(α) where α is the step size, but the perturbation from consensus error (controlled by the network spectral gap) must be shown not to cancel the contraction; a precise expression for the radius in terms of α, the spectral gap, and the variance of the local estimates is needed.
- [Theorem 2] Theorem 2 (diminishing step-size case): the almost-sure and mean-square convergence claims require verifying that the summability conditions on the step-size sequence α_k and the consensus gain are compatible with the standard stochastic approximation lemmas used; the current sketch leaves the precise choice of diminishing schedules implicit.
minor comments (3)
- [Abstract] The abstract should list the standing assumptions (strong monotonicity, Lipschitz continuity, connected network) rather than deferring them entirely to the main text.
- [Numerical experiments] In the numerical section, report the exact diminishing step-size schedule (e.g., α_k = c/(k+1)^β) and include standard deviation across independent runs for the convergence plots.
- [Notation and §4] Notation for the local estimates of the joint action profile should be unified between the full-information and partial-information sections to avoid reader confusion.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and the precise suggestions for improving clarity. We address each major comment below and will incorporate the requested explicit bounds and schedule specifications into the revised manuscript.
read point-by-point responses
-
Referee: [§3] §3 (full-information analysis): the descent-type bound relies on the introduced shuffling variance term; an explicit upper bound on this variance in terms of the strong monotonicity constant μ and the Lipschitz constant L of the pseudo-gradient should be stated to make the linear rate transparent.
Authors: We agree that an explicit upper bound on the shuffling variance would render the linear rate fully transparent. In the revised version we will insert a new supporting lemma (Lemma 3.2) that derives Var_shuf ≤ (L² / m) Σ_i ||x_i - x̄||², which is further bounded by (2L² / μ) times the distance to the equilibrium under strong monotonicity. This yields an explicit contraction factor of the form 1 - c μ α + O(α² L²) for the epoch-wise descent, making the dependence on μ and L immediate. revision: yes
-
Referee: [§4.2] §4.2 (partial-info, constant step-size case): the neighborhood radius is claimed to be O(α) where α is the step size, but the perturbation from consensus error (controlled by the network spectral gap) must be shown not to cancel the contraction; a precise expression for the radius in terms of α, the spectral gap, and the variance of the local estimates is needed.
Authors: We thank the referee for this observation. The current proof already shows that the consensus error is O(α) and does not cancel the contraction provided the spectral gap λ₂ > 0 and α is sufficiently small. In the revision we will state the precise radius explicitly as R = O(α / (μ λ₂)) + O(σ² / μ), where σ² bounds the local estimate variance, thereby confirming that the neighborhood remains O(α) and the linear rate is preserved. revision: yes
-
Referee: [Theorem 2] Theorem 2 (diminishing step-size case): the almost-sure and mean-square convergence claims require verifying that the summability conditions on the step-size sequence α_k and the consensus gain are compatible with the standard stochastic approximation lemmas used; the current sketch leaves the precise choice of diminishing schedules implicit.
Authors: The referee is correct that the schedules must be stated explicitly. We will augment Theorem 2 with the concrete conditions α_k = Θ(k^{-β}) for β ∈ (1/2, 1] and consensus gain γ_k = Θ(k^{-δ}) with δ > β, which satisfy ∑ α_k = ∞, ∑ α_k² < ∞, and the required summability for the stochastic approximation lemmas invoked in the proof. The revised proof sketch will verify compatibility line by line. revision: yes
Circularity Check
No significant circularity; derivation relies on external monotonicity and approximation tools
full rationale
The paper's central claims rest on standard strong monotonicity and Lipschitz continuity of the pseudo-gradient to derive epoch-wise descent bounds for random reshuffling via an auxiliary reference trajectory. The partial-decision-information extension augments the update with a linear consensus protocol whose error is bounded by the network spectral gap and treated as an additive perturbation. These steps invoke established stochastic approximation results and do not reduce any prediction or rate to a fitted parameter or self-citation by construction; numerical experiments on EV charging and resource games supply independent empirical checks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The underlying stochastic game admits a sample-average finite-sum approximation whose Nash equilibrium is well-defined and unique
Reference graph
Works this paper leans on
-
[1]
Cdma uplink power control as a noncooperative game,
T. Alpcan, T. Başar, R. Srikant, and E. Altman, “Cdma uplink power control as a noncooperative game,”Wireless Networks, vol. 8, no. 6, pp. 659–670, 2002
work page 2002
-
[2]
A.-H. Mohsenian-Rad, V. W. S. Wong, J. Jatskevich, R. Schober, and A. Leon-Garcia, “Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid,”IEEE Transactions on Smart Grid, vol. 1, no. 3, pp. 320–331, 2010
work page 2010
-
[3]
Efficiency loss in a network resource allocation game: the case of elastic supply,
R. Johari, S. Mannor, and J. Tsitsiklis, “Efficiency loss in a network resource allocation game: the case of elastic supply,” IEEE Transactions on Automatic Control, vol. 50, no. 11, pp. 1712–1724, 2005
work page 2005
-
[4]
Game theoretic motion planning for multi-robot racing,
Z. Wang, R. Spica, and M. Schwager, “Game theoretic motion planning for multi-robot racing,” inDistributed Autonomous Robotic Systems, N. Correll, M. Schwager, and M. Otte, Eds. Cham: Springer International Publishing, 2019, pp. 225–238
work page 2019
-
[5]
F. Yousefian, A. Nedić, and U. V. Shanbhag, “A distributed adaptive steplength stochastic approximation method for monotone stochastic nash games,” in2013 American control conference. IEEE, 2013, pp. 4765–4770
work page 2013
-
[6]
Distributed computation of equilibria in misspecified convex stochastic nash games,
H. Jiang, U. V. Shanbhag, and S. P. Meyn, “Distributed computation of equilibria in misspecified convex stochastic nash games,”IEEE Transactions on Automatic Control, vol. 63, no. 2, pp. 360–371, 2017
work page 2017
-
[7]
Z. Feng, G. Hu, X. Dong, and J. Lü, “Adaptively distributed nash equilibrium seeking of noncooperative games for uncertain heterogeneous linear multi-agent systems,”IEEE Transactions on Network Science and Engineering, vol. 10, no. 6, pp. 3871–3882, 2023
work page 2023
-
[8]
Y.HuangandJ.Hu,“Distributedstochasticnashequilibrium learning in locally coupled network games with unknown parameters,” inLearning for Dynamics and Control Conference. PMLR, 2022, pp. 342–354
work page 2022
-
[9]
Y. Pang and G. Hu, “Distributed nash equilibrium seeking with limited cost function knowledge via a consensus-based gradient-free method,”IEEE Transactions on Automatic Control, vol. 66, no. 4, pp. 1832–1839, 2020
work page 2020
-
[10]
Stochastic extragradientwithrandomreshuffling:Improvedconvergence for variational inequalities,
K. Emmanouilidis, R. Vidal, and N. Loizou, “Stochastic extragradientwithrandomreshuffling:Improvedconvergence for variational inequalities,” inInternational Conference on ArtificialIntelligenceandStatistics. PMLR, 2024, pp. 3682– 3690
work page 2024
-
[11]
Random shuffling beats sgd after finite epochs,
J. Haochen and S. Sra, “Random shuffling beats sgd after finite epochs,” inInternational Conference on Machine Learning. PMLR, 2019, pp. 2624–2633
work page 2019
-
[12]
Without-replacement sampling for stochastic gradient methods,
O. Shamir, “Without-replacement sampling for stochastic gradient methods,”Advances in neural information processing systems, vol. 29, 2016
work page 2016
-
[13]
Stochastic learning under random reshuffling with constant step-sizes,
B. Ying, K. Yuan, S. Vlaski, and A. H. Sayed, “Stochastic learning under random reshuffling with constant step-sizes,” IEEE Transactions on Signal Processing, vol. 67, no. 2, pp. 474–489, 2018
work page 2018
-
[14]
Distributed random reshuffling over networks,
K. Huang, X. Li, A. Milzarek, S. Pu, and J. Qiu, “Distributed random reshuffling over networks,”IEEE Transactions on Signal Processing, vol. 71, pp. 1143–1158, 2023
work page 2023
-
[15]
The sample average approximation method for stochastic discrete optimization,
A. J. Kleywegt, A. Shapiro, and T. Homem-de Mello, “The sample average approximation method for stochastic discrete optimization,”SIAM Journal on optimization, vol. 12, no. 2, pp. 479–502, 2002
work page 2002
-
[16]
Stochastic nash equilibrium problems: sample average approximation and applications,
H. Xu and D. Zhang, “Stochastic nash equilibrium problems: sample average approximation and applications,” Computational Optimization and Applications, vol. 55, no. 3, pp. 597–645, 2013
work page 2013
-
[17]
Distributed stochastic nash equilibrium seeking under heavy-tailed noises,
C. Sun, B. Chen, J. Wang, Z. Wang, and L. Yu, “Distributed stochastic nash equilibrium seeking under heavy-tailed noises,”Automatica, vol. 173, p. 112081, 2025
work page 2025
-
[18]
Distributed nash equilibrium seeking over time-varying directed communication networks,
D. T. A. Nguyen, D. T. Nguyen, and A. Nedić, “Distributed nash equilibrium seeking over time-varying directed communication networks,”IEEE Transactions on Control of Network Systems, 2025
work page 2025
-
[19]
Geometric convergence of gradient play algorithms for distributed nash equilibrium seeking,
T. Tatarenko, W. Shi, and A. Nedić, “Geometric convergence of gradient play algorithms for distributed nash equilibrium seeking,”IEEE Transactions on Automatic Control, vol. 66, no. 11, pp. 5342–5353, 2020
work page 2020
-
[20]
Real and complex monotone communication games,
G. Scutari, F. Facchinei, J. Pang, and D. P. Palomar, “Real and complex monotone communication games,” CoRR, vol. abs/1212.6235, 2012. [Online]. Available: http://arxiv.org/abs/1212.6235
-
[21]
Random reshuffling: Simple analysis with vast improvements,
K. Mishchenko, A. Khaled, and P. Richtárik, “Random reshuffling: Simple analysis with vast improvements,” Advances in Neural Information Processing Systems, vol. 33, pp. 17309–17320, 2020
work page 2020
- [22]
-
[23]
Optimal day-ahead charging scheduling of electric vehicles through an aggregative game model,
Z. Liu, Q. Wu, S. Huang, L. Wang, M. Shahidehpour, and Y. Xue, “Optimal day-ahead charging scheduling of electric vehicles through an aggregative game model,”IEEE Transactions on Smart Grid, vol. 9, no. 5, pp. 5173–5184, 2018
work page 2018
-
[24]
Z. Liu, Q. Wu, S. S. Oren, S. Huang, R. Li, and L. Cheng, “Distribution locational marginal pricing for optimal electric vehicle charging through chance constrained mixed-integer programming,”IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 644–654, 2018
work page 2018
-
[25]
Distributed optimal charging of electric vehicles for demand response and load shaping,
C. L. Floch, F. Belletti, S. Saxena, A. M. Bayen, and S. Moura, “Distributed optimal charging of electric vehicles for demand response and load shaping,” inProceedings of the 54th IEEE Conference on Decision and Control (CDC), 2015, pp. 6570–6576
work page 2015
-
[26]
Resource management in mobile edge computing: A comprehensive survey,
X. Zhang and S. Debroy, “Resource management in mobile edge computing: A comprehensive survey,”ACM Computing Surveys, vol. 55, no. 13s, pp. 1–37, 2023
work page 2023
-
[27]
Joint task offloading and resource allocation for multi-server mobile-edge computing networks,
T. X. Tran and D. Pompili, “Joint task offloading and resource allocation for multi-server mobile-edge computing networks,”IEEE Transactions on Vehicular Technology, vol. 68, no. 1, pp. 856–868, 2019. 11
work page 2019
-
[28]
X. Liu, J. Zheng, M. Zhang, Y. Li, R. Wang, and Y. He, “A game-based computing resource allocation scheme of edge server in vehicular edge computing networks considering diverse task offloading modes,”Sensors, vol. 24, no. 1, p. 69, 2024
work page 2024
-
[29]
Z. Sun, G. Sun, Y. Liu, J. Wang, and D. Cao, “Bargain- match: A game theoretical approach for resource allocation and task offloading in vehicular edge computing networks,” IEEE Transactions on Mobile Computing, vol. 23, no. 2, pp. 1655–1673, 2024. 12
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.