arxiv: 2604.02858 · v1 · submitted 2026-04-03 · 🧮 math.OC

Random Reshuffling-Based Distributed Nash Equilibrium Seeking

Jun Hu , Chao Sun , Chen Bo , Jianzheng Wang , Zheming Wang This is my paper

Pith reviewed 2026-05-13 18:58 UTC · model grok-4.3

classification 🧮 math.OC

keywords random reshufflingdistributed Nash equilibriumpartial decision informationstochastic gamesfinite-sum equilibriumconvergence analysis

0 comments

The pith

Random reshuffling enables linear convergence to a neighborhood of Nash equilibria with constant steps and exact convergence with diminishing steps in distributed partial-information games.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a random reshuffling approach for distributed Nash equilibrium seeking, motivated by finite-sum approximations of stochastic games. It first builds a full-information benchmark that introduces a reference trajectory and shuffling variance to bound epoch-wise dynamics from without-replacement updates. The method extends to the partial-decision-information case where each player relies on local estimates of the joint action profile. With constant parameters the iterates converge linearly to a neighborhood of the equilibrium; with diminishing parameters they reach the exact equilibrium almost surely and in mean square. A reader would care because the without-replacement sampling yields measurably better steady-state accuracy and long-horizon behavior than standard with-replacement stochastic gradient methods on networks.

Core claim

For the distributed partial-decision-information case the random reshuffling algorithm converges linearly to a neighborhood of the Nash equilibrium under constant parameters and converges exactly to the Nash equilibrium almost surely and in mean square under diminishing parameters, after first establishing a descent-type bound for the full-information case via an intermediate reference trajectory and a shuffling variance that capture the epoch-wise dynamics induced by reshuffling.

What carries the argument

The shuffling variance together with an intermediate reference trajectory that characterize the epoch-wise dynamics of the random reshuffling iterates.

If this is right

The algorithm achieves higher steady-state accuracy and better long-horizon performance than conventional with-replacement SGD.
The same convergence statements hold for both the full-information benchmark and the partial-decision-information extension.
Numerical tests on an EV charging game and a nonquadratic edge resource admission game confirm the accuracy advantage.
Convergence relies on the standard strong monotonicity and Lipschitz conditions that produce the descent bounds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The without-replacement structure may reduce the total number of communication rounds needed to reach a target accuracy in large networks.
The same reference-trajectory technique could be tested on other multi-agent problems such as distributed potential games or variational inequalities.
Asynchronous or time-varying network versions would be a direct extension that preserves the epoch-wise analysis.
Replacing SGD with random reshuffling inside existing distributed learning pipelines could be checked by measuring variance reduction per epoch.

Load-bearing premise

The game is strongly monotone and Lipschitz continuous and players maintain sufficiently accurate local estimates of the joint action profile.

What would settle it

Simulate the EV charging game with constant parameters and check whether the distance to the known Nash equilibrium decreases at a linear rate or plateaus at a positive error level after many epochs.

Figures

Figures reproduced from arXiv: 2604.02858 by Chao Sun, Chen Bo, Jianzheng Wang, Jun Hu, Zheming Wang.

**Figure 2.** Figure 2: Convergence comparison for the EV charging game [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Convergence comparison for the edge resource admis [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

read the original abstract

This paper studies random reshuffling (RR)-based distributed Nash equilibrium seeking for noncooperative games. The game is motivated as a sample-average approximation of an underlying expected-value stochastic game, while the algorithmic focus is placed on the resulting finite-sum equilibrium problem. Unlike existing distributed stochastic Nash equilibrium methods that mainly rely on with-replacement sampling, the proposed approach incorporates without-replacement component updates into equilibrium computation over networks. We first consider a full-information benchmark, for which an intermediate reference trajectory and a shuffling variance are introduced to characterize the epoch-wise dynamics induced by RR. The method is then extended to the more practical partial-decision-information setting, where each player updates its action using local estimates of the joint action profile. For the full-information case, a descent-type bound is established for the RR iterates. For the distributed partial-decision-information case, it is shown that, under constant parameters, the proposed algorithm converges linearly to a neighborhood of the Nash equilibrium, while under diminishing parameters, it converges exactly to the Nash equilibrium almost surely and in mean square. Numerical experiments on an EV charging game and a nonquadratic edge resource admission game demonstrate that RR consistently outperforms the conventional with-replacement SGD baseline in both steady-state accuracy and long-horizon performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Random reshuffling gives measurable gains over standard SGD for distributed Nash seeking under the usual monotonicity assumptions, with the partial-info extension and reference-trajectory analysis as the main technical moves.

read the letter

The paper shows that swapping in random reshuffling for the usual with-replacement sampling improves both steady-state accuracy and long-run behavior in distributed Nash equilibrium computation. The full-information case gets a descent bound via an auxiliary reference trajectory that tracks the epoch-wise shuffle; the partial-information version adds a linear consensus step on local copies of the joint action and treats the resulting error as a controlled perturbation. Under constant steps the iterates converge linearly to a neighborhood whose radius scales with the step size; diminishing steps yield almost-sure and mean-square convergence to the exact equilibrium. The two numerical examples—an EV charging game and a non-quadratic resource admission game—both show the reshuffling version pulling ahead of the with-replacement baseline on the same networks and step-size schedules. Those plots supply independent empirical support that is separate from the derivations. The analysis adapts standard stochastic-approximation tools and does not appear to contain internal contradictions. The central assumptions—strong monotonicity and Lipschitz continuity of the pseudo-gradient, plus a connected network—are standard for this literature and are not hidden. They do restrict the result to games with a unique equilibrium and reasonably well-behaved gradients, so the practical scope is narrower than the title might suggest. The partial-info extension also requires that the consensus error stays small enough not to destroy the contraction, which holds under the stated spectral-gap condition but adds another layer of tuning. Overall the work is a clean incremental step rather than a reorientation of the field. It is aimed at researchers who already work on distributed game algorithms for energy or network resource problems and who care about finite-sum versus stochastic settings. The combination of a new sampling scheme, a workable analysis device, and concrete experiments is enough to justify sending it to referees for a full check of the proofs and constants.

Referee Report

3 major / 3 minor

Summary. The paper proposes random reshuffling (RR)-based distributed algorithms for Nash equilibrium seeking in noncooperative games formulated as finite-sum problems. For the full-information benchmark, an intermediate reference trajectory and shuffling variance are used to derive an epoch-wise descent bound. The approach is extended to the partial-decision-information setting via a linear consensus protocol on local action estimates; under constant parameters the iterates converge linearly to a neighborhood of the Nash equilibrium, while under diminishing parameters they converge almost surely and in mean square to the exact equilibrium. Numerical results on an EV charging game and a nonquadratic edge resource admission game show RR outperforming with-replacement SGD baselines in accuracy and long-horizon performance.

Significance. If the stated convergence results hold, the work provides a practical improvement over existing with-replacement distributed stochastic Nash methods by exploiting without-replacement sampling, which yields tighter epoch-wise bounds and better empirical steady-state accuracy. The partial-information extension via consensus is relevant for networked settings, and the numerical experiments supply independent empirical grounding separate from the derivations.

major comments (3)

[§3] §3 (full-information analysis): the descent-type bound relies on the introduced shuffling variance term; an explicit upper bound on this variance in terms of the strong monotonicity constant μ and the Lipschitz constant L of the pseudo-gradient should be stated to make the linear rate transparent.
[§4.2] §4.2 (partial-info, constant step-size case): the neighborhood radius is claimed to be O(α) where α is the step size, but the perturbation from consensus error (controlled by the network spectral gap) must be shown not to cancel the contraction; a precise expression for the radius in terms of α, the spectral gap, and the variance of the local estimates is needed.
[Theorem 2] Theorem 2 (diminishing step-size case): the almost-sure and mean-square convergence claims require verifying that the summability conditions on the step-size sequence α_k and the consensus gain are compatible with the standard stochastic approximation lemmas used; the current sketch leaves the precise choice of diminishing schedules implicit.

minor comments (3)

[Abstract] The abstract should list the standing assumptions (strong monotonicity, Lipschitz continuity, connected network) rather than deferring them entirely to the main text.
[Numerical experiments] In the numerical section, report the exact diminishing step-size schedule (e.g., α_k = c/(k+1)^β) and include standard deviation across independent runs for the convergence plots.
[Notation and §4] Notation for the local estimates of the joint action profile should be unified between the full-information and partial-information sections to avoid reader confusion.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the positive evaluation and the precise suggestions for improving clarity. We address each major comment below and will incorporate the requested explicit bounds and schedule specifications into the revised manuscript.

read point-by-point responses

Referee: [§3] §3 (full-information analysis): the descent-type bound relies on the introduced shuffling variance term; an explicit upper bound on this variance in terms of the strong monotonicity constant μ and the Lipschitz constant L of the pseudo-gradient should be stated to make the linear rate transparent.

Authors: We agree that an explicit upper bound on the shuffling variance would render the linear rate fully transparent. In the revised version we will insert a new supporting lemma (Lemma 3.2) that derives Var_shuf ≤ (L² / m) Σ_i ||x_i - x̄||², which is further bounded by (2L² / μ) times the distance to the equilibrium under strong monotonicity. This yields an explicit contraction factor of the form 1 - c μ α + O(α² L²) for the epoch-wise descent, making the dependence on μ and L immediate. revision: yes
Referee: [§4.2] §4.2 (partial-info, constant step-size case): the neighborhood radius is claimed to be O(α) where α is the step size, but the perturbation from consensus error (controlled by the network spectral gap) must be shown not to cancel the contraction; a precise expression for the radius in terms of α, the spectral gap, and the variance of the local estimates is needed.

Authors: We thank the referee for this observation. The current proof already shows that the consensus error is O(α) and does not cancel the contraction provided the spectral gap λ₂ > 0 and α is sufficiently small. In the revision we will state the precise radius explicitly as R = O(α / (μ λ₂)) + O(σ² / μ), where σ² bounds the local estimate variance, thereby confirming that the neighborhood remains O(α) and the linear rate is preserved. revision: yes
Referee: [Theorem 2] Theorem 2 (diminishing step-size case): the almost-sure and mean-square convergence claims require verifying that the summability conditions on the step-size sequence α_k and the consensus gain are compatible with the standard stochastic approximation lemmas used; the current sketch leaves the precise choice of diminishing schedules implicit.

Authors: The referee is correct that the schedules must be stated explicitly. We will augment Theorem 2 with the concrete conditions α_k = Θ(k^{-β}) for β ∈ (1/2, 1] and consensus gain γ_k = Θ(k^{-δ}) with δ > β, which satisfy ∑ α_k = ∞, ∑ α_k² < ∞, and the required summability for the stochastic approximation lemmas invoked in the proof. The revised proof sketch will verify compatibility line by line. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external monotonicity and approximation tools

full rationale

The paper's central claims rest on standard strong monotonicity and Lipschitz continuity of the pseudo-gradient to derive epoch-wise descent bounds for random reshuffling via an auxiliary reference trajectory. The partial-decision-information extension augments the update with a linear consensus protocol whose error is bounded by the network spectral gap and treated as an additive perturbation. These steps invoke established stochastic approximation results and do not reduce any prediction or rate to a fitted parameter or self-citation by construction; numerical experiments on EV charging and resource games supply independent empirical checks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard domain assumptions from stochastic game theory and optimization that are not enumerated in the abstract; no free parameters or invented entities are introduced beyond the algorithmic construction itself.

axioms (1)

domain assumption The underlying stochastic game admits a sample-average finite-sum approximation whose Nash equilibrium is well-defined and unique
Motivation stated in abstract for treating the problem as finite-sum equilibrium computation

pith-pipeline@v0.9.0 · 5522 in / 1265 out tokens · 40937 ms · 2026-05-13T18:58:08.621256+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

[1]

Cdma uplink power control as a noncooperative game,

T. Alpcan, T. Başar, R. Srikant, and E. Altman, “Cdma uplink power control as a noncooperative game,”Wireless Networks, vol. 8, no. 6, pp. 659–670, 2002

work page 2002
[2]

Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid,

A.-H. Mohsenian-Rad, V. W. S. Wong, J. Jatskevich, R. Schober, and A. Leon-Garcia, “Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid,”IEEE Transactions on Smart Grid, vol. 1, no. 3, pp. 320–331, 2010

work page 2010
[3]

Efficiency loss in a network resource allocation game: the case of elastic supply,

R. Johari, S. Mannor, and J. Tsitsiklis, “Efficiency loss in a network resource allocation game: the case of elastic supply,” IEEE Transactions on Automatic Control, vol. 50, no. 11, pp. 1712–1724, 2005

work page 2005
[4]

Game theoretic motion planning for multi-robot racing,

Z. Wang, R. Spica, and M. Schwager, “Game theoretic motion planning for multi-robot racing,” inDistributed Autonomous Robotic Systems, N. Correll, M. Schwager, and M. Otte, Eds. Cham: Springer International Publishing, 2019, pp. 225–238

work page 2019
[5]

A distributed adaptive steplength stochastic approximation method for monotone stochastic nash games,

F. Yousefian, A. Nedić, and U. V. Shanbhag, “A distributed adaptive steplength stochastic approximation method for monotone stochastic nash games,” in2013 American control conference. IEEE, 2013, pp. 4765–4770

work page 2013
[6]

Distributed computation of equilibria in misspecified convex stochastic nash games,

H. Jiang, U. V. Shanbhag, and S. P. Meyn, “Distributed computation of equilibria in misspecified convex stochastic nash games,”IEEE Transactions on Automatic Control, vol. 63, no. 2, pp. 360–371, 2017

work page 2017
[7]

Adaptively distributed nash equilibrium seeking of noncooperative games for uncertain heterogeneous linear multi-agent systems,

Z. Feng, G. Hu, X. Dong, and J. Lü, “Adaptively distributed nash equilibrium seeking of noncooperative games for uncertain heterogeneous linear multi-agent systems,”IEEE Transactions on Network Science and Engineering, vol. 10, no. 6, pp. 3871–3882, 2023

work page 2023
[8]

Distributedstochasticnashequilibrium learning in locally coupled network games with unknown parameters,

Y.HuangandJ.Hu,“Distributedstochasticnashequilibrium learning in locally coupled network games with unknown parameters,” inLearning for Dynamics and Control Conference. PMLR, 2022, pp. 342–354

work page 2022
[9]

Distributed nash equilibrium seeking with limited cost function knowledge via a consensus-based gradient-free method,

Y. Pang and G. Hu, “Distributed nash equilibrium seeking with limited cost function knowledge via a consensus-based gradient-free method,”IEEE Transactions on Automatic Control, vol. 66, no. 4, pp. 1832–1839, 2020

work page 2020
[10]

Stochastic extragradientwithrandomreshuffling:Improvedconvergence for variational inequalities,

K. Emmanouilidis, R. Vidal, and N. Loizou, “Stochastic extragradientwithrandomreshuffling:Improvedconvergence for variational inequalities,” inInternational Conference on ArtificialIntelligenceandStatistics. PMLR, 2024, pp. 3682– 3690

work page 2024
[11]

Random shuffling beats sgd after finite epochs,

J. Haochen and S. Sra, “Random shuffling beats sgd after finite epochs,” inInternational Conference on Machine Learning. PMLR, 2019, pp. 2624–2633

work page 2019
[12]

Without-replacement sampling for stochastic gradient methods,

O. Shamir, “Without-replacement sampling for stochastic gradient methods,”Advances in neural information processing systems, vol. 29, 2016

work page 2016
[13]

Stochastic learning under random reshuffling with constant step-sizes,

B. Ying, K. Yuan, S. Vlaski, and A. H. Sayed, “Stochastic learning under random reshuffling with constant step-sizes,” IEEE Transactions on Signal Processing, vol. 67, no. 2, pp. 474–489, 2018

work page 2018
[14]

Distributed random reshuffling over networks,

K. Huang, X. Li, A. Milzarek, S. Pu, and J. Qiu, “Distributed random reshuffling over networks,”IEEE Transactions on Signal Processing, vol. 71, pp. 1143–1158, 2023

work page 2023
[15]

The sample average approximation method for stochastic discrete optimization,

A. J. Kleywegt, A. Shapiro, and T. Homem-de Mello, “The sample average approximation method for stochastic discrete optimization,”SIAM Journal on optimization, vol. 12, no. 2, pp. 479–502, 2002

work page 2002
[16]

Stochastic nash equilibrium problems: sample average approximation and applications,

H. Xu and D. Zhang, “Stochastic nash equilibrium problems: sample average approximation and applications,” Computational Optimization and Applications, vol. 55, no. 3, pp. 597–645, 2013

work page 2013
[17]

Distributed stochastic nash equilibrium seeking under heavy-tailed noises,

C. Sun, B. Chen, J. Wang, Z. Wang, and L. Yu, “Distributed stochastic nash equilibrium seeking under heavy-tailed noises,”Automatica, vol. 173, p. 112081, 2025

work page 2025
[18]

Distributed nash equilibrium seeking over time-varying directed communication networks,

D. T. A. Nguyen, D. T. Nguyen, and A. Nedić, “Distributed nash equilibrium seeking over time-varying directed communication networks,”IEEE Transactions on Control of Network Systems, 2025

work page 2025
[19]

Geometric convergence of gradient play algorithms for distributed nash equilibrium seeking,

T. Tatarenko, W. Shi, and A. Nedić, “Geometric convergence of gradient play algorithms for distributed nash equilibrium seeking,”IEEE Transactions on Automatic Control, vol. 66, no. 11, pp. 5342–5353, 2020

work page 2020
[20]

Real and complex monotone communication games,

G. Scutari, F. Facchinei, J. Pang, and D. P. Palomar, “Real and complex monotone communication games,” CoRR, vol. abs/1212.6235, 2012. [Online]. Available: http://arxiv.org/abs/1212.6235

work page arXiv 2012
[21]

Random reshuffling: Simple analysis with vast improvements,

K. Mishchenko, A. Khaled, and P. Richtárik, “Random reshuffling: Simple analysis with vast improvements,” Advances in Neural Information Processing Systems, vol. 33, pp. 17309–17320, 2020

work page 2020
[22]

Introduction to optimization,

B. T. Polyak, “Introduction to optimization,” 1987

work page 1987
[23]

Optimal day-ahead charging scheduling of electric vehicles through an aggregative game model,

Z. Liu, Q. Wu, S. Huang, L. Wang, M. Shahidehpour, and Y. Xue, “Optimal day-ahead charging scheduling of electric vehicles through an aggregative game model,”IEEE Transactions on Smart Grid, vol. 9, no. 5, pp. 5173–5184, 2018

work page 2018
[24]

Distribution locational marginal pricing for optimal electric vehicle charging through chance constrained mixed-integer programming,

Z. Liu, Q. Wu, S. S. Oren, S. Huang, R. Li, and L. Cheng, “Distribution locational marginal pricing for optimal electric vehicle charging through chance constrained mixed-integer programming,”IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 644–654, 2018

work page 2018
[25]

Distributed optimal charging of electric vehicles for demand response and load shaping,

C. L. Floch, F. Belletti, S. Saxena, A. M. Bayen, and S. Moura, “Distributed optimal charging of electric vehicles for demand response and load shaping,” inProceedings of the 54th IEEE Conference on Decision and Control (CDC), 2015, pp. 6570–6576

work page 2015
[26]

Resource management in mobile edge computing: A comprehensive survey,

X. Zhang and S. Debroy, “Resource management in mobile edge computing: A comprehensive survey,”ACM Computing Surveys, vol. 55, no. 13s, pp. 1–37, 2023

work page 2023
[27]

Joint task offloading and resource allocation for multi-server mobile-edge computing networks,

T. X. Tran and D. Pompili, “Joint task offloading and resource allocation for multi-server mobile-edge computing networks,”IEEE Transactions on Vehicular Technology, vol. 68, no. 1, pp. 856–868, 2019. 11

work page 2019
[28]

A game-based computing resource allocation scheme of edge server in vehicular edge computing networks considering diverse task offloading modes,

X. Liu, J. Zheng, M. Zhang, Y. Li, R. Wang, and Y. He, “A game-based computing resource allocation scheme of edge server in vehicular edge computing networks considering diverse task offloading modes,”Sensors, vol. 24, no. 1, p. 69, 2024

work page 2024
[29]

Bargain- match: A game theoretical approach for resource allocation and task offloading in vehicular edge computing networks,

Z. Sun, G. Sun, Y. Liu, J. Wang, and D. Cao, “Bargain- match: A game theoretical approach for resource allocation and task offloading in vehicular edge computing networks,” IEEE Transactions on Mobile Computing, vol. 23, no. 2, pp. 1655–1673, 2024. 12

work page 2024