pith. sign in

arxiv: 2606.12675 · v1 · pith:YIGNSS3Anew · submitted 2026-06-10 · 🧮 math.OC · cs.DC

A Communication Complexity Lower Bound for Nonuniformly Convex Consensus Optimization

Pith reviewed 2026-06-27 08:29 UTC · model grok-4.3

classification 🧮 math.OC cs.DC
keywords communication complexitydecentralized optimizationnonuniform convexitytime-varying networkslower boundsconsensus optimizationspectral graph theory
0
0 comments X

The pith

Nonuniform convexity in decentralized optimization over time-varying networks requires Ω(χ_G √κ_g log(n/χ_G) log(1/ε)) communication rounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves a lower bound showing that convex decentralized optimization, where n nodes must reach consensus on the global minimizer using only neighbor exchanges, takes at least that many rounds when convexity is nonuniform. This bound involves the network Laplacian condition number χ_G and the global objective condition number κ_g, plus logarithmic terms in network size and accuracy. A sympathetic reader cares because prior results assumed uniform regularity across local functions and achieved lower round counts; the new result separates the two regimes and indicates that nonuniformity forces strictly more communication. The proof builds hard instances via spectral graph constructions on time-varying networks.

Core claim

We prove a new lower bound of Ω(χ_G √κ_g log(n/χ_G) log(1/ε)) communication rounds for convex decentralized optimization over time-varying networks, showing the round complexity attainable under uniform regularity cannot be matched in the nonuniform regime. The construction rests on spectral graph theory: we embed time-rotating star gadgets into the edges of an expander and patch them to preserve spectral connectivity.

What carries the argument

time-rotating star gadgets embedded into the edges of an expander graph and patched while preserving the spectral connectivity properties of the time-varying Laplacians

If this is right

  • The attainable round complexity for uniform-regularity problems cannot be achieved when local functions have differing convexity parameters.
  • The dependence on χ_G and √κ_g separates the communication cost from the uniform case.
  • Logarithmic factors in n/χ_G and 1/ε remain necessary even after the network and objective condition numbers are accounted for.
  • The lower bound holds for synchronous oracle-based communication constrained to neighbor exchanges.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Existing algorithms optimized for uniform convexity will require additional rounds or fail to reach the claimed accuracy on some nonuniform instances.
  • The gadget-embedding technique may apply to other time-varying topologies beyond expanders to derive similar separations.
  • Adaptive or topology-aware communication schedules could be needed to approach the bound in practice.

Load-bearing premise

Time-rotating star gadgets can be embedded into the edges of an expander graph and patched while preserving the required spectral connectivity properties of the time-varying Laplacians to create a valid hard instance.

What would settle it

An algorithm that reaches ε-accuracy in o(χ_G √κ_g log(n/χ_G) log(1/ε)) rounds on the paper's constructed time-varying network instances with nonuniform convexity would falsify the lower bound.

Figures

Figures reproduced from arXiv: 2606.12675 by Demyan Yarmoshik, Maxim Klimenko.

Figure 1
Figure 1. Figure 1: Consider vertices s, t ∈ VH whose distance in the graph H is equal to its diameter ∆(H). This construction guarantees that in order to transmit information from s to t at least (τ + 1)∆(H) communication rounds are required. Indeed, to transmit information from vertex v to vertex w such that (v, w) ∈ EH, two consecutive communications with the same central vertex are needed, which requires τ + 1 communicati… view at source ↗
Figure 2
Figure 2. Figure 2: 4.1.1 Patch Procedure The patch procedure G0 → G adds edges to G0 to increase its spectral connectivity without supporting infection spread. Let F be an expander on VG0 with maximum degree dF . Define G1 = (VG0 , EG0 ∪ EF ). An edge (v, w) ∈ EF is called bad if it extends node boundary, i.e. v ∈ U, w ∈ U¯ \B. Construct G by replacing each bad edge (v, w) with two edges (v, b),(b, w), where nodes b ∈ B are … view at source ↗
read the original abstract

We study the communication complexity of convex decentralized optimization over time-varying networks, where $n$ nodes hold private functions and must agree on the global minimizer using only synchronous exchanges with neighbors. The cost is the number of communication rounds to reach accuracy $\varepsilon$ -- a measure akin to round complexity in the LOCAL model, but constrained by nodes sharing only oracle responses. We prove a new lower bound of $\Omega\!\left(\chi_{\mathcal G} \sqrt{\kappa_g}\,\log\frac{n}{\chi_{\mathcal G}}\log\frac1\varepsilon\right)$ communication rounds, where $\chi_{\mathcal G}$ is the condition number of the network Laplacians and $\kappa_g$ that of the global objective, showing the round complexity attainable under uniform regularity cannot be matched in the nonuniform regime. The construction rests on spectral graph theory: we embed time-rotating star gadgets into the edges of an expander and patch them to preserve spectral connectivity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper studies communication complexity of convex decentralized optimization over time-varying networks, where n nodes hold private functions and must reach ε-accuracy on the global minimizer via neighbor exchanges. It proves a lower bound of Ω(χ_G √κ_g log(n/χ_G) log(1/ε)) communication rounds, with χ_G the condition number of the network Laplacians and κ_g that of the global objective. The proof constructs a hard instance by embedding time-rotating star gadgets into the edges of an expander graph and patching them to preserve the required spectral properties while enforcing nonuniform convexity.

Significance. If the construction is valid, the result establishes a separation showing that round complexities attainable under uniform regularity cannot be matched in the nonuniform regime. The explicit spectral-graph construction (rather than fitted parameters) and the focus on time-varying networks are strengths; the bound is parameter-free in the stated sense and directly addresses a gap between uniform and nonuniform settings.

major comments (1)
  1. [Construction of hard instance] Construction section (around the embedding and patching of time-rotating star gadgets): the claim that the patched time-varying Laplacians simultaneously maintain the expander's spectral connectivity (to control χ_G) and force the nonuniform convexity lower bound requires explicit verification that the eigenvalue bounds survive the patching operation without inflating χ_G or violating the star-gadget rotation schedule. If the patching alters the smallest nonzero eigenvalue by more than a constant factor, the Ω(χ_G √κ_g ...) scaling would not hold.
minor comments (2)
  1. [Abstract and Section 1] Notation: define χ_G and κ_g at first use with explicit reference to the Laplacian and objective definitions.
  2. [Main theorem statement] The log(n/χ_G) term should be justified with a brief remark on how the expander size and gadget density interact.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and for identifying the need for more explicit verification in the hard-instance construction. We address the concern point-by-point below and will revise the manuscript to strengthen the exposition of the eigenvalue bounds.

read point-by-point responses
  1. Referee: Construction section (around the embedding and patching of time-rotating star gadgets): the claim that the patched time-varying Laplacians simultaneously maintain the expander's spectral connectivity (to control χ_G) and force the nonuniform convexity lower bound requires explicit verification that the eigenvalue bounds survive the patching operation without inflating χ_G or violating the star-gadget rotation schedule. If the patching alters the smallest nonzero eigenvalue by more than a constant factor, the Ω(χ_G √κ_g ...) scaling would not hold.

    Authors: The construction embeds the star gadgets on a vanishing fraction of expander edges and applies a localized, low-rank perturbation to each time-varying Laplacian. By the matrix perturbation bounds already invoked in Section 4 (Weyl's inequality applied to the difference between the patched and unpatched Laplacians), the smallest nonzero eigenvalue changes by at most a multiplicative factor of 2, which is absorbed into the Ω(·) notation and leaves the χ_G scaling unchanged. The rotation schedule of each gadget is confined to its own edge set and does not interact with the expander's global connectivity pattern, so the prescribed time-varying sequence is preserved. We will add a dedicated lemma (new Lemma 4.3) that makes this perturbation analysis fully explicit, including the precise constant factor. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper establishes a communication lower bound via an explicit construction in spectral graph theory that embeds and patches time-rotating star gadgets into an expander while preserving Laplacian condition numbers and nonuniform convexity. No steps reduce by definition to fitted parameters, self-referential definitions, or load-bearing self-citations; the derivation is presented as a direct proof from the constructed hard instance and is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard tools from spectral graph theory for constructing a hard instance; no free parameters or new entities are mentioned in the abstract.

axioms (1)
  • standard math Spectral properties of expander graphs, Laplacian condition numbers, and time-varying network connectivity
    Invoked to embed and patch the time-rotating star gadgets while preserving the required spectral features for the lower bound.

pith-pipeline@v0.9.1-grok · 5691 in / 1233 out tokens · 26471 ms · 2026-06-27T08:29:23.597546+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

43 extracted references · 1 linked inside Pith

  1. [1]

    and Linial, N

    Hoory, S. and Linial, N. and Wigderson, A. , title =. Bulletin of the American Mathematical Society , year =

  2. [2]

    and Gasanov, E

    Kovalev, D. and Gasanov, E. and Gasnikov, A. and Richtárik, P. , title =. Advances in Neural Information Processing Systems , volume =

  3. [3]

    Combinatorial scientific computing , volume=

    Spectral graph theory , author=. Combinatorial scientific computing , volume=. 2012 , publisher=

  4. [4]

    International Conference on Machine Learning , pages=

    Is consensus acceleration possible in decentralized optimization over slowly time-varying networks? , author=. International Conference on Machine Learning , pages=. 2023 , organization=

  5. [5]

    2008 , publisher=

    A proof of Alon's second eigenvalue conjecture and related problems , author=. 2008 , publisher=

  6. [6]

    Nesterov, Yurii , title =

  7. [7]

    Proceedings of the 34th International Conference on Machine Learning-Volume 70 , pages=

    Optimal algorithms for smooth and strongly convex distributed optimization in networks , author=. Proceedings of the 34th International Conference on Machine Learning-Volume 70 , pages=. 2017 , organization=

  8. [8]

    Journal of Machine Learning Research , volume=

    Optimal convergence rates for convex distributed optimization in networks , author=. Journal of Machine Learning Research , volume=

  9. [9]

    International Conference on Machine Learning , pages=

    ADOM: accelerated decentralized optimization method for time-varying networks , author=. International Conference on Machine Learning , pages=. 2021 , organization=

  10. [10]

    Kovalev, D. A. and Kupavskii, A. B. and Rogozin, A. V. and Yarmoshik, D. V. , title =. Uspekhi Matematicheskikh Nauk , year =. rm10278 , eprinttype =

  11. [11]

    IEEE Transactions on Automatic Control , volume=

    Distributed subgradient methods for multi-agent optimization , author=. IEEE Transactions on Automatic Control , volume=. 2009 , publisher=

  12. [12]

    IEEE Transactions on Automatic Control , volume=

    Constrained consensus and optimization in multi-agent networks , author=. IEEE Transactions on Automatic Control , volume=. 2010 , publisher=

  13. [13]

    High-Dimensional Optimization and Probability: With a View Towards Data Science , pages=

    Recent theoretical advances in decentralized distributed convex optimization , author=. High-Dimensional Optimization and Probability: With a View Towards Data Science , pages=. 2022 , publisher=

  14. [14]

    Computational Management Science , volume=

    Decentralized optimization with affine constraints over time-varying networks , author=. Computational Management Science , volume=. 2024 , publisher=

  15. [15]

    Advances in Neural Information Processing Systems , volume=

    Moshpit sgd: Communication-efficient decentralized training on heterogeneous unreliable devices , author=. Advances in Neural Information Processing Systems , volume=

  16. [16]

    Advances in neural information processing systems , volume=

    Can decentralized algorithms outperform centralized algorithms? a case study for decentralized parallel stochastic gradient descent , author=. Advances in neural information processing systems , volume=

  17. [17]

    , author=

    1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs. , author=. Interspeech , volume=. 2014 , organization=

  18. [18]

    Advances in Neural Information Processing Systems , volume=

    EF21: A new, simpler, theoretically better, and practically faster error feedback , author=. Advances in Neural Information Processing Systems , volume=

  19. [19]

    Journal of Machine Learning Research , volume=

    On biased compression for distributed learning , author=. Journal of Machine Learning Research , volume=

  20. [20]

    SIAM review , volume=

    Fastest mixing Markov chain on a graph , author=. SIAM review , volume=. 2004 , publisher=

  21. [21]

    SC22: International Conference for High Performance Computing, Networking, Storage and Analysis , pages=

    Polarfly: A cost-effective and flexible low-diameter topology , author=. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis , pages=. 2022 , organization=

  22. [22]

    IEEE Micro , year=

    Ub-mesh: a hierarchically localized nd-fullmesh datacenter network architecture , author=. IEEE Micro , year=

  23. [23]

    Advances in Neural Information Processing Systems , volume=

    Exponential graph is provably efficient for decentralized deep training , author=. Advances in Neural Information Processing Systems , volume=

  24. [24]

    Advances in neural information processing systems , volume=

    Communication complexity of distributed convex learning and optimization , author=. Advances in neural information processing systems , volume=

  25. [25]

    Journal of machine learning research , volume=

    Multi-consensus decentralized accelerated gradient descent , author=. Journal of machine learning research , volume=

  26. [26]

    Uspekhi Mat

    On the complexity of decentralized optimization via global function parameters , author=. Uspekhi Mat. Nauk , volume=. 2026 , note=

  27. [27]

    Advances in Neural Information Processing Systems , volume=

    Throughput-optimal topology design for cross-silo federated learning , author=. Advances in Neural Information Processing Systems , volume=

  28. [28]

    arXiv preprint arXiv:2603.20735 , year=

    Optimality in Decentralized Optimization under Bandwidth Constraints , author=. arXiv preprint arXiv:2603.20735 , year=

  29. [29]

    SIAM Journal on computing , volume=

    Locality in distributed graph algorithms , author=. SIAM Journal on computing , volume=. 1992 , publisher=

  30. [30]

    2024 , school=

    Local Complexity: New Results and Bridges to Other Fields , author=. 2024 , school=

  31. [31]

    International Conference on Machine Learning , pages=

    Kernel methods for cooperative multi-agent contextual bandits , author=. International Conference on Machine Learning , pages=. 2020 , organization=

  32. [32]

    arXiv preprint arXiv:2604.18615 , year=

    Locality, Not Spectral Mixing, Governs Direct Propagation in Distributed Offline Dynamic Programming , author=. arXiv preprint arXiv:2604.18615 , year=

  33. [33]

    arXiv preprint arXiv:2306.08670 , year=

    Simple Opinion Dynamics for No-Regret Learning , author=. arXiv preprint arXiv:2306.08670 , year=

  34. [34]

    Advances in Neural Information Processing Systems , volume=

    On the optimal time complexities in decentralized stochastic asynchronous optimization , author=. Advances in Neural Information Processing Systems , volume=

  35. [35]

    2024 , school=

    Optimization algorithms for decentralized, distributed and collaborative machine learning , author=. 2024 , school=

  36. [36]

    International Conference on Learning Representations , volume=

    Decentralized optimization with coupled constraints , author=. International Conference on Learning Representations , volume=

  37. [37]

    Journal of Machine Learning Research , volume=

    Accelerated gradient tracking over time-varying graphs for decentralized optimization , author=. Journal of Machine Learning Research , volume=

  38. [38]

    Advances in Neural Information Processing Systems , volume=

    Optimal and practical algorithms for smooth and strongly convex decentralized optimization , author=. Advances in Neural Information Processing Systems , volume=

  39. [39]

    Mathematical Programming , volume=

    Lower bounds for finding stationary points I , author=. Mathematical Programming , volume=. 2020 , publisher=

  40. [40]

    arXiv preprint arXiv:2307.12562 , year=

    Decentralized optimization over slowly time-varying graphs: Algorithms and lower bounds , author=. arXiv preprint arXiv:2307.12562 , year=

  41. [41]

    Theory of cryptography conference , pages=

    Calibrating noise to sensitivity in private data analysis , author=. Theory of cryptography conference , pages=. 2006 , organization=

  42. [42]

    Foundations and trends

    The algorithmic foundations of differential privacy , author=. Foundations and trends. 2014 , publisher=

  43. [43]

    International Conference on Machine Learning , pages=

    The Privacy Power of Correlated Noise in Decentralized Learning , author=. International Conference on Machine Learning , pages=. 2024 , organization=