arxiv: 2509.05288 · v2 · submitted 2025-09-05 · 💻 cs.LG · math.OC

Learning to accelerate distributed ADMM using graph neural networks

Pith reviewed 2026-05-18 18:43 UTC · model grok-4.3

classification: cs.LG · math.OC
keywords: distributed optimization · ADMM · graph neural networks · hyperparameter learning · convergence acceleration · message passing · unrolled optimization · decentralized algorithms

The pith

Distributed ADMM iterations fit inside graph neural network message passing so a network can learn adaptive step sizes and weights for faster convergence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the alternating direction method of multipliers can be rewritten as message passing on a graph, allowing a GNN to predict iteration-specific step sizes and communication weights directly from the current iterates. By unrolling a fixed number of ADMM steps and training the GNN end-to-end to reduce the distance to the solution on a given problem class, the method keeps the original convergence guarantees while improving practical speed. A sympathetic reader would care because distributed optimization appears in large-scale machine learning and control, where slow or hyperparameter-sensitive convergence is a recurring cost. The learned variant is shown to deliver both quicker progress within the training budget and better final accuracy on new instances drawn from the same distribution.

Core claim

Distributed ADMM iterations can be expressed within the message-passing framework of graph neural networks. A GNN is then trained to output adaptive step sizes and communication weights from the current iterates; the combined system is unrolled for a fixed number of steps and optimized end-to-end to minimize solution error after those steps, while the underlying ADMM updates preserve their theoretical convergence properties.
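
In symbols, the training objective described above might be written as follows. The notation is introduced here for illustration and is not taken from the paper: θ denotes the GNN weights, 𝒟 the problem class, K the unrolling horizon, x_θ^K(P) the iterate after K ADMM steps on instance P using the GNN-predicted hyperparameters, and x^⋆(P) the solution of P. The exact distance measure is left unspecified because the text only speaks of "solution distance".

```latex
\min_{\theta} \;
\mathbb{E}_{P \sim \mathcal{D}}
\left[ \operatorname{dist}\!\left( x^{K}_{\theta}(P),\; x^{\star}(P) \right) \right]
```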

What carries the argument

The rewriting of ADMM's local minimization and dual update steps as graph message-passing layers that let the GNN emit per-iteration hyperparameters.
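
To make the message-passing view concrete, here is a minimal sketch in plain Python. It assumes a textbook decentralized consensus ADMM on scalar quadratics f_i(x) = 0.5 * a_i * (x - b_i)^2, which is a simplification and not the paper's Algorithm 2.2. The names `admm_iteration`, `run_admm`, and `schedule` are hypothetical, and `schedule` is a hand-written stand-in for the GNN that would predict the step size at each iteration.

```python
from typing import Callable, Dict, List, Tuple


def admm_iteration(
    x: List[float], alpha: List[float], a: List[float], b: List[float],
    nbrs: Dict[int, List[int]], c: float,
) -> Tuple[List[float], List[float]]:
    """One consensus-ADMM iteration written as two message-passing rounds."""
    n = len(x)
    # Round 1: every node sums the primal iterates received from its neighbors.
    msg = [sum(x[j] for j in nbrs[i]) for i in range(n)]
    # Local minimization (closed form for the quadratic f_i assumed here).
    x_new = [
        (a[i] * b[i] - alpha[i] + c * (len(nbrs[i]) * x[i] + msg[i]))
        / (a[i] + 2.0 * c * len(nbrs[i]))
        for i in range(n)
    ]
    # Round 2: nodes exchange the updated iterates and take a dual step.
    msg_new = [sum(x_new[j] for j in nbrs[i]) for i in range(n)]
    alpha_new = [
        alpha[i] + c * (len(nbrs[i]) * x_new[i] - msg_new[i]) for i in range(n)
    ]
    return x_new, alpha_new


def run_admm(
    a: List[float], b: List[float], nbrs: Dict[int, List[int]],
    step_size: Callable[[int], float], num_iters: int,
) -> List[float]:
    """Run ADMM from zero initialization with an iteration-dependent step size."""
    x = [0.0] * len(a)
    alpha = [0.0] * len(a)
    for k in range(num_iters):
        x, alpha = admm_iteration(x, alpha, a, b, nbrs, step_size(k))
    return x


def schedule(k: int) -> float:
    """Hypothetical stand-in for a learned predictor: adapt early, then fix."""
    return 2.0 if k < 10 else 1.0


# Four nodes on a ring; the minimizer of sum_i f_i is the a-weighted mean of b.
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
a = [1.0, 2.0, 3.0, 4.0]
b = [4.0, -1.0, 2.0, 0.5]
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
x_final = run_admm(a, b, nbrs, schedule, num_iters=500)
```

The schedule reverts to a constant step size after ten iterations, so the tail of the run is ordinary fixed-parameter ADMM. A learned predictor would replace `schedule` with a function of the current node and edge features.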

If this is right

  • Convergence speed improves both inside and outside the exact iteration count used during training.
  • Solution quality after a fixed computational budget is higher than with hand-chosen ADMM parameters.
  • The method retains ADMM convergence guarantees because the learned values are inserted into the original update rules.
  • Generalization holds for new instances sampled from the same problem distribution used in training.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same GNN embedding idea could be applied to other first-order distributed methods that admit a graph interpretation.
  • In deployed systems the learned predictor might replace manual grid search over step sizes and penalty parameters.
  • Robustness can be tested by feeding the trained network instances drawn from a modestly shifted distribution.
  • Hybrid schemes become possible in which the GNN supplies hyperparameters only for an initial block of iterations and then hands control back to standard ADMM.

Load-bearing premise

A GNN trained on a specific problem class will output hyperparameters that keep the algorithm convergent and improve performance on new instances from the same distribution.

What would settle it

Collect a fresh set of problem instances drawn from the training distribution, run both standard ADMM and the learned variant for the same number of iterations, and check whether the learned version consistently reaches a target accuracy in fewer iterations or with lower final error.
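
The protocol above can be sketched on a toy problem. This is an illustration under strong assumptions, not the paper's experiment: the "learned" variant is a single scalar step size chosen by grid search on training instances (a stand-in for GNN training by backpropagation), the problem class is scalar quadratics on a four-node ring, and all names (`run_admm`, `mean_error`, `sample_instance`, `tuned_c`) are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Instance = Tuple[List[float], List[float]]  # (a, b) with f_i(x) = 0.5*a_i*(x-b_i)^2


def run_admm(a: List[float], b: List[float], nbrs: Dict[int, List[int]],
             c: float, num_iters: int) -> List[float]:
    """Decentralized consensus ADMM with a fixed step size c."""
    n = len(a)
    x, alpha = [0.0] * n, [0.0] * n
    for _ in range(num_iters):
        s = [sum(x[j] for j in nbrs[i]) for i in range(n)]
        x = [(a[i] * b[i] - alpha[i] + c * (len(nbrs[i]) * x[i] + s[i]))
             / (a[i] + 2.0 * c * len(nbrs[i])) for i in range(n)]
        s = [sum(x[j] for j in nbrs[i]) for i in range(n)]
        alpha = [alpha[i] + c * (len(nbrs[i]) * x[i] - s[i]) for i in range(n)]
    return x


def mean_error(instances: List[Instance], nbrs: Dict[int, List[int]],
               c: float, num_iters: int) -> float:
    """Average over instances of the worst node's distance to the minimizer."""
    total = 0.0
    for a, b in instances:
        x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
        x = run_admm(a, b, nbrs, c, num_iters)
        total += max(abs(xi - x_star) for xi in x)
    return total / len(instances)


def sample_instance(rng: random.Random, n: int) -> Instance:
    """Draw one instance from the (assumed) problem distribution."""
    return ([rng.uniform(0.5, 2.0) for _ in range(n)],
            [rng.gauss(0.0, 1.0) for _ in range(n)])


rng = random.Random(0)
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
train = [sample_instance(rng, 4) for _ in range(20)]
fresh = [sample_instance(rng, 4) for _ in range(20)]  # never used for tuning

K = 10                      # training horizon
baseline_c = 1.0            # hand-chosen step size
grid = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0]
tuned_c = min(grid, key=lambda c: mean_error(train, nbrs, c, K))

# Compare both variants on fresh instances, at the horizon and beyond it.
report = {
    (name, iters): mean_error(fresh, nbrs, c, iters)
    for name, c in (("baseline", baseline_c), ("tuned", tuned_c))
    for iters in (K, 2 * K)
}
for key, err in sorted(report.items()):
    print(key, f"{err:.3e}")
```

Because the grid contains the baseline, the tuned value cannot do worse on the training set at horizon K. Whether that advantage carries over to the fresh instances, and to 2K iterations, is exactly what the printed report is meant to probe.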

Figures

Figures reproduced from arXiv: 2509.05288 by Daniel Hernández Escobar, Henri Doerks, Jens Sjölund, Paul Häusner.

Figure 1: Computation structure of iteration k of the decentralized distributed ADMM algorithm computed as two message-passing steps in our proposed GNN (see Algorithm 2.2). The initial node features V^k contain the previous ADMM iterates (x_i^k, y_i^k, λ_i^k) for each node i and the non-zero edge features E describe the weighted pairwise connection of nodes in the network which is fixed across iterations. After tw…
Figure 2: Semi-log plots of the relative objective for the two problem classes across all 20 iterations. The network is only unrolled for the first 10 iterations during training as in the previous results. One exception is the learned communication matrix for the least-squares problem which leads to a higher objective value for most of the iterations even though the distance to the minimizer gets reduced as we have …

Abstract

Distributed optimization is fundamental to large-scale machine learning and control applications. Among existing methods, the alternating direction method of multipliers (ADMM) has gained popularity due to its strong convergence guarantees and suitability for decentralized computation. However, ADMM can suffer from slow convergence and high sensitivity to hyperparameter choices. In this work, we show that distributed ADMM iterations can be naturally expressed within the message-passing framework of graph neural networks (GNNs). Building on this connection, we propose learning adaptive step sizes and communication weights through a GNN that predicts these yperparameters based on the current iterates. By unrolling ADMM for a fixed number of iterations, we train the network end-to-end to minimize the solution distance after these iterations for a given problem class, while preserving the algorithm's convergence properties. Numerical experiments demonstrate that our learned variant consistently improves convergence speed and solution quality compared to standard ADMM, both within the trained computational budget and beyond. The code is available at https://github.com/paulhausner/learning-distributed-admm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that distributed ADMM iterations can be expressed as message-passing operations on graphs, allowing a GNN to learn adaptive step sizes and communication weights from current iterates. The network is trained end-to-end by unrolling a fixed number of ADMM steps and minimizing the distance to the optimum for a given problem class, while aiming to preserve convergence properties. Numerical experiments are reported to show consistent improvements in convergence speed and solution quality over standard ADMM, both inside and outside the training horizon.

Significance. If the learned parameters reliably accelerate ADMM without introducing instability, the work would offer a practical bridge between classical distributed optimization and learned accelerators for specific problem distributions. The explicit code release supports reproducibility and external verification of the empirical claims.

major comments (2)
  1. [Abstract and §5] Abstract and §5 (Numerical Experiments): the central claim that improvements hold 'both within the trained computational budget and beyond' rests on fixed-horizon unrolling that minimizes distance only after K steps. No analysis, additional long-horizon experiments, or stability checks are provided to confirm that GNN-predicted state-dependent parameters maintain convergence (e.g., satisfying the standard ADMM conditions on ρ) or prevent oscillation/divergence on fresh instances drawn from the same distribution after the training horizon.
  2. [§4] §4 (Method): while the message-passing reformulation of ADMM is clean, the paper does not derive or verify that the learned, input-dependent weights and step sizes continue to satisfy the convergence assumptions of classical ADMM once the GNN is evaluated outside the exact training distribution or for iteration counts >K.
minor comments (2)
  1. [Abstract] Abstract: 'yperparameters' appears to be a typo for 'hyperparameters'.
  2. [§5] The description of the exact problem distributions used for training versus testing, and any hyperparameter sensitivity analysis, would benefit from additional detail to support the generalization claims.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. Below we provide point-by-point responses to the major comments. We will make revisions to address the concerns about empirical validation and clarification of theoretical aspects.

  1. Referee: [Abstract and §5] Abstract and §5 (Numerical Experiments): the central claim that improvements hold 'both within the trained computational budget and beyond' rests on fixed-horizon unrolling that minimizes distance only after K steps. No analysis, additional long-horizon experiments, or stability checks are provided to confirm that GNN-predicted state-dependent parameters maintain convergence (e.g., satisfying the standard ADMM conditions on ρ) or prevent oscillation/divergence on fresh instances drawn from the same distribution after the training horizon.

    Authors: We agree that the training procedure uses fixed-horizon unrolling. In the experiments of §5, we do evaluate the learned ADMM for a larger number of iterations than K on unseen problem instances and report sustained improvements without observed divergence. However, we acknowledge the absence of dedicated long-horizon analysis or formal stability checks. We will revise the abstract and §5 to emphasize that the 'beyond' claim is based on empirical observation up to a moderate extension of the horizon, and add additional plots showing performance for up to 2K or 3K iterations with discussion of stability. A complete theoretical analysis of convergence for the learned parameters remains an open question and is noted as future work. revision: partial

  2. Referee: [§4] §4 (Method): while the message-passing reformulation of ADMM is clean, the paper does not derive or verify that the learned, input-dependent weights and step sizes continue to satisfy the convergence assumptions of classical ADMM once the GNN is evaluated outside the exact training distribution or for iteration counts >K.

    Authors: The reformulation shows that each ADMM iteration corresponds to a message-passing step where the GNN predicts the step sizes and weights based on local iterates. During training, we constrain the outputs to be positive to mimic the standard ADMM parameter ρ > 0. Nevertheless, we do not derive that these state-dependent predictions necessarily fulfill all classical convergence conditions for every possible input or for arbitrarily large iteration counts. We will update §4 to include a clearer statement that while the structure preserves the ADMM form, the convergence guarantees are inherited only heuristically and are supported by the empirical results rather than proven for the learned case. This limitation will be explicitly discussed. revision: yes
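
One common way to enforce the positivity constraint mentioned in the response is to pass the raw network output through a softplus and add a small floor. The source only says the outputs are constrained to be positive; the softplus choice, the floor value, and the names `softplus` and `to_step_size` are assumptions made for this sketch.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z)); avoids overflow for large |z|."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def to_step_size(raw: float, floor: float = 1e-6) -> float:
    """Map an unconstrained raw output to a strictly positive step size.

    The floor keeps the result positive even when softplus underflows to 0.0
    for very negative inputs.
    """
    return floor + softplus(raw)


# Raw outputs of any sign all map to usable penalty parameters.
step_sizes = [to_step_size(raw) for raw in (-1000.0, -2.0, 0.0, 3.0, 1000.0)]
```

Positivity alone mirrors the requirement ρ > 0 but, as the response itself concedes, it does not by itself establish convergence for state-dependent parameters.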

standing simulated objections not resolved
  • A formal derivation verifying that the GNN-predicted parameters satisfy ADMM convergence assumptions outside the training distribution and for iteration counts exceeding K.

Circularity Check

0 steps flagged

No circularity: training objective independent of performance claims

The paper connects distributed ADMM iterations to GNN message passing as a structural observation, then defines a GNN that outputs state-dependent step sizes and weights. These are trained by unrolling a fixed horizon K and minimizing distance to the optimum after K steps. This loss is independent of the reported long-term convergence improvements and does not reduce the empirical results to the inputs by construction. No self-citation chain, uniqueness theorem, or ansatz is invoked to force the outcome; the central claims rest on numerical experiments for a given problem class, with released code enabling external checks. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the ability of the learned GNN to generalize across problem instances and on the assumption that unrolling a fixed number of iterations is sufficient to capture long-term convergence behavior. No new physical entities are introduced.

free parameters (1)
  • GNN weights
    The parameters of the graph neural network are fitted during end-to-end training to minimize solution distance after a fixed number of ADMM iterations.
axioms (1)
  • domain assumption: Distributed ADMM iterations can be exactly expressed as message-passing operations on a graph.
    This equivalence is the foundation for applying GNNs; it is stated in the abstract and forms the modeling choice.


