Distributed Algorithm for the Global Optimal Controller of Nonlinear Multi-Agent Systems

Juanjuan Xu; Ruixue Li; Wenjing Yang; Xun Li; Zhaorong Zhang

arxiv: 2604.05443 · v1 · submitted 2026-04-07 · 🧮 math.OC

Pith reviewed 2026-05-10 19:36 UTC · model grok-4.3

classification 🧮 math.OC

keywords distributed optimal control · nonlinear multi-agent systems · Hamilton-Jacobi-Bellman equation · private information · global optimal controller · distributed algorithm

The pith

A distributed algorithm computes the global optimal controller for nonlinear multi-agent systems by approximating the Hamilton-Jacobi-Bellman equation using only local information.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates distributed optimal control for nonlinear multi-agent systems where each agent's state and dynamics are private and shared only among communicating agents. This information structure makes traditional methods that use all dynamic structures ineffective. The authors propose a distributed algorithm that approximates the Hamilton-Jacobi-Bellman equation in a distributed way to find the global optimal controller. This approach is relevant for applications like collaborative industrial control where confidentiality is important. Numerical simulations demonstrate that the algorithm is effective in practice.

Core claim

Under practical information structures where state and system dynamics of each agent are private and can only be shared among communicating agents, a distributed algorithm achieves the global optimal controller for nonlinear multi-agent systems through distributed approximation of the Hamilton-Jacobi-Bellman equation.

What carries the argument

Distributed approximation of the Hamilton-Jacobi-Bellman equation, which computes the optimal control policy based on local neighbor communications and private agent dynamics.
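For orientation, the equation being approximated can be written in its textbook form. The sketch below assumes control-affine dynamics and a quadratic cost over a finite horizon; the symbols f, g, h, T and the stacked state x are notational assumptions, since this page does not reproduce the paper's own equations.

```latex
% Assumed setup (not taken from the paper): stacked state x, control-affine dynamics
\dot{x} = f(x) + g(x)\,u, \qquad
J = h\bigl(x(T)\bigr) + \int_0^T \bigl( x^\top Q x + u^\top R u \bigr)\,dt

% HJB equation for the value function V(t,x), with terminal condition V(T,x) = h(x)
-\frac{\partial V}{\partial t}
  = \min_u \Bigl\{ x^\top Q x + u^\top R u
      + (\nabla_x V)^\top \bigl( f(x) + g(x)\,u \bigr) \Bigr\}

% Minimizer, and the equation obtained by substituting it back
u^*(t,x) = -\tfrac{1}{2} R^{-1} g(x)^\top \nabla_x V(t,x)

-\frac{\partial V}{\partial t}
  = x^\top Q x + (\nabla_x V)^\top f(x)
    - \tfrac{1}{4} (\nabla_x V)^\top g(x) R^{-1} g(x)^\top \nabla_x V
```

In this form f and g stack every agent's private dynamics, so no single agent can evaluate the right-hand side alone; that is the gap a distributed approximation has to close.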

If this is right

The global optimal controller is obtained without requiring full access to all agents' dynamics.
Traditional distributed control methods that rely on every agent's dynamic structure are ineffective under this information structure, which motivates the new approach.
Practical numerical simulations confirm the effectiveness of the proposed algorithm.
This enables collaborative control in fields requiring industrial confidentiality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the distributed approximation converges to the centralized solution, it could apply to broader classes of optimal control problems with communication constraints.
Testing the algorithm against known centralized solutions in benchmark multi-agent systems would verify its accuracy.
Extensions might include handling time-varying topologies or adding robustness to communication delays.

Load-bearing premise

A distributed approximation of the Hamilton-Jacobi-Bellman equation can recover the true global optimum when each agent only accesses local neighbor information and private dynamics.
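A much simpler setting shows why neighbor-only exchange can, in principle, recover a global quantity. The sketch below runs a standard consensus averaging iteration on a hypothetical connected, undirected 5-agent graph; it is not the paper's algorithm, and the function name, edge list, step size, and initial values are all assumptions made for illustration.

```python
def neighbor_average(values, edges, step=0.2, iterations=500):
    """Synchronous consensus: each agent moves toward its neighbors' values.

    Assumes an undirected, connected graph and step < 1 / (max degree).
    """
    n = len(values)
    neighbors = {i: [] for i in range(n)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    x = list(values)
    for _ in range(iterations):
        # Every agent uses only its own value and those of its neighbors.
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x


# Hypothetical graph (0-based agent indices) and private local values.
edges = [(0, 1), (1, 2), (1, 4), (3, 4)]
local = [4.0, -1.0, 2.5, 0.0, 7.0]

result = neighbor_average(local, edges)
global_mean = sum(local) / len(local)
print(result, global_mean)
```

Averaging is linear, which is what makes this convergence easy to establish. Whether an analogous argument carries over to a nonlinear HJB approximation is exactly the point the premise leaves open.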

What would settle it

Compare the controller generated by the distributed algorithm to the one obtained from solving the centralized Hamilton-Jacobi-Bellman equation for a known nonlinear multi-agent system and check if they produce equivalent performance.
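A minimal sketch of such a benchmark, assuming the simplest case with a known centralized answer: a scalar linear-quadratic problem, where the stationary HJB equation with V(x) = p x² reduces to a Riccati equation with a closed-form root. The iterative solver stands in for whatever approximate method is under test; it is not the paper's algorithm, and the function names and parameter values are hypothetical.

```python
import math


def riccati_closed_form(a, b, q, r):
    """Positive root of 2*a*p - (b**2 / r) * p**2 + q = 0 (reference solution)."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def riccati_by_iteration(a, b, q, r, dt=1e-3, tol=1e-12, max_steps=1_000_000):
    """Explicit Euler on dp/ds = 2*a*p - (b**2 / r) * p**2 + q, starting at p = 0."""
    p = 0.0
    for _ in range(max_steps):
        p_next = p + dt * (2 * a * p - (b * b / r) * p * p + q)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("iteration did not converge")


# Hypothetical scalar system dx/dt = a*x + b*u with cost integrand q*x**2 + r*u**2.
a, b, q, r = 1.0, 1.0, 1.0, 0.01

p_star = riccati_closed_form(a, b, q, r)
p_iter = riccati_by_iteration(a, b, q, r)

# Feedback gains u = -k*x implied by each value function.
k_star = b * p_star / r
k_iter = b * p_iter / r
print(p_star, p_iter, abs(k_star - k_iter))
```

For the paper's nonlinear multi-agent setting the reference would have to come from a centralized numerical HJB solver rather than a closed form, but the comparison has the same shape: compute both controllers, then measure the gap in gains or in achieved cost.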

Figures

Figures reproduced from arXiv: 2604.05443 by Juanjuan Xu, Ruixue Li, Wenjing Yang, Xun Li, Zhaorong Zhang.

**Figure 2.** …, and the weighting matrices of the cost function (2) are set as $Q = \begin{bmatrix} 2I & -2I & 0 & 0 & 0 \\ -2I & 6I & -2I & 0 & -2I \\ 0 & -2I & 2I & 0 & 0 \\ 0 & 0 & 0 & 2I & -2I \\ 0 & -2I & 0 & -2I & 4I \end{bmatrix}$, $R = 0.01I$.

**Figure 3.** The trajectories of …

**Figure 4.** The trajectories of …

**Figure 5.** The trajectories of …

**Figure 6.** The trajectories of …

**Figure 7.** The trajectories of …
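The weighting matrix quoted in the Figure 2 caption has a recognizable structure. The sketch below checks one reading of it: the scalar coefficients of the blocks form twice the Laplacian of an undirected 5-agent graph, which would make x'Qx a sum of 2‖x_i − x_j‖² penalties over the graph's edges. This interpretation is ours, inferred from the extracted entries, and assumes those entries are correct; the variable names are hypothetical.

```python
# Scalar coefficients of the 5x5 block matrix Q (each entry multiplies I).
C = [
    [ 2, -2,  0,  0,  0],
    [-2,  6, -2,  0, -2],
    [ 0, -2,  2,  0,  0],
    [ 0,  0,  0,  2, -2],
    [ 0, -2,  0, -2,  4],
]
n = len(C)

# A Laplacian-type matrix is symmetric with zero row sums.
is_symmetric = all(C[i][j] == C[j][i] for i in range(n) for j in range(n))
row_sums = [sum(row) for row in C]

# Negative off-diagonal entries mark edges (agents numbered from 1).
edges = [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if C[i][j] < 0]

# Each diagonal entry should equal 2 * (number of incident edges).
degrees = [sum(1 for e in edges if k in e) for k in range(1, n + 1)]
diagonal = [C[i][i] for i in range(n)]

print(is_symmetric, row_sums, edges, degrees)
```

Under this reading the cost penalizes disagreement between neighboring agents only, which is consistent with a consensus-type objective on a connected graph.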

Original abstract

In this paper, we investigate the distributed optimal control problem for a kind of nonlinear multi-agent systems. In particular,both the state and the system dynamic structures of each agent are private and can only be shared among communicating agents.This type of information structure is inevitable in fields such as collaborative control for industrial confidentiality, and renders traditional distributed control methods using all systems' dynamic structures ineffective. The primary contribution is the proposal of a distributed algorithm for the global optimal controller under such practical information structure via distributed approximation of the Hamilton-Jacobi-Bellman equation. Practical numerical simulation demonstrates the effectiveness of the proposed algorithm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a distributed HJB approximation for nonlinear MAS optimal control under private dynamics, but supplies no convergence proof or error bound so the global-optimality claim rests on simulation alone.

The core idea is a distributed algorithm that lets each agent approximate the Hamilton-Jacobi-Bellman equation using only neighbor information and its own private dynamics, then applies the result to compute a global controller for the multi-agent system. This directly targets the practical constraint that full system matrices and states cannot be shared in many industrial settings. The numerical example shows the closed-loop trajectories behave reasonably, which is the main concrete output the authors provide.

That is the useful piece: it translates an existing approximation technique to a privacy-preserving information structure that standard distributed control papers usually ignore. The construction itself appears incremental rather than foundational; the HJB approximation step follows the usual policy-iteration or value-iteration pattern, and the novelty is mainly in how the communication graph is used to keep dynamics local.

The soft spot is the missing theory. The abstract and the stress-test note both highlight the absence of any derivation, convergence rate, or invariance argument showing that the local approximations recover the true centralized optimum. Without those, the simulation cannot rule out the possibility that the method converges to a different value function or drifts under the stated neighbor-only access. Minor implementation details such as how the distributed updates are initialized or terminated are also not visible.

This work is aimed at researchers who already work on distributed optimal control for nonlinear agents and need to accommodate privacy constraints. A reader looking for a ready-to-use algorithm with guarantees will be disappointed, but someone hunting for ideas on how to decentralize HJB solvers might extract a useful construction.

The paper deserves a serious referee because the problem statement is timely and the simulation offers a starting point; the referees can then press for the missing analysis rather than desk-rejecting on scope alone. I would send it out for review with the expectation that substantial revision on the convergence question will be required.

Referee Report

1 major / 2 minor

Summary. The paper investigates the distributed optimal control problem for nonlinear multi-agent systems in which each agent's state and dynamics are private and shared only among neighbors. It proposes a distributed algorithm that approximates the Hamilton-Jacobi-Bellman equation locally to compute the global optimal controller under this limited information structure, with effectiveness illustrated by numerical simulations.

Significance. A rigorously justified distributed method for global optimality in nonlinear MAS under confidentiality constraints would be valuable for practical collaborative control applications. The numerical demonstration indicates potential feasibility, but the absence of any convergence analysis or error bounds for the distributed HJB approximation substantially weakens the significance of the claimed contribution.

major comments (1)

The central claim that the distributed HJB approximation recovers the exact global optimal controller (despite each agent accessing only local neighbor information and private dynamics) is load-bearing yet unsupported: the manuscript contains no convergence proof, invariance argument, or error bound establishing that local value-function approximations converge to the centralized HJB solution.

minor comments (2)

Abstract: missing space after 'In particular,' ('In particular,both').
The abstract refers to 'practical numerical simulation' but provides no description of the simulation setup, system dimensions, or performance metrics used to demonstrate effectiveness.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comments and the opportunity to improve our manuscript. We address the major comment as follows.

Referee: The central claim that the distributed HJB approximation recovers the exact global optimal controller (despite each agent accessing only local neighbor information and private dynamics) is load-bearing yet unsupported: the manuscript contains no convergence proof, invariance argument, or error bound establishing that local value-function approximations converge to the centralized HJB solution.

Authors: We acknowledge that the manuscript does not include a formal convergence proof or error bounds for the distributed HJB approximation. The algorithm is constructed such that the local approximations are designed to align with the global HJB solution through neighbor information exchange, and its effectiveness is illustrated via numerical simulations. To address this valid concern, we will revise the manuscript to include an invariance argument and error bounds, assuming a connected undirected graph and bounded approximation errors in the value function. This will rigorously justify the recovery of the global optimal controller.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in the proposed distributed HJB approximation

The paper presents a new distributed algorithm for global optimal control of nonlinear MAS under private state and dynamics information, constructed via approximation of the HJB equation. The abstract and description frame this as an original proposal without any equations, fitted parameters, or self-citations that reduce the output to the inputs by construction. No load-bearing steps are shown to be self-definitional, renamed known results, or forced by prior author work. The derivation is self-contained as a constructive method, consistent with the reader's assessment of no obvious circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The proposal rests on the standard existence of solutions to the HJB equation for the given nonlinear systems and on the assumption that neighbor communication suffices for the distributed approximation to converge; no free parameters or new entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: The nonlinear multi-agent system admits a solution to the Hamilton-Jacobi-Bellman equation under the given information structure.
Invoked implicitly when claiming that distributed approximation yields the global optimal controller.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "distributed approximation of the Hamilton-Jacobi-Bellman equation... each agent only utilizes information from its own information structure S_i(t)"

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat induction · unclear
Paper passage: "value iteration... lim k→∞ V^{k+1}(t,x) = V(t,x)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

[1] Chen, W.-H., Ballance, D. J., & Gawthrop, P. J. (2003). Optimal control of nonlinear systems: A predictive control approach. Automatica, 39(4), 633–641.

[2] Chen, F., & Ren, W. (2019). On the control of multi-agent systems: A survey. Foundations and Trends in Systems and Control, 6(4), 339–499.

[3] Bryson, A. E., Ho, Y.-C., & Siouris, G. M. (1979). Applied optimal control: Optimization, estimation, and control [Book review]. IEEE Transactions on Systems, Man, and Cybernetics, 9(6), 366–367.

[4] Govindarajan, N., de Visser, C. C., & Krishnakumar, K. (2014). A sparse collocation method for solving time-dependent HJB equations using multivariate B-spline. Automatica, 50(9), 2234–2244.

[5] Sideris, A., & Bobrow, J. E. (2005). An efficient sequential linear quadratic algorithm for solving nonlinear optimal control problems. IEEE Transactions on Automatic Control, 50(12), 2043–2047.

[6] Lin, Q., Loxton, R., & Teo, K. L. (2014). The control parameterization method for nonlinear optimal control: A survey. Journal of Industrial and Management Optimization, 10(1), 275–309.

[7] Bertsekas, D. P. (2017). Value and policy iterations in optimal control and adaptive dynamic programming. IEEE Transactions on Neural Networks and Learning Systems, 28(3), 500–509.

[8] Tang, G., & Hauser, K. (2019). A data-driven indirect method for nonlinear optimal control. Astrodynamics, 3(4), 345–359.

[9] Bian, T., & Jiang, Z.-P. (2022). Reinforcement learning and adaptive optimal control for continuous-time nonlinear systems: A value iteration approach. IEEE Transactions on Neural Networks and Learning Systems, 33(7), 2781–2790.

[10] Leeman, A. P., Köhler, J., Zanelli, A., Bennani, S., & Zeilinger, M. N. (2025). Robust nonlinear optimal control via system level synthesis. IEEE Transactions on Automatic Control, 70(7), 4780–4787.

[11] Chaves, L. S., Callegari, J. M. S., Araujo, L. S., & Brandao, D. I. (2025). Impact of latency and packet error on communication in centralized microgrid control: Modeling and guidelines. IEEE Access, 13, 82732–82746.

[12] Sforni, L., Carnevale, G., & Notarstefano, G. (2025). A distributed feedback-based framework for nonlinear aggregative optimal control. IEEE Transactions on Automatic Control, 70(6), 3784–3799.

[13] Zhang, F., Tan, C., Wang, W., & Zhang, H. (2015). Approximate method of distributed control for continuous-time multi-agent systems. In Proceedings of the 34th Chinese Control Conference (CCC) (pp. 6974–6979).

[14] Cao, Y., Yu, W., Ren, W., & Chen, G. (2013). An overview of recent progress in the study of distributed multi-agent coordination. IEEE Transactions on Industrial Informatics, 9(1), 427–438.

[15] Yi, P., Hong, Y., & Liu, F. (2016). Initialization-free distributed algorithms for optimal resource allocation with feasibility constraints and application to economic dispatch of power systems. Automatica, 74, 259–269.

[16] Jin, N., Xu, J., & Zhang, H. (2023). Distributed optimal consensus control of multiagent systems involving state and control dependent multiplicative noise. IEEE Transactions on Automatic Control, 68(12), 7787–7794.

[17] Wang, Q., Duan, Z., & Wang, J. (2020). Distributed optimal consensus control algorithm for continuous-time multi-agent systems. IEEE Transactions on Circuits and Systems II: Express Briefs, 67(1), 102–106.

[18] Movric, K. H., & Lewis, F. L. (2014). Cooperative optimal control for multi-agent systems on directed graph topologies. IEEE Transactions on Automatic Control, 59(3), 769–774.

[19] Motee, N., & Jadbabaie, A. (2008). Optimal control of spatially distributed systems. IEEE Transactions on Automatic Control, 53(7), 1616–1629.

[20] Yang, W., Zhang, Z., & Xu, J. (2025). Distributed Solving of Linear Quadratic Optimal Controller with Terminal State Constraint. arXiv Preprint arXiv:2504.05631.

[21] Mutoh, Y., & Kuribara, S. (2016). Control of quadrotor unmanned aerial vehicles using exact linearization technique with the static state feedback. Journal of Automation and Control Engineering, 340–346.

[22] Guo, X., Wei, G., Yao, M., & Zhang, P. (2022). Consensus control for multiple Euler-Lagrange systems based on high-order disturbance observer: an event-triggered approach. IEEE/CAA Journal of Automatica Sinica, 9(5), 945–948.

[23] Peng, Z., Wang, D., Li, T., & Han, M. (2020). Output-feedback cooperative formation maneuvering of autonomous surface vehicles with connectivity preservation and collision avoidance. IEEE Transactions on Cybernetics, 50(6), 2527–2535.

[24] Borrelli, F., & Keviczky, T. (2008). Distributed LQR design for identical dynamically decoupled systems. IEEE Transactions on Automatic Control, 53(8), 1901–1912.

[25] Beppu, H., Maruta, I., & Fujimoto, K. (2020). Approximate Dynamic Programming with Gaussian Processes for Optimal Control of Continuous-Time Nonlinear Systems. IFAC-PapersOnLine, 53(2), 6715–6722.

[26] Ren, W., & Sorensen, N. (2008). Distributed coordination architecture for multi-robot formation control. Robotics and Autonomous Systems, 56(4), 324–333.

[27] Consolini, L., Morbidi, F., Prattichizzo, D., & Tosques, M. (2008). Leader–follower formation control of nonholonomic mobile robots with input constraints. Automatica, 44(5), 1343–1349.

[28] Chen, H. (2002). Stochastic approximation and its applications. Kluwer.

[29] Saber, R. O., & Murray, R. M. (2003). Consensus protocols for networks of dynamic agents. In Proceedings of the 2003 American Control Conference (pp. 951–956).
