arxiv: 2605.13269 · v1 · submitted 2026-05-13 · 📡 eess.SY · cs.SY

Recognition: unknown

Submodular Multi-Agent Policy Learning for Online Distributed Task Allocation in Open Multi-Agent Systems

Jing Liu , Yangyang Yang , Luca Ballotta , Fangfei Li , Yang Tang , Ruggero Carli

Authors on Pith no claims yet

Pith reviewed 2026-05-14 18:41 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords multi-agent reinforcement learningsubmodular optimizationtask allocationpolicy gradientdynamic regretpartition matroid

0 comments

The pith

The Partition Multilinear Extension supplies unbiased marginal gradients from submodular difference rewards for decentralized categorical policies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows how to adapt policy gradients to multi-agent systems where agents choose from categorical distributions and the team utility is submodular. The key step is defining the Partition Multilinear Extension, a continuous surrogate whose expected value matches the utility obtained from the agents' independent categorical choices. Submodular difference rewards then give unbiased estimates of the marginal gradients of this extension, which support a projected stochastic gradient method with a 1/2-approximation guarantee per stage. The resulting algorithm, SubMAPG, runs in a centralized-training decentralized-execution mode and extends to open systems via graph neural networks, delivering sublinear dynamic regret when the optimal marginals change slowly.

Core claim

The paper establishes the Partition Multilinear Extension as the right continuous relaxation for expected team utility under factorized categorical policies, proves that submodular difference rewards yield unbiased estimators of its marginal gradients, and uses these estimators to obtain a stagewise 1/2-approximation and sublinear dynamic regret for the projected dynamics in slowly varying environments.

What carries the argument

Partition Multilinear Extension (PME): a continuous relaxation of expected utility under independent categorical policies that admits unbiased marginal gradients from submodular difference rewards.

If this is right

The SubMAPG framework achieves a stagewise 1/2-approximation to the optimal PME value.
It guarantees sublinear dynamic regret bounded by the path length of the optimal marginals.
Graph neural network policies enable handling of time-varying numbers of agents and targets.
Empirical results show superiority over local greedy and shared-reward methods in coverage and tracking tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same relaxation technique may apply to other partition-constrained problems outside reinforcement learning.
Dynamic regret analysis could be used to tune learning rates in real-time robotic systems.
Combining PME with other reward shaping methods might further reduce variance in multi-agent settings.

Load-bearing premise

The team utility function must be submodular for the difference rewards to supply unbiased gradient information.

What would settle it

A direct measurement showing that the policy-gradient estimator has nonzero bias when the utility function is not submodular.

Figures

Figures reproduced from arXiv: 2605.13269 by Fangfei Li, Jing Liu, Luca Ballotta, Ruggero Carli, Yang Tang, Yangyang Yang.

**Figure 1.** Figure 1: An open multi-agent system for task allocation. Mobile sensors collaboratively track mobile targets under time-varying agent populations, dynamic tasks, limited communication, and bounded sensing. A central difficulty in these problems is that team utilities are typically non-additive across agents. For example, overlapping fields of view in [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗

**Figure 2.** Figure 2: Main empirical results. Top row: information coverage on the Gaussian Process and [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: The Actor-Critic architecture of SubMAPG-G. The [PITH_FULL_IMAGE:figures/full_fig_p036_3.png] view at source ↗

**Figure 4.** Figure 4: Training performance in the closed information coverage environment under three density [PITH_FULL_IMAGE:figures/full_fig_p038_4.png] view at source ↗

**Figure 5.** Figure 5: Coverage performance and regret in the closed information coverage environment. Columns [PITH_FULL_IMAGE:figures/full_fig_p039_5.png] view at source ↗

**Figure 6.** Figure 6: Coverage performance and regret in the open information coverage environment with [PITH_FULL_IMAGE:figures/full_fig_p039_6.png] view at source ↗

**Figure 7.** Figure 7: Tracking performance in open environments under different initial population sizes. [PITH_FULL_IMAGE:figures/full_fig_p041_7.png] view at source ↗

**Figure 8.** Figure 8: Reward ablation in the tracking task. SubMAPG-M uses the agent-wise difference reward [PITH_FULL_IMAGE:figures/full_fig_p041_8.png] view at source ↗

**Figure 9.** Figure 9: Zero-shot scalability from 12 to 48 agents and targets using a policy trained on systems [PITH_FULL_IMAGE:figures/full_fig_p042_9.png] view at source ↗

**Figure 10.** Figure 10: Tracking performance under different target-motion compositions. Columns correspond [PITH_FULL_IMAGE:figures/full_fig_p043_10.png] view at source ↗

**Figure 11.** Figure 11: Tracking performance under different communication ranges with fixed sensing radius [PITH_FULL_IMAGE:figures/full_fig_p043_11.png] view at source ↗

read the original abstract

This paper studies multi-agent reinforcement learning with submodular team utilities for online distributed task allocation. In this setting, each agent selects one action from a local categorical policy, so feasible joint actions form a partition matroid over agent-action pairs. Classical multilinear extensions use independent Bernoulli sampling and therefore do not match the categorical policies executed by decentralized agents. To address this mismatch, we introduce the Partition Multilinear Extension (PME), a continuous relaxation whose value equals the expected team utility under factorized categorical policies. We prove that submodular difference rewards provide unbiased PME marginal-gradient information and yield a stagewise score-function policy-gradient estimator. Based on this connection, we propose SubMAPG, a centralized-training decentralized-execution policy-gradient framework with masked categorical policies and submodular difference-reward training signals. For the associated PME marginal-space projected stochastic-gradient dynamics, we prove a stagewise 1/2-approximation guarantee and sublinear dynamic regret in slowly varying environments, measured by the path length of the optimal PME marginals. To handle open systems with time-varying agents and targets, we instantiate SubMAPG with graph neural network policies. Experiments on multi-robot coverage and multi-target tracking show that SubMAPG outperforms local greedy and shared-reward baselines and is competitive with centralized myopic greedy strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces the Partition Multilinear Extension to match categorical policies in submodular multi-agent settings and proves 1/2-approximation plus sublinear regret for the fixed-agent case, but the open-system claims rest on an unanalyzed GNN extension.

read the letter

The main takeaway is that this paper gives a tailored continuous relaxation, the Partition Multilinear Extension, so that submodular team utilities can be optimized with the exact categorical policies that decentralized agents execute. They connect it to submodular difference rewards to produce unbiased marginal gradients and then run projected SGD on the marginals to obtain a stagewise 1/2-approximation and sublinear dynamic regret measured by path length of the optimal marginals, at least when the agent count is fixed.

Referee Report

1 major / 2 minor

Summary. The paper introduces the Partition Multilinear Extension (PME) to relax expected team utility under factorized categorical policies for submodular multi-agent task allocation. It proposes SubMAPG, a CTDE policy-gradient algorithm using submodular difference rewards to obtain unbiased PME marginal gradients, proves a stagewise 1/2-approximation guarantee together with sublinear dynamic regret (w.r.t. path length of optimal PME marginals) for projected SGD on the fixed-dimensional marginal space, and instantiates the method with GNN policies to accommodate open systems with time-varying agents and targets. Experiments on multi-robot coverage and multi-target tracking report outperformance relative to local greedy and shared-reward baselines.

Significance. If the guarantees hold, the work supplies a principled continuous relaxation and regret analysis for submodular MARL under partition-matroid constraints, together with a practical GNN extension for open environments. The explicit link between submodular difference rewards and unbiased marginal gradients, plus the stagewise approximation result, would be useful for distributed robotics and sensor-network applications.

major comments (1)

[Theoretical analysis and abstract] The stagewise 1/2-approximation and sublinear dynamic-regret bounds are derived for projected SGD on the PME marginal space under a fixed number of agents (fixed dimension and fixed product-of-simplices feasible set). The open-system regime advertised in the title and abstract changes both the dimension and the feasible set over time via GNN policies, yet the manuscript presents the GNN instantiation only as a practical extension without extending the path-length regret analysis or the projection argument to the time-varying case. This is load-bearing for the central claim of handling open multi-agent systems.

minor comments (2)

[Experiments] The experimental section should report error bars, number of independent runs, and any statistical significance tests for the performance comparisons.
[Assumptions and open-system extension] Clarify whether the submodularity assumption on the team utility must be re-verified after each change in agent or target count in the open-system setting.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. The central theoretical results are derived under a fixed-agent assumption, and we will revise the manuscript to clarify the scope of the guarantees while retaining the practical GNN extension for open systems.

read point-by-point responses

Referee: [Theoretical analysis and abstract] The stagewise 1/2-approximation and sublinear dynamic-regret bounds are derived for projected SGD on the PME marginal space under a fixed number of agents (fixed dimension and fixed product-of-simplices feasible set). The open-system regime advertised in the title and abstract changes both the dimension and the feasible set over time via GNN policies, yet the manuscript presents the GNN instantiation only as a practical extension without extending the path-length regret analysis or the projection argument to the time-varying case. This is load-bearing for the central claim of handling open multi-agent systems.

Authors: We agree that the 1/2-approximation and sublinear dynamic-regret bounds are proved for projected SGD on a fixed-dimensional PME marginal space (fixed product of simplices). The GNN policy is introduced as a practical mechanism that handles time-varying agent and target cardinalities via dynamic graph construction and action masking, without a corresponding extension of the regret analysis to varying dimensions. We will revise the abstract, title, and introduction to state explicitly that the approximation and regret guarantees apply to the fixed-agent setting, while the open-system experiments demonstrate empirical performance. Extending the path-length regret analysis to time-varying dimensions is a nontrivial open question that we leave for future work; the current analysis supplies the necessary foundation via the PME and submodular-difference-reward connection. revision: partial

Circularity Check

0 steps flagged

PME definition and regret analysis are self-contained from first principles with no reduction to inputs

full rationale

The paper introduces the Partition Multilinear Extension (PME) explicitly as the continuous relaxation whose value equals the expected team utility under factorized categorical policies. It proves that submodular difference rewards supply unbiased PME marginal-gradient information, yielding a score-function estimator. The stagewise 1/2-approximation guarantee and sublinear dynamic regret (w.r.t. path length of optimal PME marginals) are then derived for the projected SGD dynamics in the fixed-dimensional marginal space. These steps follow directly from the submodularity assumption and standard projected-gradient analysis without any fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations. The GNN instantiation for open systems is presented as a practical extension rather than a theoretical extension of the bounds, but this does not create circularity in the core derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on the domain assumption that the team utility is submodular and on the newly introduced PME relaxation; no free parameters or invented entities with independent evidence are stated.

axioms (1)

domain assumption The team utility function is submodular
Required for difference rewards to yield unbiased marginal gradients with respect to the PME.

invented entities (1)

Partition Multilinear Extension (PME) no independent evidence
purpose: Continuous relaxation whose value equals expected team utility under factorized categorical policies
Newly defined to resolve the mismatch between standard multilinear extensions and categorical policies over partition matroids.

pith-pipeline@v0.9.0 · 5550 in / 1399 out tokens · 71409 ms · 2026-05-14T18:41:34.456389+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

138 extracted references · 6 canonical work pages · 2 internal anchors

[1]

IEEE Transactions on Automatic Control , year=

Optimization and learning in open multi-agent systems , author=. IEEE Transactions on Automatic Control , year=
[2]

Communication- and Computation-Efficient Distributed Submodular Optimization in Robot Mesh Networks , year=

Xu, Zirui and Garimella, Sandilya Sai and Tzoumas, Vasileios , journal=. Communication- and Computation-Efficient Distributed Submodular Optimization in Robot Mesh Networks , year=
[3]

Automatica , volume=

Minimax Persistent Monitoring of a network system , author=. Automatica , volume=. 2023 , publisher=

2023
[4]

Automatica , volume=

A sub-modular receding horizon solution for mobile multi-agent persistent monitoring , author=. Automatica , volume=. 2021 , publisher=

2021
[5]

IEEE Transactions on Robotics , volume=

Long-horizon multi-robot rearrangement planning for construction assembly , author=. IEEE Transactions on Robotics , volume=. 2022 , publisher=

2022
[6]

Threshold Bundle-based Task Allocation for Multiple Aerial Robots , volume =

Li, Teng and Shin, Hyo-Sang and Tsourdos, Antonios , journal =. Threshold Bundle-based Task Allocation for Multiple Aerial Robots , volume =. 2020 , publisher=

2020
[7]

IEEE Transactions on Circuits and Systems I: Regular Papers , volume=

Minimal leader selection in general linear multi-agent systems with switching topologies: Leveraging submodularity ratio , author=. IEEE Transactions on Circuits and Systems I: Regular Papers , volume=. 2023 , publisher=

2023
[8]

IEEE Transactions on Automatic Control , volume=

Utility design for distributed resource allocation—part I: Characterizing and optimizing the exact price of anarchy , author=. IEEE Transactions on Automatic Control , volume=. 2019 , publisher=

2019
[9]

IEEE Transactions on Automatic Control , volume=

Randomized greedy sensor selection: Leveraging weak submodularity , author=. IEEE Transactions on Automatic Control , volume=. 2020 , publisher=

2020
[10]

IEEE Transactions on Automatic Control , volume=

Robust maximization of correlated submodular functions under cardinality and matroid constraints , author=. IEEE Transactions on Automatic Control , volume=. 2021 , publisher=

2021
[11]

Aerospace Science and Technology , volume=

An iterative strategy for task assignment and path planning of distributed multiple unmanned aerial vehicles , author=. Aerospace Science and Technology , volume=. 2019 , publisher=

2019
[12]

IEEE Transactions on Automatic Control , volume=

Jacobi-style iteration for distributed submodular maximization , author=. IEEE Transactions on Automatic Control , volume=. 2022 , publisher=

2022
[13]

IEEE Transactions on Robotics , volume=

Risk-aware submodular optimization for multirobot coordination , author=. IEEE Transactions on Robotics , volume=. 2022 , publisher=

2022
[14]

Distributed Matching-By-Clone

Samiei, Arezoo and Sun, Liang , journal=. Distributed Matching-By-Clone. 2024 , publisher=

2024
[15]

IEEE Transactions on Mobile Computing , volume=

Distributed energy-efficient multi-UAV navigation for long-term communication coverage by deep reinforcement learning , author=. IEEE Transactions on Mobile Computing , volume=. 2019 , publisher=

2019
[16]

IEEE Transactions on Cybernetics , volume=

Solving dynamic multiobjective problem via autoencoding evolutionary search , author=. IEEE Transactions on Cybernetics , volume=. 2020 , publisher=

2020
[17]

IEEE Transactions on Automatic Control , volume=

Event-triggered fixed-time attitude consensus with fixed and switching topologies , author=. IEEE Transactions on Automatic Control , volume=. 2022 , publisher=

2022
[18]

An adaptive distributed auction algorithm and its application to multi-

Wang, Yu and Li, HuiPing and Yao, Yao , journal=. An adaptive distributed auction algorithm and its application to multi-. 2023 , publisher=

2023
[19]

IEEE Transactions on Cognitive and Developmental Systems , year=

Matching-Based Capture-the-Flag Games for Multi-Agent Systems , author=. IEEE Transactions on Cognitive and Developmental Systems , year=
[20]

IEEE Transactions on Robotics , volume=

Consensus-based decentralized auctions for robust task allocation , author=. IEEE Transactions on Robotics , volume=. 2009 , publisher=

2009
[21]

IFAC-PapersOnLine , volume=

Greedy Decentralized Auction-based Task Allocation for Multi-Agent Systems , author=. IFAC-PapersOnLine , volume=. 2021 , publisher=

2021
[22]

IEEE Transactions on Robotics , volume=

Anonymous hedonic game for task allocation in a large-scale multiple agent system , author=. IEEE Transactions on Robotics , volume=. 2018 , publisher=

2018
[23]

2021 American Control Conference (ACC) , pages=

Decentralized game-theoretic control for dynamic task allocation problems for multi-agent systems , author=. 2021 American Control Conference (ACC) , pages=. 2021 , organization=

2021
[24]

IEEE Transactions on Robotics , volume=

A hierarchical coordination framework for joint perception-action tasks in composite robot teams , author=. IEEE Transactions on Robotics , volume=. 2021 , publisher=

2021
[25]

Yang, Mingyu and Zhao, Jian and Hu, Xunhan and Zhou, Wengang and Zhu, Jiangcheng and Li, Houqiang , journal=
[26]

IFAC-PapersOnLine , pages =

Distributed Greedy Algorithm for Satellite Assignment Problem with Submodular Utility Function , volume =. IFAC-PapersOnLine , pages =. 2015 , publisher=

2015
[27]

Automatica , volume=

Distributed greedy algorithm for multi-agent task assignment problem with submodular utility functions , author=. Automatica , volume=. 2019 , publisher=

2019
[28]

Automatica , volume=

Exploiting submodularity to quantify near-optimality in multi-agent coverage problems , author=. Automatica , volume=. 2019 , publisher=

2019
[29]

IEEE Transactions on Automatic Control , volume=

Utility design for distributed resource allocation—part II: applications to submodular, covering, and supermodular problems , author=. IEEE Transactions on Automatic Control , volume=. 2021 , publisher=

2021
[30]

Automatica , volume=

Leader selection in networks under switching topologies with antagonistic interactions , author=. Automatica , volume=. 2022 , publisher=

2022
[31]

Automatica , volume=

Distributed strategy selection: A submodular set function maximization approach , author=. Automatica , volume=. 2023 , publisher=

2023
[32]

Distributed multirobot task assignment via consensus

Shorinwa, Ola and Haksar, Ravi N and Washington, Patrick and Schwager, Mac , journal=. Distributed multirobot task assignment via consensus. 2023 , publisher=

2023
[33]

Fulkerson , author=

Mathematical programming study 8 - Polyhedral Combinatorics-Dedicated to the memory of D.R. Fulkerson , author=. Mathematical Programming , volume =
[34]

An analysis of approximations for maximizing submodular set functions—

Nemhauser, George L and Wolsey, Laurence A and Fisher, Marshall L , journal=. An analysis of approximations for maximizing submodular set functions—. 1978 , publisher=

1978
[35]

An analysis of approximations for maximizing submodular set functions—

Fisher, Marshall L and Nemhauser, George L and Wolsey, Laurence A , booktitle=. An analysis of approximations for maximizing submodular set functions—. 2009 , publisher=

2009
[36]

IEEE Transactions on Automatic Control , volume=

Robust and adaptive sequential submodular optimization , author=. IEEE Transactions on Automatic Control , volume=. 2022 , publisher=

2022
[37]

Oceanographic Literature Review , volume=

Fundamentals of fluid mechanics , author=. Oceanographic Literature Review , volume=
[38]

IEEE Transactions on Automatic Control , year=

Submodular maximization with limited function access , author=. IEEE Transactions on Automatic Control , year=
[39]

IEEE Transactions on Robotics , volume=

Robust task scheduling for heterogeneous robot teams under capability uncertainty , author=. IEEE Transactions on Robotics , volume=. 2022 , publisher=

2022
[40]

Automatica , volume=

Event-triggered attitude consensus with absolute and relative attitude measurements , author=. Automatica , volume=. 2020 , publisher=

2020
[41]

IEEE Transactions on Robotics , volume=

Resilient active information acquisition with teams of robots , author=. IEEE Transactions on Robotics , volume=. 2021 , publisher=

2021
[42]

Automatica , volume=

Policies for risk-aware sensor data collection by mobile agents , author=. Automatica , volume=. 2022 , publisher=

2022
[43]

arXiv preprint arXiv:2103.12370 , year=

Multi-Robot Task Allocation--Complexity and Approximation , author=. arXiv preprint arXiv:2103.12370 , year=

work page arXiv
[44]

1999 , publisher=

An introduction to the mathematics and methods of astrodynamics , author=. 1999 , publisher=

1999
[45]

Tractability , volume=

Submodular function maximization , author=. Tractability , volume=
[46]

Journal of Combinatorial Optimization , volume=

Approximation for maximizing monotone non-decreasing set functions with a greedy method , author=. Journal of Combinatorial Optimization , volume=. 2016 , publisher=

2016
[47]

Automatica , volume=

A new performance bound for submodular maximization problems and its application to multi-agent optimal coverage problems , author=. Automatica , volume=. 2022 , publisher=

2022
[48]

IEEE Transactions on Mobile Computing , volume=

Predictive service provisioning with online learning in wireless edge networks , author=. IEEE Transactions on Mobile Computing , volume=. 2023 , publisher=

2023
[49]

Scientific Reports , volume=

A distributed task allocation approach for multi-UAV persistent monitoring in dynamic environments , author=. Scientific Reports , volume=. 2025 , publisher=

2025
[50]

Computing , volume=

Heuristic algorithms for the multiple knapsack problem , author=. Computing , volume=. 1981 , publisher=

1981
[51]

Naval Research Logistics Quarterly , volume=

An algorithm for 0-1 multiple-knapsack problems , author=. Naval Research Logistics Quarterly , volume=. 1978 , publisher=

1978
[52]

SIAM Journal on Optimization , volume=

Solving multiple knapsack problems by cutting planes , author=. SIAM Journal on Optimization , volume=. 1996 , publisher=

1996
[53]

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) , pages=

Learning q-network for active information acquisition , author=. 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) , pages=. 2019 , organization=

2019
[54]

Proceedings of the National Conference on Artificial Intelligence , volume=

Solving the auction-based task allocation problem in an open environment , author=. Proceedings of the National Conference on Artificial Intelligence , volume=
[55]

IEEE transactions on robotics and automation , volume=

Sold!: Auction methods for multirobot coordination , author=. IEEE transactions on robotics and automation , volume=. 2002 , publisher=

2002
[56]

Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems-Volume 1 , pages=

Modeling and learning synergy for team formation with heterogeneous agents , author=. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems-Volume 1 , pages=
[57]

Artificial Intelligence , volume=

Bounded approximate decentralised coordination via the max-sum algorithm , author=. Artificial Intelligence , volume=. 2011 , publisher=

2011
[58]

Journal of the ACM (JACM) , volume=

Bandits with knapsacks , author=. Journal of the ACM (JACM) , volume=. 2018 , publisher=

2018
[59]

Proceedings of the fifteenth ACM conference on Economics and computation , pages=

Bandits with concave rewards and convex knapsacks , author=. Proceedings of the fifteenth ACM conference on Economics and computation , pages=
[60]

SIAM Journal on Computing , volume=

Maximizing a monotone submodular function subject to a matroid constraint , author=. SIAM Journal on Computing , volume=. 2011 , publisher=

2011
[61]

Journal of Machine Learning Research , year=

Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies , author=. Journal of Machine Learning Research , year=
[62]

Automatica , volume=

Online distributed nonconvex optimization with stochastic objective functions: High probability bound analysis of dynamic regrets , author=. Automatica , volume=. 2024 , publisher=

2024
[63]

Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms , pages=

Online submodular welfare maximization: Greedy is optimal , author=. Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms , pages=. 2013 , organization=

2013
[64]

IEEE Robotics and Automation Letters , volume=

Online submodular coordination with bounded tracking regret: Theory, algorithm, and applications to multi-robot coordination , author=. IEEE Robotics and Automation Letters , volume=. 2023 , publisher=

2023
[65]

Science China Technological Sciences , volume=

Computer vision tasks for intelligent aerospace perception: An overview , author=. Science China Technological Sciences , volume=. 2024 , publisher=

2024
[66]

IEEE Transactions on Robotics , volume=

Robust multi-robot active target tracking against sensing and communication attacks , author=. IEEE Transactions on Robotics , volume=. 2023 , publisher=

2023
[67]

IEEE Transactions on Robotics , volume=

Robust multiple-path orienteering problem: Securing against adversarial attacks , author=. IEEE Transactions on Robotics , volume=. 2023 , publisher=

2023
[68]

IEEE Transactions on Automatic Control , volume=

Distributed online resource allocation in open networks , author=. IEEE Transactions on Automatic Control , volume=. 2024 , publisher=

2024
[69]

2024 American Control Conference (ACC) , pages=

Leveraging untrustworthy commands for multi-robot coordination in unpredictable environments: A bandit submodular maximization approach , author=. 2024 American Control Conference (ACC) , pages=. 2024 , organization=

2024
[70]

IEEE Transactions on Robotics , volume=

Distributed attack-robust submodular maximization for multirobot planning , author=. IEEE Transactions on Robotics , volume=. 2022 , publisher=

2022
[71]

IFAC-PapersOnLine , volume=

Combined macroscopic and microscopic multi-agent control for multi-target tracking , author=. IFAC-PapersOnLine , volume=. 2022 , publisher=

2022
[72]

European Symposium on Algorithms , pages=

Greedy in approximation algorithms , author=. European Symposium on Algorithms , pages=. 2006 , organization=

2006
[73]

IEEE Transactions on Automatic Control , volume=

A sum-of-states preservation framework for open multiagent systems with nonlinear heterogeneous coupling , author=. IEEE Transactions on Automatic Control , volume=. 2023 , publisher=

2023
[74]

Distributed Submodular Maximization on Partition Matroids for Planning on Large Sensor Networks , year=

Corah, Micah and Michael, Nathan , booktitle=. Distributed Submodular Maximization on Partition Matroids for Planning on Large Sensor Networks , year=
[75]

Submodular Maximization With Limited Function Access , volume =

Downie, Andrew and Gharesifard, Bahman and Smith, Stephen , year =. Submodular Maximization With Limited Function Access , volume =. IEEE Transactions on Automatic Control , doi =
[76]

International Conference on Artificial Intelligence and Statistics , pages=

Tracking regret bounds for online submodular optimization , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2021 , organization=

2021
[77]

Science Robotics , volume=

RoboBallet: Planning for multirobot reaching with graph neural networks and reinforcement learning , author=. Science Robotics , volume=. 2025 , publisher=

2025
[78]

International Conference on Learning Representations , year=

Submodular Reinforcement Learning , author=. International Conference on Learning Representations , year=
[79]

International Conference on Learning Representations , year=

Near-optimal online learning for multi-agent submodular coordination: Tight approximation and communication efficiency , author=. International Conference on Learning Representations , year=
[80]

Neural Computing and Applications , volume=

Difference rewards policy gradients , author=. Neural Computing and Applications , volume=. 2025 , publisher=

2025

Showing first 80 references.