Learning Altruistic Collaboration in Heterogeneous Multi-Team Systems

Brooks A. Butler; Magnus Egerstedt; Riwa Karam; Ruoyu Lin

arxiv: 2605.21723 · v1 · pith:PDOPCE6Inew · submitted 2026-05-20 · 💻 cs.RO · cs.AI· cs.MA· cs.SY· eess.SY

Learning Altruistic Collaboration in Heterogeneous Multi-Team Systems

Riwa Karam , Ruoyu Lin , Brooks A. Butler , Magnus Egerstedt This is my paper

Pith reviewed 2026-05-22 08:58 UTC · model grok-4.3

classification 💻 cs.RO cs.AIcs.MAcs.SYeess.SY

keywords heterogeneous multi-team systemsaltruistic collaborationHamilton's rulegraph neural networksdynamic robot allocationmulti-robot systemsfirefighting scenarioresource sharing

0 comments

The pith

A graph neural network can approximate altruistic robot allocations based on Hamilton's rule to enable scalable collaboration across heterogeneous multi-team systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to treat robots as transferable resources between teams that have different capabilities, where transfers incur costs and produce capability-dependent benefits. It adapts Hamilton's rule from biology to decide when one team should altruistically give a robot to another, turning the allocation into a combinatorial problem that is NP-hard. To make this practical, the authors train a graph neural network policy centrally on the team interaction graph so that it can predict transfers and re-assignments decentrally at runtime. Simulations and experiments in a firefighting domain indicate that the learned policy stays close to optimal while handling larger numbers of teams and robots than exact solvers allow.

Core claim

Modeling robot transfers as altruistic acts governed by Hamilton's rule produces a combinatorial allocation problem that is NP-hard; a graph neural network policy trained under centralized training and decentralized execution can approximate the resulting allocations, delivering near-optimal performance in multi-team firefighting scenarios while scaling to larger systems.

What carries the argument

The graph neural network policy that operates over the team interaction graph and predicts robot-level transfer decisions together with next robot-to-team assignments, thereby approximating the altruistic allocations derived from Hamilton's rule.

If this is right

The allocation problem is NP-hard, so exact methods become intractable beyond small scales.
The learned GNN policy achieves near-optimal performance in the firefighting domain.
Centralized training with decentralized execution allows the policy to run on larger systems than exact solvers.
Validation combines simulation results with physical experiments confirming practical viability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Ecological decision rules such as Hamilton's could be reused as lightweight heuristics in other engineered multi-agent resource-sharing problems.
The centralized-training decentralized-execution pattern may reduce communication demands when the same policy is deployed on physical robot teams.
Extending the model to include stochastic transfer costs or partial observability would test robustness beyond the current deterministic setting.

Load-bearing premise

That Hamilton's rule from ecology can be directly adapted as a decision mechanism for robot transfers in heterogeneous teams with capability-dependent contributions and transfer costs without introducing significant modeling error.

What would settle it

Solving the exact altruistic allocation on small firefighting instances and measuring the performance gap to the GNN policy; a large gap or failure of the policy to maintain near-optimal results on larger instances would falsify the scalability claim.

Figures

Figures reproduced from arXiv: 2605.21723 by Brooks A. Butler, Magnus Egerstedt, Riwa Karam, Ruoyu Lin.

**Figure 2.** Figure 2: Overview of the proposed GNN-based policy for heterogeneous altruistic collaboration with an example graph input [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗

**Figure 4.** Figure 4: Training, validation, and testing performance of the GNN policy. The [PITH_FULL_IMAGE:figures/full_fig_p011_4.png] view at source ↗

**Figure 5.** Figure 5: Training dynamics of the GNN policy, showing the model loss and [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗

**Figure 6.** Figure 6: Example executions of the proposed GNN-based heterogeneous altruistic collaboration framework in software simulations. The top row shows a [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗

**Figure 7.** Figure 7: Execution sequence of the proposed GNN-based heterogeneous altruistic collaboration framework on a multi-robot testbed. Teams operate in distinct [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗

read the original abstract

This paper studies heterogeneous multi-team collaboration through dynamic robot allocation, where robots are treated as transferable resources. Leveraging Hamilton's rule from ecology as an altruistic decision-making mechanism, we propose a multi-team collaborative resource allocation framework with heterogeneous capabilities, transfer costs, and capability-dependent contributions. The resulting allocation problem is combinatorial and is shown to be NP-hard. To address scalability, we develop a graph neural network policy under centralized training and decentralized execution that approximates the altruistic allocations based on Hamilton's rule. The model operates over the team interaction graph and predicts robot-level transfer decisions and next robot-to-team assignments. The proposed approach is validated in a firefighting scenario through simulations and experiments, demonstrating that the learned policy achieves near-optimal performance while scaling to larger systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Hamilton's rule adapted to guide robot transfers in heterogeneous teams is a new framing, but the mapping to capabilities and costs needs more scrutiny to support the near-optimal claim.

read the letter

The paper takes Hamilton's rule from evolutionary biology and turns it into a decision rule for when one robot team should hand a robot to another, then trains a GNN to approximate those choices at scale in a multi-team setting with different capabilities and transfer costs. That combination feels fresh compared to standard multi-agent allocation work. They also correctly flag the combinatorial problem as NP-hard, which justifies moving to a learned policy under centralized training and decentralized execution. The firefighting scenario is a reasonable test case for showing that the policy scales while staying close to the rule-based allocations. Those pieces are solid and give the work a practical angle for resource sharing across teams. The soft spot sits in the translation step. Hamilton's rule uses relatedness r, benefit B, and cost C, but here B depends on the receiving team's current capability profile and C includes explicit transfer expenses. If those quantities have to be recomputed from vectors and matrices at every step, the inequality rB > C can pick up extra approximation error before the GNN sees it. The abstract says the learned policy reaches near-optimal performance, yet without visible error bars, ablations on the parameter mapping, or direct comparisons to non-altruistic baselines, it is hard to tell how much of the performance comes from the rule itself versus the GNN's flexibility. This is the kind of paper that would interest people working on multi-robot coordination and bio-inspired methods for allocation. A reader who needs ideas for scalable policies on team interaction graphs could pick up the GNN architecture and the CTDE training setup. It is coherent enough on its own terms to deserve a serious referee, mainly to push for clearer details on the rule adaptation and stronger validation. I would send it out for review with those requests rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The paper studies dynamic robot allocation across heterogeneous multi-team systems, treating robots as transferable resources. It adapts Hamilton's rule from evolutionary biology as an altruistic decision criterion incorporating relatedness, benefit, and cost, formulates the resulting allocation task as a combinatorial optimization problem shown to be NP-hard, and proposes a graph neural network policy (centralized training, decentralized execution) that operates on the team interaction graph to predict robot transfers and assignments. The approach is evaluated in a firefighting scenario via simulation and hardware experiments, with the central claim that the learned policy achieves near-optimal performance while scaling to larger team sizes.

Significance. If the adaptation of Hamilton's rule to capability-dependent contributions and explicit transfer costs preserves the intended altruistic optimum without substantial modeling error, the work would provide a principled, scalable method for multi-robot collaboration that bridges ecological theory and robotics. The GNN approximation of the combinatorial allocation and the demonstration of scalability in a concrete domain are positive elements; reproducible validation in both simulation and experiments would further strengthen the contribution.

major comments (2)

[Abstract] Abstract: the claim that the GNN 'approximates the altruistic allocations based on Hamilton's rule' and achieves 'near-optimal performance' is presented without derivation of the mapping from (r, B, C) to capability vectors and transfer costs, without error bars, and without ablation results on the rule-to-reward translation; this mapping is load-bearing for the optimality claim once heterogeneous capabilities and costs are introduced.
[Problem Formulation] Problem statement / formulation section: the NP-hardness of the allocation problem is asserted but no reduction or proof sketch is referenced; because the GNN is motivated as a scalable surrogate precisely for this hardness, the absence of the hardness argument weakens the justification for the learning approach.

minor comments (2)

[Introduction] Notation for capability profiles and transfer-cost matrices should be introduced earlier and used consistently when describing how Hamilton's rule is instantiated.
[Experiments] Figure captions for the firefighting scenario should explicitly state the number of teams, robots, and capability types used in the scaling experiments.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments and the opportunity to improve our manuscript. We address each major comment below and outline the revisions we will make.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that the GNN 'approximates the altruistic allocations based on Hamilton's rule' and achieves 'near-optimal performance' is presented without derivation of the mapping from (r, B, C) to capability vectors and transfer costs, without error bars, and without ablation results on the rule-to-reward translation; this mapping is load-bearing for the optimality claim once heterogeneous capabilities and costs are introduced.

Authors: We agree that the abstract could better highlight the key elements of our approach. The derivation of the mapping from Hamilton's rule (r, B, C) to capability vectors and transfer costs is provided in detail in Section 3, where we define the benefit B as the capability-dependent performance gain and cost C incorporating explicit transfer costs. To strengthen the presentation, we will revise the abstract to briefly reference this mapping and ensure the results include error bars and additional ablation studies on the rule-to-reward formulation. We believe these changes will clarify the foundation for the near-optimal performance claims. revision: yes
Referee: [Problem Formulation] Problem statement / formulation section: the NP-hardness of the allocation problem is asserted but no reduction or proof sketch is referenced; because the GNN is motivated as a scalable surrogate precisely for this hardness, the absence of the hardness argument weakens the justification for the learning approach.

Authors: We acknowledge this point. While the combinatorial nature and NP-hardness are discussed in the problem formulation, we did not include an explicit reduction. In the revised version, we will add a proof sketch by reducing from the generalized assignment problem, which is known to be NP-hard, adapted to include the altruistic criterion and transfer costs. This will provide a stronger justification for using the GNN as a scalable surrogate. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation anchored in external ecological rule

full rationale

The paper adapts Hamilton's rule (rB > C) from ecology as the basis for defining altruistic robot allocations in a heterogeneous multi-team setting with capability-dependent contributions and transfer costs. It then formulates the resulting combinatorial allocation problem and trains a GNN policy to approximate those allocations under centralized training. No step reduces by construction to its own inputs: the rule is imported as an independent principle rather than defined in terms of the GNN outputs or fitted parameters, the NP-hardness claim is separate from the learning, and the 'near-optimal' performance is measured against the allocations derived from the external rule. The derivation chain remains self-contained against this external benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; free parameters, axioms, and invented entities cannot be enumerated without the full methods and equations.

pith-pipeline@v0.9.0 · 5669 in / 1073 out tokens · 32440 ms · 2026-05-22T08:58:45.465564+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Leveraging Hamilton’s rule from ecology as an altruistic decision making mechanism... Hamilton-admissible indicator h^{k+1}_{r,i→j} := I{ r_{ij} B_{r,j}(S^k_j) > C_{r,i}(S^k_i) }
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The resulting allocation problem is combinatorial and is shown to be NP-hard... GNN policy under centralized training and decentralized execution

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 3 internal anchors

[1]

R. R. Murphy,Disaster Robotics. MIT Press, 2017. 15 (a) Initial Allocation (b) First Robot Migrations (c) Second Robot Migrations (d) Final Allocation Fig. 7. Execution sequence of the proposed GNN-based heterogeneous altruistic collaboration framework on a multi-robot testbed. Teams operate in distinct spatial regions with heterogeneous sensing and fire-...

work page 2017
[2]

Coverage control for mobile sensing networks,

J. Cort ´es, S. Mart ´ınez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,”IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, 2004

work page 2004
[3]

Decentralized, adaptive cover- age control for networked robots,

M. Schwager, D. Rus, and J.-J. Slotine, “Decentralized, adaptive cover- age control for networked robots,”The International Journal of Robotics Research, vol. 28, no. 3, pp. 357–375, 2009

work page 2009
[4]

A distributed auction algorithm for the assignment problem,

M. M. Zavlanos, L. Spesivtsev, and G. J. Pappas, “A distributed auction algorithm for the assignment problem,” inProceedings of the IEEE Conference on Decision and Control (CDC). IEEE, 2008, pp. 1212– 1217

work page 2008
[5]

Multi-robot task allocation games in dynamically changing environments,

S. Park, Y . D. Zhong, and N. E. Leonard, “Multi-robot task allocation games in dynamically changing environments,” inProceedings of the IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 8678–8684

work page 2021
[6]

A resilient and energy-aware task allocation framework for heterogeneous multirobot systems,

G. Notomista, S. Mayya, Y . Emam, C. Kroninger, A. Bohannon, S. Hutchinson, and M. Egerstedt, “A resilient and energy-aware task allocation framework for heterogeneous multirobot systems,”IEEE Transactions on Robotics (T-RO), vol. 38, no. 1, pp. 159–179, 2021

work page 2021
[7]

Resource allocation with multi-team collaboration based on hamilton’s rule,

R. Karam, R. Lin, B. A. Butler, and M. Egerstedt, “Resource allocation with multi-team collaboration based on hamilton’s rule,” inProceedings of the IEEE Conference on Decision and Control (CDC). IEEE, 2025, pp. 6891–6898

work page 2025
[8]

Distributed resource allocation via multi- agent systems under time-varying networks,

K. Lu, H. Xu, and Y . Zheng, “Distributed resource allocation via multi- agent systems under time-varying networks,”Automatica, vol. 136, p. 110059, 2022

work page 2022
[9]

A formal analysis and taxonomy of task allocation in multi-robot systems,

B. P. Gerkey and M. J. Matari ´c, “A formal analysis and taxonomy of task allocation in multi-robot systems,”The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, 2004

work page 2004
[10]

A comprehensive taxonomy for multi-robot task allocation,

G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,”The International Journal of Robotics Research, vol. 32, no. 12, pp. 1495–1512, 2013

work page 2013
[11]

Decentralized control of cooperative and autonomous agents for solving the distributed resource allocation problem,

P. Anussornnitisarn, S. Y . Nof, and O. Etzion, “Decentralized control of cooperative and autonomous agents for solving the distributed resource allocation problem,”International Journal of Production Economics, vol. 98, no. 2, pp. 114–128, 2005

work page 2005
[12]

Consensus-based decentralized auctions for robust task allocation,

H.-L. Choi, L. Brunet, and J. P. How, “Consensus-based decentralized auctions for robust task allocation,”IEEE Transactions on Robotics (T- RO), vol. 25, no. 4, pp. 912–926, 2009

work page 2009
[13]

Collaboration in multi-robot systems: Taxonomy and survey over frameworks for collaboration,

R. Karam, A. A. Nguyen, R. Lin, D. R. Martin, D. Morales, B. A. Butler, and M. Egerstedt, “Collaboration in multi-robot systems: Taxonomy and survey over frameworks for collaboration,” 2026. [Online]. Available: https://arxiv.org/abs/2603.23898

work page arXiv 2026
[14]

Monotonic value function factorisation for deep multi- agent reinforcement learning,

T. Rashid, M. Samvelyan, C. S. De Witt, G. Farquhar, J. Foerster, and S. Whiteson, “Monotonic value function factorisation for deep multi- agent reinforcement learning,”Journal of Machine Learning Research, vol. 21, no. 178, pp. 1–51, 2020

work page 2020
[15]

Learning combinatorial optimization algorithms over graphs,

E. Khalil, H. Dai, Y . Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,”Advances in Neural Information Processing Systems (NeurIps), vol. 30, 2017

work page 2017
[16]

Machine learning for combinato- rial optimization: A methodological tour d’horizon,

Y . Bengio, A. Lodi, and A. Prouvost, “Machine learning for combinato- rial optimization: A methodological tour d’horizon,”European Journal of Operational Research, vol. 290, no. 2, pp. 405–421, 2021

work page 2021
[17]

The graph neural network model,

F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,”IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, 2009

work page 2009
[18]

Relational inductive biases, deep learning, and graph networks

P. W. Battaglia, J. B. Hamrick, V . Bapst, A. Sanchez-Gonzalez, V . Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, A. Vaswani, K. Allen, C. Nash, V . Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. Botvinick, O. Vinyals, Y . Li, and R. Pascanu, 16 “Relational inductiv...

work page internal anchor Pith review Pith/arXiv arXiv 2018
[19]

Graph neural networks for decentralized multi-robot path planning,

Q. Li, F. Gama, A. Ribeiro, and A. Prorok, “Graph neural networks for decentralized multi-robot path planning,” inIEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 11 785–11 792

work page 2020
[20]

Learning decentralized controllers for robot swarms with graph neu- ral networks,

E. Tolstaya, F. Gama, J. Paulos, G. Pappas, V . Kumar, and A. Ribeiro, “Learning decentralized controllers for robot swarms with graph neu- ral networks,” inProceedings of the Conference on Robot Learning. PMLR, 2020, pp. 671–682

work page 2020
[21]

Attention, Learn to Solve Routing Problems!

W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” 2019. [Online]. Available: https://arxiv.org/abs/1803.08475

work page internal anchor Pith review Pith/arXiv arXiv 2019
[22]

Learning to dispatch for job shop scheduling via deep reinforcement learning,

C. Zhang, W. Song, Z. Cao, J. Zhang, P. S. Tan, and X. Chi, “Learning to dispatch for job shop scheduling via deep reinforcement learning,” inAdvances in Neural Information Processing Systems (NeurIps), H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 1621–1632

work page 2020
[23]

Graph networks as learnable physics engines for inference and control,

A. Sanchez-Gonzalez, N. Heess, J. T. Springenberg, J. Merel, M. Ried- miller, R. Hadsell, and P. Battaglia, “Graph networks as learnable physics engines for inference and control,” inInternational Conference on Machine Learning (ICML). PMLR, 2018, pp. 4470–4479

work page 2018
[24]

Bonabeau, M

E. Bonabeau, M. Dorigo, and G. Theraulaz,Swarm intelligence: from natural to artificial systems. Oxford university press, 1999, no. 1

work page 1999
[25]

Evolutionary explanations for cooperation,

S. A. West, A. S. Griffin, and A. Gardner, “Evolutionary explanations for cooperation,”Current biology, vol. 17, no. 16, pp. R661–R672, 2007

work page 2007
[26]

The genetical evolution of social behaviour,

W. D. Hamilton, “The genetical evolution of social behaviour,”Journal of Theoretical Biology, vol. 7, no. 1, pp. 1–16, 1964

work page 1964
[27]

Hamilton’s rule for enabling altruism in multi-agent systems,

B. A. Butler and M. Egerstedt, “Hamilton’s rule for enabling altruism in multi-agent systems,” inIEEE Conference on Decision and Control (CDC), 2025, pp. 6776–6783

work page 2025
[28]

M. R. Garey and D. S. Johnson,Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979

work page 1979
[29]

Reducibility among combinatorial problems,

R. M. Karp, “Reducibility among combinatorial problems,” in50 Years of Integer Programming 1958-2008: from the Early Years to the State- of-the-Art. Springer, 2009, pp. 219–241

work page 1958
[30]

Submodular function maximization,

A. Krause and D. Golovin, “Submodular function maximization,” Tractability, vol. 3, no. 71-104, p. 3, 2014

work page 2014
[31]

Decoupled Weight Decay Regularization

I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv preprint arXiv:1711.05101, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[32]

A systematic analysis of performance measures for classification tasks,

M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,”Information Processing & Manage- ment, vol. 45, no. 4, pp. 427–437, 2009

work page 2009
[33]

The relationship between precision-recall and roc curves,

J. Davis and M. Goadrich, “The relationship between precision-recall and roc curves,” inProceedings of the International Conference on Machine Learning (ICML), 2006, pp. 233–240

work page 2006

[1] [1]

R. R. Murphy,Disaster Robotics. MIT Press, 2017. 15 (a) Initial Allocation (b) First Robot Migrations (c) Second Robot Migrations (d) Final Allocation Fig. 7. Execution sequence of the proposed GNN-based heterogeneous altruistic collaboration framework on a multi-robot testbed. Teams operate in distinct spatial regions with heterogeneous sensing and fire-...

work page 2017

[2] [2]

Coverage control for mobile sensing networks,

J. Cort ´es, S. Mart ´ınez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,”IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, 2004

work page 2004

[3] [3]

Decentralized, adaptive cover- age control for networked robots,

M. Schwager, D. Rus, and J.-J. Slotine, “Decentralized, adaptive cover- age control for networked robots,”The International Journal of Robotics Research, vol. 28, no. 3, pp. 357–375, 2009

work page 2009

[4] [4]

A distributed auction algorithm for the assignment problem,

M. M. Zavlanos, L. Spesivtsev, and G. J. Pappas, “A distributed auction algorithm for the assignment problem,” inProceedings of the IEEE Conference on Decision and Control (CDC). IEEE, 2008, pp. 1212– 1217

work page 2008

[5] [5]

Multi-robot task allocation games in dynamically changing environments,

S. Park, Y . D. Zhong, and N. E. Leonard, “Multi-robot task allocation games in dynamically changing environments,” inProceedings of the IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 8678–8684

work page 2021

[6] [6]

A resilient and energy-aware task allocation framework for heterogeneous multirobot systems,

G. Notomista, S. Mayya, Y . Emam, C. Kroninger, A. Bohannon, S. Hutchinson, and M. Egerstedt, “A resilient and energy-aware task allocation framework for heterogeneous multirobot systems,”IEEE Transactions on Robotics (T-RO), vol. 38, no. 1, pp. 159–179, 2021

work page 2021

[7] [7]

Resource allocation with multi-team collaboration based on hamilton’s rule,

R. Karam, R. Lin, B. A. Butler, and M. Egerstedt, “Resource allocation with multi-team collaboration based on hamilton’s rule,” inProceedings of the IEEE Conference on Decision and Control (CDC). IEEE, 2025, pp. 6891–6898

work page 2025

[8] [8]

Distributed resource allocation via multi- agent systems under time-varying networks,

K. Lu, H. Xu, and Y . Zheng, “Distributed resource allocation via multi- agent systems under time-varying networks,”Automatica, vol. 136, p. 110059, 2022

work page 2022

[9] [9]

A formal analysis and taxonomy of task allocation in multi-robot systems,

B. P. Gerkey and M. J. Matari ´c, “A formal analysis and taxonomy of task allocation in multi-robot systems,”The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, 2004

work page 2004

[10] [10]

A comprehensive taxonomy for multi-robot task allocation,

G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,”The International Journal of Robotics Research, vol. 32, no. 12, pp. 1495–1512, 2013

work page 2013

[11] [11]

Decentralized control of cooperative and autonomous agents for solving the distributed resource allocation problem,

P. Anussornnitisarn, S. Y . Nof, and O. Etzion, “Decentralized control of cooperative and autonomous agents for solving the distributed resource allocation problem,”International Journal of Production Economics, vol. 98, no. 2, pp. 114–128, 2005

work page 2005

[12] [12]

Consensus-based decentralized auctions for robust task allocation,

H.-L. Choi, L. Brunet, and J. P. How, “Consensus-based decentralized auctions for robust task allocation,”IEEE Transactions on Robotics (T- RO), vol. 25, no. 4, pp. 912–926, 2009

work page 2009

[13] [13]

Collaboration in multi-robot systems: Taxonomy and survey over frameworks for collaboration,

R. Karam, A. A. Nguyen, R. Lin, D. R. Martin, D. Morales, B. A. Butler, and M. Egerstedt, “Collaboration in multi-robot systems: Taxonomy and survey over frameworks for collaboration,” 2026. [Online]. Available: https://arxiv.org/abs/2603.23898

work page arXiv 2026

[14] [14]

Monotonic value function factorisation for deep multi- agent reinforcement learning,

T. Rashid, M. Samvelyan, C. S. De Witt, G. Farquhar, J. Foerster, and S. Whiteson, “Monotonic value function factorisation for deep multi- agent reinforcement learning,”Journal of Machine Learning Research, vol. 21, no. 178, pp. 1–51, 2020

work page 2020

[15] [15]

Learning combinatorial optimization algorithms over graphs,

E. Khalil, H. Dai, Y . Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,”Advances in Neural Information Processing Systems (NeurIps), vol. 30, 2017

work page 2017

[16] [16]

Machine learning for combinato- rial optimization: A methodological tour d’horizon,

Y . Bengio, A. Lodi, and A. Prouvost, “Machine learning for combinato- rial optimization: A methodological tour d’horizon,”European Journal of Operational Research, vol. 290, no. 2, pp. 405–421, 2021

work page 2021

[17] [17]

The graph neural network model,

F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,”IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, 2009

work page 2009

[18] [18]

Relational inductive biases, deep learning, and graph networks

P. W. Battaglia, J. B. Hamrick, V . Bapst, A. Sanchez-Gonzalez, V . Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, A. Vaswani, K. Allen, C. Nash, V . Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. Botvinick, O. Vinyals, Y . Li, and R. Pascanu, 16 “Relational inductiv...

work page internal anchor Pith review Pith/arXiv arXiv 2018

[19] [19]

Graph neural networks for decentralized multi-robot path planning,

Q. Li, F. Gama, A. Ribeiro, and A. Prorok, “Graph neural networks for decentralized multi-robot path planning,” inIEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 11 785–11 792

work page 2020

[20] [20]

Learning decentralized controllers for robot swarms with graph neu- ral networks,

E. Tolstaya, F. Gama, J. Paulos, G. Pappas, V . Kumar, and A. Ribeiro, “Learning decentralized controllers for robot swarms with graph neu- ral networks,” inProceedings of the Conference on Robot Learning. PMLR, 2020, pp. 671–682

work page 2020

[21] [21]

Attention, Learn to Solve Routing Problems!

W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” 2019. [Online]. Available: https://arxiv.org/abs/1803.08475

work page internal anchor Pith review Pith/arXiv arXiv 2019

[22] [22]

Learning to dispatch for job shop scheduling via deep reinforcement learning,

C. Zhang, W. Song, Z. Cao, J. Zhang, P. S. Tan, and X. Chi, “Learning to dispatch for job shop scheduling via deep reinforcement learning,” inAdvances in Neural Information Processing Systems (NeurIps), H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 1621–1632

work page 2020

[23] [23]

Graph networks as learnable physics engines for inference and control,

A. Sanchez-Gonzalez, N. Heess, J. T. Springenberg, J. Merel, M. Ried- miller, R. Hadsell, and P. Battaglia, “Graph networks as learnable physics engines for inference and control,” inInternational Conference on Machine Learning (ICML). PMLR, 2018, pp. 4470–4479

work page 2018

[24] [24]

Bonabeau, M

E. Bonabeau, M. Dorigo, and G. Theraulaz,Swarm intelligence: from natural to artificial systems. Oxford university press, 1999, no. 1

work page 1999

[25] [25]

Evolutionary explanations for cooperation,

S. A. West, A. S. Griffin, and A. Gardner, “Evolutionary explanations for cooperation,”Current biology, vol. 17, no. 16, pp. R661–R672, 2007

work page 2007

[26] [26]

The genetical evolution of social behaviour,

W. D. Hamilton, “The genetical evolution of social behaviour,”Journal of Theoretical Biology, vol. 7, no. 1, pp. 1–16, 1964

work page 1964

[27] [27]

Hamilton’s rule for enabling altruism in multi-agent systems,

B. A. Butler and M. Egerstedt, “Hamilton’s rule for enabling altruism in multi-agent systems,” inIEEE Conference on Decision and Control (CDC), 2025, pp. 6776–6783

work page 2025

[28] [28]

M. R. Garey and D. S. Johnson,Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979

work page 1979

[29] [29]

Reducibility among combinatorial problems,

R. M. Karp, “Reducibility among combinatorial problems,” in50 Years of Integer Programming 1958-2008: from the Early Years to the State- of-the-Art. Springer, 2009, pp. 219–241

work page 1958

[30] [30]

Submodular function maximization,

A. Krause and D. Golovin, “Submodular function maximization,” Tractability, vol. 3, no. 71-104, p. 3, 2014

work page 2014

[31] [31]

Decoupled Weight Decay Regularization

I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv preprint arXiv:1711.05101, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[32] [32]

A systematic analysis of performance measures for classification tasks,

M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,”Information Processing & Manage- ment, vol. 45, no. 4, pp. 427–437, 2009

work page 2009

[33] [33]

The relationship between precision-recall and roc curves,

J. Davis and M. Goadrich, “The relationship between precision-recall and roc curves,” inProceedings of the International Conference on Machine Learning (ICML), 2006, pp. 233–240

work page 2006