pith. sign in

arxiv: 2605.21723 · v1 · pith:PDOPCE6Inew · submitted 2026-05-20 · 💻 cs.RO · cs.AI· cs.MA· cs.SY· eess.SY

Learning Altruistic Collaboration in Heterogeneous Multi-Team Systems

Pith reviewed 2026-05-22 08:58 UTC · model grok-4.3

classification 💻 cs.RO cs.AIcs.MAcs.SYeess.SY
keywords heterogeneous multi-team systemsaltruistic collaborationHamilton's rulegraph neural networksdynamic robot allocationmulti-robot systemsfirefighting scenarioresource sharing
0
0 comments X

The pith

A graph neural network can approximate altruistic robot allocations based on Hamilton's rule to enable scalable collaboration across heterogeneous multi-team systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to treat robots as transferable resources between teams that have different capabilities, where transfers incur costs and produce capability-dependent benefits. It adapts Hamilton's rule from biology to decide when one team should altruistically give a robot to another, turning the allocation into a combinatorial problem that is NP-hard. To make this practical, the authors train a graph neural network policy centrally on the team interaction graph so that it can predict transfers and re-assignments decentrally at runtime. Simulations and experiments in a firefighting domain indicate that the learned policy stays close to optimal while handling larger numbers of teams and robots than exact solvers allow.

Core claim

Modeling robot transfers as altruistic acts governed by Hamilton's rule produces a combinatorial allocation problem that is NP-hard; a graph neural network policy trained under centralized training and decentralized execution can approximate the resulting allocations, delivering near-optimal performance in multi-team firefighting scenarios while scaling to larger systems.

What carries the argument

The graph neural network policy that operates over the team interaction graph and predicts robot-level transfer decisions together with next robot-to-team assignments, thereby approximating the altruistic allocations derived from Hamilton's rule.

If this is right

  • The allocation problem is NP-hard, so exact methods become intractable beyond small scales.
  • The learned GNN policy achieves near-optimal performance in the firefighting domain.
  • Centralized training with decentralized execution allows the policy to run on larger systems than exact solvers.
  • Validation combines simulation results with physical experiments confirming practical viability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Ecological decision rules such as Hamilton's could be reused as lightweight heuristics in other engineered multi-agent resource-sharing problems.
  • The centralized-training decentralized-execution pattern may reduce communication demands when the same policy is deployed on physical robot teams.
  • Extending the model to include stochastic transfer costs or partial observability would test robustness beyond the current deterministic setting.

Load-bearing premise

That Hamilton's rule from ecology can be directly adapted as a decision mechanism for robot transfers in heterogeneous teams with capability-dependent contributions and transfer costs without introducing significant modeling error.

What would settle it

Solving the exact altruistic allocation on small firefighting instances and measuring the performance gap to the GNN policy; a large gap or failure of the policy to maintain near-optimal results on larger instances would falsify the scalability claim.

Figures

Figures reproduced from arXiv: 2605.21723 by Brooks A. Butler, Magnus Egerstedt, Riwa Karam, Ruoyu Lin.

Figure 1
Figure 1. Figure 1: Example of a heterogeneous multi-team system where teams, depicted [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overview of the proposed GNN-based policy for heterogeneous altruistic collaboration with an example graph input [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Training, validation, and testing performance of the GNN policy. The [PITH_FULL_IMAGE:figures/full_fig_p011_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Training dynamics of the GNN policy, showing the model loss and [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Example executions of the proposed GNN-based heterogeneous altruistic collaboration framework in software simulations. The top row shows a [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Execution sequence of the proposed GNN-based heterogeneous altruistic collaboration framework on a multi-robot testbed. Teams operate in distinct [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗
read the original abstract

This paper studies heterogeneous multi-team collaboration through dynamic robot allocation, where robots are treated as transferable resources. Leveraging Hamilton's rule from ecology as an altruistic decision-making mechanism, we propose a multi-team collaborative resource allocation framework with heterogeneous capabilities, transfer costs, and capability-dependent contributions. The resulting allocation problem is combinatorial and is shown to be NP-hard. To address scalability, we develop a graph neural network policy under centralized training and decentralized execution that approximates the altruistic allocations based on Hamilton's rule. The model operates over the team interaction graph and predicts robot-level transfer decisions and next robot-to-team assignments. The proposed approach is validated in a firefighting scenario through simulations and experiments, demonstrating that the learned policy achieves near-optimal performance while scaling to larger systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies dynamic robot allocation across heterogeneous multi-team systems, treating robots as transferable resources. It adapts Hamilton's rule from evolutionary biology as an altruistic decision criterion incorporating relatedness, benefit, and cost, formulates the resulting allocation task as a combinatorial optimization problem shown to be NP-hard, and proposes a graph neural network policy (centralized training, decentralized execution) that operates on the team interaction graph to predict robot transfers and assignments. The approach is evaluated in a firefighting scenario via simulation and hardware experiments, with the central claim that the learned policy achieves near-optimal performance while scaling to larger team sizes.

Significance. If the adaptation of Hamilton's rule to capability-dependent contributions and explicit transfer costs preserves the intended altruistic optimum without substantial modeling error, the work would provide a principled, scalable method for multi-robot collaboration that bridges ecological theory and robotics. The GNN approximation of the combinatorial allocation and the demonstration of scalability in a concrete domain are positive elements; reproducible validation in both simulation and experiments would further strengthen the contribution.

major comments (2)
  1. [Abstract] Abstract: the claim that the GNN 'approximates the altruistic allocations based on Hamilton's rule' and achieves 'near-optimal performance' is presented without derivation of the mapping from (r, B, C) to capability vectors and transfer costs, without error bars, and without ablation results on the rule-to-reward translation; this mapping is load-bearing for the optimality claim once heterogeneous capabilities and costs are introduced.
  2. [Problem Formulation] Problem statement / formulation section: the NP-hardness of the allocation problem is asserted but no reduction or proof sketch is referenced; because the GNN is motivated as a scalable surrogate precisely for this hardness, the absence of the hardness argument weakens the justification for the learning approach.
minor comments (2)
  1. [Introduction] Notation for capability profiles and transfer-cost matrices should be introduced earlier and used consistently when describing how Hamilton's rule is instantiated.
  2. [Experiments] Figure captions for the firefighting scenario should explicitly state the number of teams, robots, and capability types used in the scaling experiments.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments and the opportunity to improve our manuscript. We address each major comment below and outline the revisions we will make.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that the GNN 'approximates the altruistic allocations based on Hamilton's rule' and achieves 'near-optimal performance' is presented without derivation of the mapping from (r, B, C) to capability vectors and transfer costs, without error bars, and without ablation results on the rule-to-reward translation; this mapping is load-bearing for the optimality claim once heterogeneous capabilities and costs are introduced.

    Authors: We agree that the abstract could better highlight the key elements of our approach. The derivation of the mapping from Hamilton's rule (r, B, C) to capability vectors and transfer costs is provided in detail in Section 3, where we define the benefit B as the capability-dependent performance gain and cost C incorporating explicit transfer costs. To strengthen the presentation, we will revise the abstract to briefly reference this mapping and ensure the results include error bars and additional ablation studies on the rule-to-reward formulation. We believe these changes will clarify the foundation for the near-optimal performance claims. revision: yes

  2. Referee: [Problem Formulation] Problem statement / formulation section: the NP-hardness of the allocation problem is asserted but no reduction or proof sketch is referenced; because the GNN is motivated as a scalable surrogate precisely for this hardness, the absence of the hardness argument weakens the justification for the learning approach.

    Authors: We acknowledge this point. While the combinatorial nature and NP-hardness are discussed in the problem formulation, we did not include an explicit reduction. In the revised version, we will add a proof sketch by reducing from the generalized assignment problem, which is known to be NP-hard, adapted to include the altruistic criterion and transfer costs. This will provide a stronger justification for using the GNN as a scalable surrogate. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation anchored in external ecological rule

full rationale

The paper adapts Hamilton's rule (rB > C) from ecology as the basis for defining altruistic robot allocations in a heterogeneous multi-team setting with capability-dependent contributions and transfer costs. It then formulates the resulting combinatorial allocation problem and trains a GNN policy to approximate those allocations under centralized training. No step reduces by construction to its own inputs: the rule is imported as an independent principle rather than defined in terms of the GNN outputs or fitted parameters, the NP-hardness claim is separate from the learning, and the 'near-optimal' performance is measured against the allocations derived from the external rule. The derivation chain remains self-contained against this external benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; free parameters, axioms, and invented entities cannot be enumerated without the full methods and equations.

pith-pipeline@v0.9.0 · 5669 in / 1073 out tokens · 32440 ms · 2026-05-22T08:58:45.465564+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 3 internal anchors

  1. [1]

    R. R. Murphy,Disaster Robotics. MIT Press, 2017. 15 (a) Initial Allocation (b) First Robot Migrations (c) Second Robot Migrations (d) Final Allocation Fig. 7. Execution sequence of the proposed GNN-based heterogeneous altruistic collaboration framework on a multi-robot testbed. Teams operate in distinct spatial regions with heterogeneous sensing and fire-...

  2. [2]

    Coverage control for mobile sensing networks,

    J. Cort ´es, S. Mart ´ınez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,”IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, 2004

  3. [3]

    Decentralized, adaptive cover- age control for networked robots,

    M. Schwager, D. Rus, and J.-J. Slotine, “Decentralized, adaptive cover- age control for networked robots,”The International Journal of Robotics Research, vol. 28, no. 3, pp. 357–375, 2009

  4. [4]

    A distributed auction algorithm for the assignment problem,

    M. M. Zavlanos, L. Spesivtsev, and G. J. Pappas, “A distributed auction algorithm for the assignment problem,” inProceedings of the IEEE Conference on Decision and Control (CDC). IEEE, 2008, pp. 1212– 1217

  5. [5]

    Multi-robot task allocation games in dynamically changing environments,

    S. Park, Y . D. Zhong, and N. E. Leonard, “Multi-robot task allocation games in dynamically changing environments,” inProceedings of the IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 8678–8684

  6. [6]

    A resilient and energy-aware task allocation framework for heterogeneous multirobot systems,

    G. Notomista, S. Mayya, Y . Emam, C. Kroninger, A. Bohannon, S. Hutchinson, and M. Egerstedt, “A resilient and energy-aware task allocation framework for heterogeneous multirobot systems,”IEEE Transactions on Robotics (T-RO), vol. 38, no. 1, pp. 159–179, 2021

  7. [7]

    Resource allocation with multi-team collaboration based on hamilton’s rule,

    R. Karam, R. Lin, B. A. Butler, and M. Egerstedt, “Resource allocation with multi-team collaboration based on hamilton’s rule,” inProceedings of the IEEE Conference on Decision and Control (CDC). IEEE, 2025, pp. 6891–6898

  8. [8]

    Distributed resource allocation via multi- agent systems under time-varying networks,

    K. Lu, H. Xu, and Y . Zheng, “Distributed resource allocation via multi- agent systems under time-varying networks,”Automatica, vol. 136, p. 110059, 2022

  9. [9]

    A formal analysis and taxonomy of task allocation in multi-robot systems,

    B. P. Gerkey and M. J. Matari ´c, “A formal analysis and taxonomy of task allocation in multi-robot systems,”The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, 2004

  10. [10]

    A comprehensive taxonomy for multi-robot task allocation,

    G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,”The International Journal of Robotics Research, vol. 32, no. 12, pp. 1495–1512, 2013

  11. [11]

    Decentralized control of cooperative and autonomous agents for solving the distributed resource allocation problem,

    P. Anussornnitisarn, S. Y . Nof, and O. Etzion, “Decentralized control of cooperative and autonomous agents for solving the distributed resource allocation problem,”International Journal of Production Economics, vol. 98, no. 2, pp. 114–128, 2005

  12. [12]

    Consensus-based decentralized auctions for robust task allocation,

    H.-L. Choi, L. Brunet, and J. P. How, “Consensus-based decentralized auctions for robust task allocation,”IEEE Transactions on Robotics (T- RO), vol. 25, no. 4, pp. 912–926, 2009

  13. [13]

    Collaboration in multi-robot systems: Taxonomy and survey over frameworks for collaboration,

    R. Karam, A. A. Nguyen, R. Lin, D. R. Martin, D. Morales, B. A. Butler, and M. Egerstedt, “Collaboration in multi-robot systems: Taxonomy and survey over frameworks for collaboration,” 2026. [Online]. Available: https://arxiv.org/abs/2603.23898

  14. [14]

    Monotonic value function factorisation for deep multi- agent reinforcement learning,

    T. Rashid, M. Samvelyan, C. S. De Witt, G. Farquhar, J. Foerster, and S. Whiteson, “Monotonic value function factorisation for deep multi- agent reinforcement learning,”Journal of Machine Learning Research, vol. 21, no. 178, pp. 1–51, 2020

  15. [15]

    Learning combinatorial optimization algorithms over graphs,

    E. Khalil, H. Dai, Y . Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,”Advances in Neural Information Processing Systems (NeurIps), vol. 30, 2017

  16. [16]

    Machine learning for combinato- rial optimization: A methodological tour d’horizon,

    Y . Bengio, A. Lodi, and A. Prouvost, “Machine learning for combinato- rial optimization: A methodological tour d’horizon,”European Journal of Operational Research, vol. 290, no. 2, pp. 405–421, 2021

  17. [17]

    The graph neural network model,

    F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,”IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, 2009

  18. [18]

    Relational inductive biases, deep learning, and graph networks

    P. W. Battaglia, J. B. Hamrick, V . Bapst, A. Sanchez-Gonzalez, V . Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, A. Vaswani, K. Allen, C. Nash, V . Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. Botvinick, O. Vinyals, Y . Li, and R. Pascanu, 16 “Relational inductiv...

  19. [19]

    Graph neural networks for decentralized multi-robot path planning,

    Q. Li, F. Gama, A. Ribeiro, and A. Prorok, “Graph neural networks for decentralized multi-robot path planning,” inIEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 11 785–11 792

  20. [20]

    Learning decentralized controllers for robot swarms with graph neu- ral networks,

    E. Tolstaya, F. Gama, J. Paulos, G. Pappas, V . Kumar, and A. Ribeiro, “Learning decentralized controllers for robot swarms with graph neu- ral networks,” inProceedings of the Conference on Robot Learning. PMLR, 2020, pp. 671–682

  21. [21]

    Attention, Learn to Solve Routing Problems!

    W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” 2019. [Online]. Available: https://arxiv.org/abs/1803.08475

  22. [22]

    Learning to dispatch for job shop scheduling via deep reinforcement learning,

    C. Zhang, W. Song, Z. Cao, J. Zhang, P. S. Tan, and X. Chi, “Learning to dispatch for job shop scheduling via deep reinforcement learning,” inAdvances in Neural Information Processing Systems (NeurIps), H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 1621–1632

  23. [23]

    Graph networks as learnable physics engines for inference and control,

    A. Sanchez-Gonzalez, N. Heess, J. T. Springenberg, J. Merel, M. Ried- miller, R. Hadsell, and P. Battaglia, “Graph networks as learnable physics engines for inference and control,” inInternational Conference on Machine Learning (ICML). PMLR, 2018, pp. 4470–4479

  24. [24]

    Bonabeau, M

    E. Bonabeau, M. Dorigo, and G. Theraulaz,Swarm intelligence: from natural to artificial systems. Oxford university press, 1999, no. 1

  25. [25]

    Evolutionary explanations for cooperation,

    S. A. West, A. S. Griffin, and A. Gardner, “Evolutionary explanations for cooperation,”Current biology, vol. 17, no. 16, pp. R661–R672, 2007

  26. [26]

    The genetical evolution of social behaviour,

    W. D. Hamilton, “The genetical evolution of social behaviour,”Journal of Theoretical Biology, vol. 7, no. 1, pp. 1–16, 1964

  27. [27]

    Hamilton’s rule for enabling altruism in multi-agent systems,

    B. A. Butler and M. Egerstedt, “Hamilton’s rule for enabling altruism in multi-agent systems,” inIEEE Conference on Decision and Control (CDC), 2025, pp. 6776–6783

  28. [28]

    M. R. Garey and D. S. Johnson,Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979

  29. [29]

    Reducibility among combinatorial problems,

    R. M. Karp, “Reducibility among combinatorial problems,” in50 Years of Integer Programming 1958-2008: from the Early Years to the State- of-the-Art. Springer, 2009, pp. 219–241

  30. [30]

    Submodular function maximization,

    A. Krause and D. Golovin, “Submodular function maximization,” Tractability, vol. 3, no. 71-104, p. 3, 2014

  31. [31]

    Decoupled Weight Decay Regularization

    I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv preprint arXiv:1711.05101, 2017

  32. [32]

    A systematic analysis of performance measures for classification tasks,

    M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,”Information Processing & Manage- ment, vol. 45, no. 4, pp. 427–437, 2009

  33. [33]

    The relationship between precision-recall and roc curves,

    J. Davis and M. Goadrich, “The relationship between precision-recall and roc curves,” inProceedings of the International Conference on Machine Learning (ICML), 2006, pp. 233–240