
arxiv: 2410.17517 · v5 · submitted 2024-10-23 · cs.MA · cs.AI · cs.GT

The Hive Mind is a Single Reinforcement Learning Agent

Pith reviewed 2026-05-23 19:16 UTC · model grok-4.3

classification cs.MA · cs.AI · cs.GT
keywords hive mind · reinforcement learning · collective decision-making · honey bees · waggle dance · multi-armed bandit · imitation · Maynard-Cross Learning

The pith

In the weighted voter model, honey bee swarms using only local imitation learn as a single reinforcement learning agent does.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes an equivalence between collective decision-making via imitation and single-agent trial-and-error learning. In the weighted voter model of honey bee nest-hunting, the group's choices update according to a multi-armed bandit rule the authors call Maynard-Cross Learning, making the hive mind behave as one online RL agent facing many parallel environments. A reader would care because the result unifies two classic explanations for intelligent behavior: groups that copy successes and individuals that explore. It shows that simple, local imitation rules can produce the same learning dynamics as explicit reinforcement without any individual needing to reason about rewards. The authors note this view applies to any imitative collective and supplies a formal tool for analyzing such systems.

Core claim

The emergent distributed cognition arising from individuals following simple, local imitation-based rules in the weighted voter model of bees' waggle dance is that of a single online reinforcement learning agent interacting with many parallel environments; the group's update rule is the multi-armed bandit algorithm the authors term Maynard-Cross Learning.
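As a hedged illustration of why a quality-weighted imitation rule can resemble a bandit-style update, consider the following sketch. It assumes, and this is not stated on this page, that in the large-population limit one round of the weighted voter model replaces the share $\pi_a$ of bees committed to option $a$ by its quality-weighted share. The symbols $q_a > 0$ (mean quality of option $a$) and $v$ (population-average quality) are illustrative names.

```latex
v = \sum_b \pi_b q_b, \qquad
\pi_a' = \frac{\pi_a q_a}{v}
\quad\Longrightarrow\quad
\pi_a' - \pi_a = \pi_a \, \frac{q_a - v}{v}.
```

Under these assumptions the change in each share is a replicator-type step: options with above-average quality gain share and the rest lose it. Whether this coincides term by term with the paper's Maynard-Cross Learning rule cannot be checked from this page alone.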

What carries the argument

The weighted voter model of waggle-dance communication, whose choice-update rule matches Maynard-Cross Learning and thereby equates the collective to a single multi-armed-bandit agent.
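The following minimal sketch compares an agent-level imitation process with a population-level update. It is an illustration under stated assumptions, not the authors' implementation: the synchronous round structure, the fixed deterministic qualities, and the function names `weighted_voter_step`, `replicator_step`, and `run` are all hypothetical, and the replicator-type step is only assumed to be in the spirit of what the paper calls Maynard-Cross Learning.

```python
import random


def weighted_voter_step(opinions, qualities, rng):
    """One synchronous round of a simplified weighted voter model (assumption).

    Each agent advertises its current option with weight equal to that
    option's quality. Every agent then copies the option of one advertiser
    sampled proportionally to those weights. This is pure imitation: no
    memory and no private exploration.
    """
    weights = [qualities[o] for o in opinions]
    return rng.choices(opinions, weights=weights, k=len(opinions))


def shares(opinions, n_options):
    """Fraction of agents currently committed to each option."""
    counts = [0] * n_options
    for o in opinions:
        counts[o] += 1
    return [c / len(opinions) for c in counts]


def replicator_step(pi, qualities):
    """Population-level step: pi_a += pi_a * (q_a - v) / v, with v the mean quality."""
    v = sum(p * q for p, q in zip(pi, qualities))
    return [p + p * (q - v) / v for p, q in zip(pi, qualities)]


def run(n_agents=20000, qualities=(0.3, 0.5, 0.9), rounds=10, seed=0):
    """Run both processes from the same start; return (simulated, predicted) shares per round."""
    rng = random.Random(seed)
    opinions = [rng.randrange(len(qualities)) for _ in range(n_agents)]
    pi = shares(opinions, len(qualities))
    history = []
    for _ in range(rounds):
        opinions = weighted_voter_step(opinions, qualities, rng)
        pi = replicator_step(pi, qualities)
        history.append((shares(opinions, len(qualities)), pi))
    return history


if __name__ == "__main__":
    for sim, pred in run():
        print([round(s, 3) for s in sim], [round(p, 3) for p in pred])
```

Calling `run()` returns, for each round, the simulated shares next to the population-level prediction. With a large population the two should stay close, which is a numerical sanity check of the intuition rather than a proof of the paper's equivalence.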

If this is right

  • A group of purely imitative organisms functions as a more complex reinforcement-enabled entity.
  • Group-level intelligence can explain the evolutionary selection of simple individual behaviors.
  • Imitative economic or social systems can be analyzed as collective learning processes.
  • The framework supplies design principles for scalable artificial collective systems inspired by RL.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same equivalence might appear in other animal groups that rely on local copying rules.
  • Markets or voting systems could be modeled as single RL agents when participants imitate successful choices.
  • Adding memory or private exploration to the bee model would be a direct test of whether the equivalence survives.
  • Multi-agent RL systems could deliberately use imitation to reproduce single-agent learning at scale.

Load-bearing premise

The weighted voter model fully captures the bees' collective decision process without additional mechanisms such as memory or spatial exploration.

What would settle it

Demonstration that real bees employ mechanisms outside the weighted voter model, such as individual memory of sites or non-imitative exploration, during nest selection.

Original abstract

Decision-making is an essential attribute of any intelligent agent or group. Natural systems are known to converge to effective strategies through at least two distinct mechanisms: collective decision-making via imitation of others, and trial-and-error by a single agent. This paper establishes an equivalence between these two paradigms by drawing from the well-studied collective decision-making problem of nest-hunting in swarms of honey bees. We show that the emergent distributed cognition (sometimes referred to as the $\textit{hive mind}$) arising from individuals following simple, local imitation-based rules is that of a single online reinforcement learning (RL) agent interacting with many parallel environments. More specifically, in the purely imitative $\textit{weighted voter}$ model of bees' waggle dance, the update rule through which this macro-agent learns is a multi-armed bandit algorithm that we coin $\textit{Maynard-Cross Learning}$. Our analysis implies that a group of purely imitative organisms can be equivalent to a more complex, reinforcement-enabled entity, substantiating the idea that group-level intelligence may explain how seemingly simple and blind individual behaviors are selected in nature. Beyond biology, the framework offers new tools for analyzing economic and social systems where individuals imitate successful strategies, effectively participating in a collective learning process. Our findings may further inform the design of scalable RL-inspired collective systems in artificial domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that within the weighted voter model of honeybee waggle-dance communication, the emergent collective decision process is mathematically equivalent to a single online reinforcement learning agent interacting with many parallel environments; specifically, the collective update rule is a multi-armed bandit algorithm that the authors name Maynard-Cross Learning. This equivalence is used to argue that group-level intelligence can arise from simple local imitation rules, with implications for natural selection, economic systems, and the design of collective AI.

Significance. If the internal mapping is rigorously derived, the result supplies a clean theoretical bridge between imitation-based collective behavior and reinforcement learning, offering a parameter-free account of how simple individual rules can produce effective group strategies. This could inform multi-agent system design and bio-inspired algorithms, though its explanatory power for natural systems hinges on model completeness.

major comments (2)
  1. [Abstract / model section] The equivalence is asserted for the weighted voter model, yet the broader claim that this 'substantiates the idea that group-level intelligence may explain how seemingly simple and blind individual behaviors are selected in nature' requires the model to be a complete description. No empirical comparison or discussion of omitted mechanisms (spatial exploration, memory, non-imitative rules) is supplied, which puts the natural-selection implication at risk.
  2. [Maynard-Cross Learning definition] Coining a new algorithm name for the update rule obtained directly by aggregating the weighted voter model makes the claimed equivalence appear definitional rather than independently derived. A concrete test is whether the algorithm exhibits any property not already entailed by the model's aggregation step.
minor comments (2)
  1. [Abstract] The equivalence statement would be clearer if a single key equation or proof outline were included.
  2. [Notation / model definition] The distinction between individual and collective update rules should be made explicit with consistent symbols to avoid reader confusion.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive report and the recommendation for major revision. We address each major comment below with specific plans for revision where warranted. The core mathematical equivalence between the weighted voter model and the derived update rule remains intact, but we agree that certain claims require additional qualification and clarification.

Point-by-point responses
  1. Referee: [Abstract / model section] The equivalence is asserted for the weighted voter model, yet the broader claim that this 'substantiates the idea that group-level intelligence may explain how seemingly simple and blind individual behaviors are selected in nature' requires the model to be a complete description. No empirical comparison or discussion of omitted mechanisms (spatial exploration, memory, non-imitative rules) is supplied, which puts the natural-selection implication at risk.

    Authors: We agree that the natural-selection implication in the abstract and introduction is stated too strongly given the model's scope. The equivalence is rigorously derived only for the purely imitative weighted voter model; extending it to explain evolutionary selection requires acknowledging that real bee colonies include additional mechanisms. We will revise the abstract and add a new subsection (likely in the Discussion) that explicitly lists the omitted mechanisms (spatial exploration, individual memory, and non-imitative rules), discusses how they might alter or preserve the equivalence, and qualifies the evolutionary claim as a hypothesis supported by the model rather than a direct substantiation. No empirical comparison is feasible within the current theoretical scope, but the added discussion will reduce the risk of overgeneralization.

    Revision: yes

  2. Referee: [Maynard-Cross Learning definition] Coining a new algorithm name for the update rule obtained directly by aggregating the weighted voter model makes the claimed equivalence appear definitional rather than independently derived. A concrete test is whether the algorithm exhibits any property not already entailed by the model's aggregation step.

    Authors: The name 'Maynard-Cross Learning' is intended to identify the specific multi-armed bandit update rule that emerges from the aggregation, allowing comparison with existing algorithms. However, we accept that the presentation risks making the equivalence appear tautological. In revision we will (1) separate the derivation of the update rule from the naming, (2) add a short analysis showing that the resulting algorithm possesses a non-trivial property (convergence rate under parallel environments that differs from standard UCB or epsilon-greedy when the number of parallel instances grows) not directly implied by the aggregation step alone, and (3) rephrase the surrounding text to emphasize that the equivalence is a derived result rather than a definitional restatement. We will also consider whether a different presentation (e.g., without a new name) better conveys the contribution.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; equivalence derived directly from model rules

Full rationale

The paper derives the equivalence between the weighted voter model's collective update rule and a multi-armed bandit algorithm by direct mathematical mapping within the model's stated assumptions. This is presented as an analytical result rather than a fitted prediction or self-referential definition. No equations or steps in the abstract reduce the claimed RL equivalence to its inputs by construction, and no self-citations, ansatzes, or renamings of external results are invoked as load-bearing. The coining of 'Maynard-Cross Learning' is merely nomenclature for the derived rule. The derivation chain remains self-contained against the model's premises, with the broader applicability to real bees noted as an assumption rather than a circular claim.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Review performed on abstract only; full derivation, assumptions, and any fitted parameters are not visible. The central claim rests on the accuracy of the weighted voter model and the existence of a direct mapping to RL updates.

axioms (1)
  • Domain assumption: the weighted voter model accurately captures the collective nest-hunting decision process in honey bee swarms.
    Invoked as the basis for the equivalence; stated in the abstract as the model from which the RL equivalence follows.
invented entities (1)
  • Maynard-Cross Learning (no independent evidence)
    Purpose: the specific multi-armed bandit update rule that the bee imitation process is claimed to implement.
    Newly coined in the paper; no independent evidence is provided in the abstract.

pith-pipeline@v0.9.0 · 5776 in / 1309 out tokens · 19397 ms · 2026-05-23T19:16:50.475476+00:00 · methodology


