The Hive Mind is a Single Reinforcement Learning Agent
Pith reviewed 2026-05-23 19:16 UTC · model grok-4.3
The pith
Honey bee swarms using only local imitation learn exactly as one reinforcement learning agent does.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the weighted voter model of the bees' waggle dance, individuals follow simple, local imitation-based rules, and the distributed cognition that emerges is that of a single online reinforcement learning agent interacting with many parallel environments. The group's update rule is the multi-armed bandit algorithm the authors term Maynard-Cross Learning.
What carries the argument
The weighted voter model of waggle-dance communication, whose choice-update rule matches Maynard-Cross Learning and thereby equates the collective to a single multi-armed-bandit agent.
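To make the claimed mapping concrete, here is a minimal Python sketch, not the authors' code. It assumes a simplified mean-field reading of the weighted voter model: each bee copies the site of a dancer sampled with probability proportional to the quality of that dancer's site. Under that assumption the expected share of site a becomes share_a * q_a divided by the population-average quality, a replicator-style update. The function names, the three quality values, and the population size are hypothetical choices for illustration. This page does not give the exact form of Maynard-Cross Learning, so the sketch should not be read as that algorithm's definition.

```python
import random


def mean_field_step(shares, qualities):
    """Assumed infinite-population update: share_a * q_a / average quality."""
    avg = sum(p * q for p, q in zip(shares, qualities))
    return [p * q / avg for p, q in zip(shares, qualities)]


def imitation_step(opinions, qualities, rng):
    """One synchronous round of pure imitation (simplifying assumption):
    every bee copies a dancer sampled with weight equal to the quality
    of the site that dancer currently advertises."""
    weights = [qualities[o] for o in opinions]
    return rng.choices(opinions, weights=weights, k=len(opinions))


def shares_of(opinions, n_sites):
    """Fraction of the population committed to each site."""
    return [opinions.count(a) / len(opinions) for a in range(n_sites)]


def run_demo(n_bees=20000, rounds=3, seed=0):
    """Return a list of (empirical_shares, mean_field_shares), one per round."""
    rng = random.Random(seed)
    qualities = [0.4, 0.6, 0.9]  # hypothetical site qualities
    opinions = [rng.randrange(len(qualities)) for _ in range(n_bees)]
    shares = shares_of(opinions, len(qualities))
    history = []
    for _ in range(rounds):
        opinions = imitation_step(opinions, qualities, rng)
        shares = mean_field_step(shares, qualities)
        history.append((shares_of(opinions, len(qualities)), shares))
    return history


if __name__ == "__main__":
    for empirical, predicted in run_demo():
        print([round(x, 3) for x in empirical], [round(x, 3) for x in predicted])
```

Running the script prints, for each round, the empirical shares next to the mean-field shares. With a large population the two should stay close, and the share of the best site should grow, which is the behaviour the single-agent reading predicts under these assumptions.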
If this is right
- A group of purely imitative organisms functions as a more complex reinforcement-enabled entity.
- Group-level intelligence can explain the evolutionary selection of simple individual behaviors.
- Imitative economic or social systems can be analyzed as collective learning processes.
- The framework supplies design principles for scalable artificial collective systems inspired by RL.
Where Pith is reading between the lines
- The same equivalence might appear in other animal groups that rely on local copying rules.
- Markets or voting systems could be modeled as single RL agents when participants imitate successful choices.
- Adding memory or private exploration to the bee model would be a direct test of whether the equivalence survives.
- Multi-agent RL systems could deliberately use imitation to reproduce single-agent learning at scale.
Load-bearing premise
The weighted voter model fully captures the bees' collective decision process without additional mechanisms such as memory or spatial exploration.
What would settle it
Demonstration that real bees employ mechanisms outside the weighted voter model, such as individual memory of sites or non-imitative exploration, during nest selection.
Original abstract
Decision-making is an essential attribute of any intelligent agent or group. Natural systems are known to converge to effective strategies through at least two distinct mechanisms: collective decision-making via imitation of others, and trial-and-error by a single agent. This paper establishes an equivalence between these two paradigms by drawing from the well-studied collective decision-making problem of nest-hunting in swarms of honey bees. We show that the emergent distributed cognition (sometimes referred to as the $\textit{hive mind}$) arising from individuals following simple, local imitation-based rules is that of a single online reinforcement learning (RL) agent interacting with many parallel environments. More specifically, in the purely imitative $\textit{weighted voter}$ model of bees' waggle dance, the update rule through which this macro-agent learns is a multi-armed bandit algorithm that we coin $\textit{Maynard-Cross Learning}$. Our analysis implies that a group of purely imitative organisms can be equivalent to a more complex, reinforcement-enabled entity, substantiating the idea that group-level intelligence may explain how seemingly simple and blind individual behaviors are selected in nature. Beyond biology, the framework offers new tools for analyzing economic and social systems where individuals imitate successful strategies, effectively participating in a collective learning process. Our findings may further inform the design of scalable RL-inspired collective systems in artificial domains.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that within the weighted voter model of honeybee waggle-dance communication, the emergent collective decision process is mathematically equivalent to a single online reinforcement learning agent interacting with many parallel environments; specifically, the collective update rule is a multi-armed bandit algorithm that the authors name Maynard-Cross Learning. This equivalence is used to argue that group-level intelligence can arise from simple local imitation rules, with implications for natural selection, economic systems, and the design of collective AI.
Significance. If the internal mapping is rigorously derived, the result supplies a clean theoretical bridge between imitation-based collective behavior and reinforcement learning, offering a parameter-free account of how simple individual rules can produce effective group strategies. This could inform multi-agent system design and bio-inspired algorithms, though its explanatory power for natural systems hinges on model completeness.
major comments (2)
- [Abstract / model section] Abstract and model section: the equivalence is asserted for the weighted voter model, yet the broader claim that this 'substantiates the idea that group-level intelligence may explain how seemingly simple and blind individual behaviors are selected in nature' requires the model to be a complete description; no empirical comparison or discussion of omitted mechanisms (spatial exploration, memory, non-imitative rules) is supplied, creating a correctness risk for the natural-selection implication.
- [Maynard-Cross Learning definition] Section introducing Maynard-Cross Learning: coining a new algorithm name for the update rule obtained directly by aggregating the weighted voter model makes the claimed equivalence appear definitional rather than independently derived; a concrete test is whether the algorithm exhibits any property not already entailed by the model's aggregation step.
minor comments (2)
- [Abstract] Abstract: the equivalence statement would be clearer if a single key equation or proof outline were included.
- [Notation / model definition] Notation: the distinction between individual and collective update rules should be made explicit with consistent symbols to avoid reader confusion.
Simulated Author's Rebuttal
We thank the referee for the constructive report and the recommendation for major revision. We address each major comment below with specific plans for revision where warranted. The core mathematical equivalence between the weighted voter model and the derived update rule remains intact, but we agree that certain claims require additional qualification and clarification.
Point-by-point responses
Referee: [Abstract / model section] Abstract and model section: the equivalence is asserted for the weighted voter model, yet the broader claim that this 'substantiates the idea that group-level intelligence may explain how seemingly simple and blind individual behaviors are selected in nature' requires the model to be a complete description; no empirical comparison or discussion of omitted mechanisms (spatial exploration, memory, non-imitative rules) is supplied, creating a correctness risk for the natural-selection implication.
Authors: We agree that the natural-selection implication in the abstract and introduction is stated too strongly given the model's scope. The equivalence is rigorously derived only for the purely imitative weighted voter model; extending it to explain evolutionary selection requires acknowledging that real bee colonies include additional mechanisms. We will revise the abstract and add a new subsection (likely in the Discussion) that explicitly lists the omitted mechanisms (spatial exploration, individual memory, and non-imitative rules), discusses how they might alter or preserve the equivalence, and qualifies the evolutionary claim as a hypothesis supported by the model rather than a direct substantiation. No empirical comparison is feasible within the current theoretical scope, but the added discussion will reduce the risk of overgeneralization.
Revision: yes
Referee: [Maynard-Cross Learning definition] Section introducing Maynard-Cross Learning: coining a new algorithm name for the update rule obtained directly by aggregating the weighted voter model makes the claimed equivalence appear definitional rather than independently derived; a concrete test is whether the algorithm exhibits any property not already entailed by the model's aggregation step.
Authors: The name 'Maynard-Cross Learning' is intended to identify the specific multi-armed bandit update rule that emerges from the aggregation, allowing comparison with existing algorithms. However, we accept that the presentation risks making the equivalence appear tautological. In revision we will (1) separate the derivation of the update rule from the naming, (2) add a short analysis showing that the resulting algorithm possesses a non-trivial property (convergence rate under parallel environments that differs from standard UCB or epsilon-greedy when the number of parallel instances grows) not directly implied by the aggregation step alone, and (3) rephrase the surrounding text to emphasize that the equivalence is a derived result rather than a definitional restatement. We will also consider whether a different presentation (e.g., without a new name) better conveys the contribution.
Revision: yes
Circularity Check
No significant circularity; equivalence derived directly from model rules
Full rationale
The paper derives the equivalence between the weighted voter model's collective update rule and a multi-armed bandit algorithm by direct mathematical mapping within the model's stated assumptions. This is presented as an analytical result rather than a fitted prediction or self-referential definition. No equations or steps in the abstract reduce the claimed RL equivalence to its inputs by construction, and no self-citations, ansatzes, or renamings of external results are invoked as load-bearing. The coining of 'Maynard-Cross Learning' is merely nomenclature for the derived rule. The derivation chain remains self-contained against the model's premises, with the broader applicability to real bees noted as an assumption rather than a circular claim.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The weighted voter model accurately captures the collective nest-hunting decision process in honey bee swarms
invented entities (1)
- Maynard-Cross Learning: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "the update rule through which this macro-agent learns is a multi-armed bandit algorithm that we coin Maynard-Cross Learning"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "a swarm of honey bees collectively acts as a single RL entity"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
A. Proofs for Section 3 (Methodology)
Lemma 1. An infinite population of individuals adopting $R_{\text{success}}$ follows the TRD: $d\pi_a = \pi_a\,(q^{\pi}_a - v^{\pi})$ (5), where $\pi_a$ is the propo...
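The replicator-type equation quoted in Lemma 1 can be explored numerically. The sketch below is a hedged illustration rather than the paper's method. It assumes that q_a is a fixed quality for option a and that v is the population-average quality, sum over b of pi_b * q_b; the excerpt is truncated before it defines these symbols, so both readings are assumptions. The function names, step size, and quality values are hypothetical. The integrator is plain forward Euler.

```python
def trd_rhs(shares, qualities):
    """Right-hand side pi_a * (q_a - v), with v assumed to be the
    population-average quality sum_b pi_b * q_b."""
    v = sum(p * q for p, q in zip(shares, qualities))
    return [p * (q - v) for p, q in zip(shares, qualities)]


def integrate_trd(shares, qualities, dt=0.01, steps=2000):
    """Forward-Euler integration of the assumed replicator dynamic."""
    shares = list(shares)
    for _ in range(steps):
        rhs = trd_rhs(shares, qualities)
        shares = [p + dt * d for p, d in zip(shares, rhs)]
    return shares


if __name__ == "__main__":
    start = [1 / 3, 1 / 3, 1 / 3]
    qualities = [0.4, 0.6, 0.9]  # hypothetical option qualities
    print([round(x, 4) for x in integrate_trd(start, qualities)])
```

Because the right-hand side sums to zero whenever the shares sum to one, the Euler steps keep the shares on the simplex up to floating-point error, and under these assumptions the mass concentrates on the highest-quality option over time.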