Decentralized Collective World Model for Emergent Communication and Coordination

Kentaro Nomura; Tadahiro Taniguchi; Takato Horii; Tatsuya Aoki

arXiv: 2504.03353 · v3 · submitted 2025-04-04 · cs.MA · cs.AI

Pith reviewed 2026-05-22 21:34 UTC · model grok-4.3

classification: cs.MA, cs.AI

keywords: decentralized multi-agent systems · emergent communication · collective world models · predictive coding · symbol emergence · contrastive learning · multi-agent coordination · trajectory drawing task

The pith

A decentralized world model lets agents coordinate actions and develop meaningful shared symbols through bidirectional communication even when their perceptions differ.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a fully decentralized multi-agent world model, built by extending collective predictive coding across time, can support both coordinated behavior and the emergence of communication symbols at the same time. Agents maintain internal predictions of environmental dynamics and exchange messages bidirectionally, using contrastive learning to align those messages without any agent accessing another's internal states. In a two-agent trajectory drawing task where each agent receives only partial observations, this setup produces better coordination than models lacking communication and yields symbols that more accurately track actual environmental states. A sympathetic reader would care because the result points to a practical route for independent agents to reach shared understanding and joint action in settings where a central controller is unavailable or undesirable.

Core claim

The central claim is that integrating world models with communication channels, through bidirectional message exchange and contrastive learning for message alignment, enables agents to predict environmental dynamics, estimate states from partial observations, and share critical information. When perceptual capabilities diverge, the resulting coordination performance surpasses non-communicative baselines and ranks second only to centralized models. The same setup also produces symbol systems that accurately reflect environmental states, under the constraint that no agent can access another's internal representations.
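As a rough illustration of what contrastive message alignment can look like, the sketch below computes a symmetric InfoNCE-style loss over paired message vectors from two agents. This page does not reproduce the paper's loss, so the InfoNCE form, the cosine similarity, the temperature value, and every name used here (`info_nce`, `messages_a`, `messages_b`) are assumptions for illustration, not the authors' implementation. Only the messages enter the loss, which mirrors the constraint that neither agent reads the other's internal state.

```python
import math


def info_nce(messages_a, messages_b, temperature=0.1):
    """Symmetric InfoNCE-style loss over paired message vectors.

    messages_a[i] and messages_b[i] are treated as a positive pair (same
    situation); every other pairing in the batch acts as a negative.
    All names and the temperature value are illustrative assumptions.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in messages_a]
    b = [normalize(v) for v in messages_b]
    n = len(a)

    # Cosine-similarity logits between every message of A and every message of B.
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]

    def cross_entropy_on_diagonal(rows):
        # Mean of -log softmax(row)[i], computed with the log-sum-exp trick.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    # Average the A-to-B and B-to-A directions (bidirectional alignment).
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(transposed))


# Toy check: matched messages should score a lower loss than mismatched ones.
aligned = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
shuffled = [aligned[1], aligned[2], aligned[0]]
loss_aligned = info_nce(aligned, aligned)
loss_shuffled = info_nce(aligned, shuffled)
```

In this toy batch the matched pairing yields a loss close to zero, while the shuffled pairing yields a much larger one, which is the gradient signal that pulls the two agents' messages for the same situation together.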

What carries the argument

The decentralized collective world model, formed by temporally extending collective predictive coding, carries the argument: each agent maintains its own predictive model, and these models are aligned across agents solely through constrained message passing.

If this is right

Communication-based decentralized models outperform non-communicative models on coordination when agents receive divergent observations.
The same decentralized constraints that block direct state access produce emergent symbols that more closely match environmental states than symbols arising without those constraints.
The approach reaches coordination performance second only to fully centralized models while remaining fully decentralized.
Predictive coding extended across agents and time supplies the mechanism that simultaneously supports state estimation and message alignment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same architecture could be tested on tasks requiring longer temporal horizons or more than two agents to check whether symbol quality and coordination scale together.
If the emergent symbols prove stable across different drawing trajectories, they might serve as reusable building blocks for other coordination problems without retraining.
Replacing the contrastive loss with other alignment objectives would offer a direct test of whether the reported symbol accuracy depends on that specific choice.
The finding that decentralization plus communication constraints improves symbol quality suggests similar benefits might appear in domains where agents must operate under privacy or bandwidth limits.

Load-bearing premise

Bidirectional message exchange plus contrastive learning will automatically produce both improved coordination and symbols that track environmental states when agents cannot directly inspect one another's internal states.

What would settle it

Running the two-agent trajectory drawing task with the contrastive alignment term removed and finding no gain in coordination score or symbol accuracy over the non-communicative baseline would falsify the central claim.

Figures

Figures reproduced from arXiv: 2504.03353 by Kentaro Nomura, Tadahiro Taniguchi, Takato Horii, Tatsuya Aoki.

**Figure 2.** The architecture of the collective world model.

**Figure 3.** (a) Schematic overview of the trajectory drawing …

**Figure 5.** Comparison of coordination achievement with ( …

**Figure 6.** Similarity between the structure of inferred messages …

**Figure 7.** (Left) Trajectory of point P when moved according to test data, and (Right) sequence of messages inferred by each agent when reconstructing observations using EC (proposed method) with 6 bins. In all plots, the color of points changes from blue to red as time steps progress.

Original abstract

We propose a fully decentralized multi-agent world model that enables both symbol emergence for communication and coordinated behavior through temporal extension of collective predictive coding. Unlike previous research that focuses on either communication or coordination separately, our approach achieves both simultaneously. Our method integrates world models with communication channels, enabling agents to predict environmental dynamics, estimate states from partial observations, and share critical information through bidirectional message exchange with contrastive learning for message alignment. Using a two-agent trajectory drawing task, we demonstrate that our communication-based approach outperforms non-communicative models when agents have divergent perceptual capabilities, achieving the second-best coordination after centralized models. Importantly, our decentralized approach with constraints preventing direct access to other agents' internal states facilitates the emergence of more meaningful symbol systems that accurately reflect environmental states. These findings demonstrate the effectiveness of decentralized communication for supporting coordination while developing shared representations of the environment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a concrete decentralized predictive-coding setup that gets both coordination and emergent symbols on a two-agent drawing task, but the evidence that the symbols actually track environmental states rather than just align private predictions is thin.

The core contribution is a fully decentralized multi-agent world model that extends collective predictive coding to handle both symbol emergence and coordination at once. Agents maintain private world models, exchange bidirectional messages, and use contrastive learning to align them without direct access to each other's internals. On a trajectory-drawing task with mismatched perceptual capabilities, the approach beats non-communicative baselines and comes second only to a centralized model. That joint treatment in one framework is the main novelty relative to earlier work that tackled communication or coordination separately.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a fully decentralized multi-agent world model based on temporal extension of collective predictive coding. Agents integrate individual world models with bidirectional communication channels and use contrastive learning to align messages without direct access to each other's internal states. In a two-agent trajectory drawing task with divergent perceptual capabilities, the approach outperforms non-communicative baselines in coordination while achieving second-best performance after centralized models, and produces emergent symbols claimed to accurately reflect environmental states.

Significance. If the grounding and coordination results hold under rigorous evaluation, the work would be significant for multi-agent systems research by showing how decentralized predictive coding can simultaneously support emergent communication and task coordination. The explicit constraints on internal-state access and the comparison against both non-communicative and centralized controls provide a clear testbed for claims about meaningful symbol emergence.

major comments (2)

§3.2 (contrastive alignment objective): The description of the bidirectional message exchange and contrastive loss does not specify whether negative pairs are drawn from distinct environmental configurations or only from the same trajectory. Without negatives that vary environmental state, the loss can succeed at inter-agent alignment while leaving symbols ungrounded in the shared environment, directly undermining the claim that symbols 'accurately reflect environmental states.'
§4.3 (symbol quality evaluation): The reported 'more meaningful' symbols are assessed via coordination performance and qualitative inspection, but no quantitative metric (e.g., mutual information with held-out ground-truth state variables or decoding accuracy on unseen trajectories) is provided. This leaves the central claim about environmental reflection unsupported by the presented evidence.
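The distinction raised in the first major comment can be made concrete with a small sketch that contrasts the two negative-sampling regimes. It is hypothetical throughout: the `lengths` layout (trajectory id mapped to number of time steps), the function name `sample_negatives`, and the uniform sampling are assumptions, and nothing here is taken from the paper's code.

```python
import random


def sample_negatives(lengths, anchor, num_negatives, rng, cross_trajectory=True):
    """Pick (trajectory_id, time_step) indices to serve as contrastive negatives.

    lengths: dict mapping trajectory id -> number of time steps (assumed layout).
    anchor:  (trajectory_id, time_step) of the positive pair.
    cross_trajectory=True draws negatives only from other trajectories, so they
    differ in environmental configuration; False draws them from other time
    steps of the anchor's own trajectory.
    """
    anchor_traj, anchor_t = anchor
    if cross_trajectory:
        pool = [(traj, t) for traj, n in lengths.items()
                if traj != anchor_traj for t in range(n)]
    else:
        pool = [(anchor_traj, t) for t in range(lengths[anchor_traj])
                if t != anchor_t]
    # Sample without replacement, capped at the size of the candidate pool.
    return rng.sample(pool, min(num_negatives, len(pool)))


lengths = {"traj_a": 5, "traj_b": 5, "traj_c": 5}
rng = random.Random(0)
cross = sample_negatives(lengths, ("traj_a", 2), 4, rng, cross_trajectory=True)
within = sample_negatives(lengths, ("traj_a", 2), 4, rng, cross_trajectory=False)
```

With `cross_trajectory=False`, every negative shares the anchor's trajectory, which is the regime the referee worries could align agents with each other without tying symbols to differences in the environment.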

minor comments (2)

Abstract: The term 'temporal extension of collective predictive coding' is introduced without a one-sentence gloss or citation; a brief parenthetical definition would aid readers.
Figure 3 (trajectory examples): The figure would benefit from explicit annotation of the divergent observation masks used by each agent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify key aspects of our method and evaluation. We address each major comment point by point below.

Referee: §3.2 (contrastive alignment objective): The description of the bidirectional message exchange and contrastive loss does not specify whether negative pairs are drawn from distinct environmental configurations or only from the same trajectory. Without negatives that vary environmental state, the loss can succeed at inter-agent alignment while leaving symbols ungrounded in the shared environment, directly undermining the claim that symbols 'accurately reflect environmental states.'

Authors: We agree that the current description in §3.2 is insufficiently precise on this point. In the implemented contrastive objective, negative pairs are sampled from distinct environmental configurations (different initial states and trajectories in the dataset) rather than solely within the same trajectory. This design choice is intended to promote grounding in shared environmental features. We will revise §3.2 to explicitly document the negative sampling procedure and confirm that negatives vary across environmental states.
Revision: yes

Referee: §4.3 (symbol quality evaluation): The reported 'more meaningful' symbols are assessed via coordination performance and qualitative inspection, but no quantitative metric (e.g., mutual information with held-out ground-truth state variables or decoding accuracy on unseen trajectories) is provided. This leaves the central claim about environmental reflection unsupported by the presented evidence.

Authors: We acknowledge that the current evaluation in §4.3 relies on coordination performance and qualitative inspection, which does not directly quantify how well symbols reflect environmental states. To strengthen this claim, we will add a quantitative analysis in the revised manuscript, including mutual information between emergent symbols and held-out ground-truth state variables evaluated on unseen trajectories.
Revision: yes
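For readers who want a sense of the proposed metric, the following is a minimal sketch of a plug-in mutual information estimate between discrete symbols and discretized ground-truth states. It assumes both sequences are already discrete and paired per time step; the function name `mutual_information`, the toy data, and the choice of a plug-in estimator (which is biased upward on small samples) are illustrative assumptions, not the analysis the authors will run.

```python
import math
from collections import Counter


def mutual_information(symbols, states):
    """Plug-in estimate of I(symbol; state) in bits from paired discrete samples."""
    n = len(symbols)
    joint = Counter(zip(symbols, states))
    sym_counts = Counter(symbols)
    state_counts = Counter(states)
    mi = 0.0
    for (sym, state), count in joint.items():
        p_joint = count / n
        # p(sym, state) / (p(sym) * p(state)), written in terms of raw counts.
        ratio = count * n / (sym_counts[sym] * state_counts[state])
        mi += p_joint * math.log2(ratio)
    return mi


# Toy held-out data: four equally likely (hypothetical) state bins.
states = [0, 0, 1, 1, 2, 2, 3, 3]
grounded = ["a", "a", "b", "b", "c", "c", "d", "d"]    # one symbol per state
ungrounded = ["a", "b", "a", "b", "a", "b", "a", "b"]  # independent of state

mi_grounded = mutual_information(grounded, states)
mi_ungrounded = mutual_information(ungrounded, states)
```

In the toy data, symbols that map one-to-one onto four equiprobable states carry 2 bits about the state, while symbols that alternate independently of the state carry 0 bits. Symbols that only align agents with each other, without tracking the environment, would score near the lower end of that range.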

Circularity Check

0 steps flagged

No significant circularity detected

The paper presents an empirical method for decentralized multi-agent world modeling via temporal extension of collective predictive coding, bidirectional messaging, and contrastive alignment, validated on a two-agent trajectory task against non-communicative and centralized baselines. No equations or derivation steps are shown that reduce by construction to fitted inputs, self-definitions, or self-citation chains. The central claims rest on experimental comparisons rather than any load-bearing self-referential step, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the approach implicitly assumes that partial observations plus message exchange suffice for state estimation and alignment.

[1] [1]

Shared Agency,

A. S. Roth, “Shared Agency,” in The Stanford Encyclopedia of Philosophy, Summer 2017 ed., E. N. Zalta, Ed. Metaphysics Research Lab, Stanford University, 2017

work page 2017

[2] [2]

arXiv preprint arXiv:2012.08630 , year=

A. Dafoe, E. Hughes, Y . Bachrach, T. Collins, K. R. McKee, J. Z. Leibo, K. Larson, and T. Graepel, “Open problems in cooperative AI,” CoRR, vol. abs/2012.08630, 2020. [Online]. Available: https://arxiv.org/abs/2012.08630

work page arXiv 2012

[3] [3]

Tom2c: Target-oriented multi-agent communication and cooperation with theory of mind,

Y . Wang, F. Zhong, J. Xu, and Y . Wang, “Tom2c: Target-oriented multi-agent communication and cooperation with theory of mind,” in International Conference on Learning Representations, 2022. © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinti...

work page 2022

[4] [4]

Angeliki Lazaridou, Alexander Peysakhovich, and Marco Baroni

A. Lazaridou and M. Baroni, “Emergent multi-agent communication in the deep learning era,” CoRR, vol. abs/2006.02419, 2020. [Online]. Available: https://arxiv.org/abs/2006.02419

work page arXiv 2006

[5] [5]

Toward more human-like ai communication: A review of emergent communication research,

N. Brandizzi, “Toward more human-like ai communication: A review of emergent communication research,” IEEE Access, vol. 11, pp. 142 317–142 340, 2023

work page 2023

[6] [6]

Recurrent world models facilitate policy evolution,

D. Ha and J. Schmidhuber, “Recurrent world models facilitate policy evolution,” in Advances in Neural Information Processing Systems, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, Eds., vol. 31. Curran Associates, Inc., 2018. [Online]. Available: https://proceedings.neurips.cc/paper files/paper/ 2018/file/2de5d16682c3c...

work page 2018

[7] [7]

Mastering Diverse Domains through World Models

D. Hafner, J. Pasukonis, J. Ba, and T. Lillicrap, “Mastering diverse domains through world models,” 2024. [Online]. Available: https://arxiv.org/abs/2301.04104

work page internal anchor Pith review Pith/arXiv arXiv 2024

[8] [8]

World mod- els and predictive coding for cognitive and developmental robotics: frontiers and challenges

T. Taniguchi, S. Murata, M. Suzuki, D. Ognibene, P. Lanillos, E. Ugur, L. Jamone, T. Nakamura, A. Ciria, B. Lara, and G. P. and, “World models and predictive coding for cognitive and developmental robotics: frontiers and challenges,” Advanced Robotics, vol. 37, no. 13, pp. 780–806, 2023. [Online]. Available: https://doi.org/10.1080/01691864.2023.2225232

work page doi:10.1080/01691864.2023.2225232 2023

[9] [9]

Emergent language: a survey and taxonomy,

J. Peters, C. Waubert de Puiseau, H. Tercan, A. Gopikrishnan, G. A. Lucas de Carvalho, C. Bitter, and T. Meisen, “Emergent language: a survey and taxonomy,” Autonomous Agents and Multi-Agent Systems, vol. 39, no. 1, p. 18, Mar 2025. [Online]. Available: https://doi.org/10.1007/s10458-025-09691-y

work page doi:10.1007/s10458-025-09691-y 2025

[10] [10]

Multi-agent actor-critic for mixed cooperative- competitive environments,

R. Lowe, Y . WU, A. Tamar, J. Harb, O. Pieter Abbeel, and I. Mordatch, “Multi-agent actor-critic for mixed cooperative- competitive environments,” in Advances in Neural Information Processing Systems, I. Guyon, U. V . Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30. Curran Associates, Inc., 2017. [Online]. Availab...

work page 2017

[11] [11]

Multi-agent reinforcement learning is a sequence modeling problem,

M. Wen, J. G. Kuba, R. Lin, W. Zhang, Y . Wen, J. Wang, and Y . Yang, “Multi-agent reinforcement learning is a sequence modeling problem,” in Proceedings of the 36th International Conference on Neural Information Processing Systems, ser. NIPS ’22. Red Hook, NY , USA: Curran Associates Inc., 2022

work page 2022

[12] [12]

V-learning - A simple, efficient, decentralized algorithm for multiagent RL,

C. Jin, Q. Liu, Y . Wang, and T. Yu, “V-learning - A simple, efficient, decentralized algorithm for multiagent RL,” CoRR, vol. abs/2110.14555, 2021. [Online]. Available: https://arxiv.org/abs/2110. 14555

work page arXiv 2021

[13] H. Wang, B. Chen, T. Zhang, and B. Wang, “Learning to communicate through implicit communication channels,” in The Thirteenth International Conference on Learning Representations, 2025. [Online]. Available: https://openreview.net/forum?id=wm5wwAdiEt

[14] T. Lin, J. Huh, C. Stauffer, S. N. Lim, and P. Isola, “Learning to ground multi-agent communication with autoencoders,” in Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, Eds., vol. 34. Curran Associates, Inc., 2021, pp. 15 230–15 242. [Online]. Available: https://proceedings.neurips...

[15] R. Pina, V. De Silva, C. Artaud, and X. Liu, “Fully independent communication in multi-agent reinforcement learning,” in Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, ser. AAMAS ’24. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems, 2024, pp. 2423–2425.

[16] M. A. Nowak and D. C. Krakauer, “The evolution of language,” Proceedings of the National Academy of Sciences, vol. 96, no. 14, pp. 8028–8033, 1999. [Online]. Available: https://www.pnas.org/doi/abs/10.1073/pnas.96.14.8028

[17] T. Taniguchi, “Collective predictive coding hypothesis: symbol emergence as decentralized Bayesian inference,” Frontiers in Robotics and AI, vol. 11, 2024. [Online]. Available: https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2024.1353870

[18] T. Taniguchi, R. Ueda, T. Nakamura, M. Suzuki, and A. Taniguchi, “Generative emergent communication: Large language model is a collective world model,” 2024. [Online]. Available: https://arxiv.org/abs/2501.00226

[19] R. P. N. Rao and D. H. Ballard, “Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects,” Nature Neuroscience, vol. 2, no. 1, pp. 79–87, Jan. 1999. [Online]. Available: https://doi.org/10.1038/4580

[20] K. Friston, “The free-energy principle: a unified brain theory?” Nature Reviews Neuroscience, vol. 11, no. 2, pp. 127–138, Feb. 2010. [Online]. Available: https://doi.org/10.1038/nrn2787

[21] R. Okumura, T. Taniguchi, Y. Hagiwara, and A. Taniguchi, “Metropolis-Hastings algorithm in joint-attention naming game: experimental semiotics study,” Frontiers in Artificial Intelligence, vol. 6, 2023. [Online]. Available: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2023.1235231

[22] T. Taniguchi, Y. Yoshida, Y. Matsui, N. L. Hoang, A. Taniguchi, and Y. Hagiwara, “Emergent communication through Metropolis-Hastings naming game with deep generative models,” Advanced Robotics, vol. 37, no. 19, pp. 1266–1282, 2023. [Online]. Available: https://doi.org/10.1080/01691864.2023.2260856

[23] N. L. Hoang, T. Taniguchi, F. Tianwei, and A. Taniguchi, “SimSiam naming game: A unified approach for representation learning and emergent communication,” 2024. [Online]. Available: https://arxiv.org/abs/2410.21803

[24] T. Nakamura, A. Taniguchi, and T. Taniguchi, “Control as probabilistic inference as an emergent communication mechanism in multi-agent reinforcement learning,” CoRR, vol. abs/2307.05004, 2023. [Online]. Available: https://doi.org/10.48550/arXiv.2307.05004

[25] H. Ebara, T. Nakamura, A. Taniguchi, and T. Taniguchi, “Multi-agent reinforcement learning with emergent communication using discrete and indifferentiable message,” in 2023 15th International Congress on Advanced Applied Informatics Winter (IIAI-AAI-Winter), 2023, pp. 366–371.

[26] T. Taniguchi, S. Takagi, J. Otsuka, Y. Hayashi, and H. T. Hamada, “Collective predictive coding as model of science: formalizing scientific activities towards generative science,” Royal Society Open Science, vol. 12, no. 6, p. 241678, 2025. [Online]. Available: https://royalsocietypublishing.org/doi/abs/10.1098/rsos.241678

[27] Y. L. Lo, B. Sengupta, J. N. Foerster, and M. Noukhovitch, “Learning multi-agent communication with contrastive learning,” in The Twelfth International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=vZZ4hhniJU

[28] M. L. Littman, “Markov games as a framework for multi-agent reinforcement learning,” in Machine Learning Proceedings 1994, W. W. Cohen and H. Hirsh, Eds. San Francisco (CA): Morgan Kaufmann, 1994, pp. 157–163. [Online]. Available: https://www.sciencedirect.com/science/article/pii/B9781558603356500271

[29] D. S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein, “The complexity of decentralized control of Markov decision processes,” Mathematics of Operations Research, vol. 27, no. 4, pp. 819–840, 2002. [Online]. Available: https://doi.org/10.1287/moor.27.4.819.297

[31] D. Hafner, T. Lillicrap, I. Fischer, R. Villegas, D. Ha, H. Lee, and J. Davidson, “Learning latent dynamics for planning from pixels,” arXiv preprint arXiv:1811.04551, 2018.

[32] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.

[33] M. Gutmann and A. Hyvärinen, “Noise-contrastive estimation: A new estimation principle for unnormalized statistical models,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, ser. Proceedings of Machine Learning Research, Y. W. Teh and M. Titterington, Eds., vol. 9. Chia Laguna Resort, Sardinia, Italy:...

[34] A. van den Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” 2019. [Online]. Available: https://arxiv.org/abs/1807.03748

[35] N. Kriegeskorte, M. Mur, and P. A. Bandettini, “Representational similarity analysis - connecting the branches of systems neuroscience,” Frontiers in Systems Neuroscience, vol. 2, 2008. [Online]. Available: https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/neuro.06.004.2008

[36] R. Lowe, J. Foerster, Y.-L. Boureau, J. Pineau, and Y. Dauphin, “On the pitfalls of measuring emergent communication,” in Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, ser. AAMAS ’19. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems, 2019, pp. 693–701.