SimSiam Naming Game: A Unified Approach for Representation Learning and Emergent Communication

Akira Taniguchi; Fang Tianwei; Nguyen Le Hoang; Tadahiro Taniguchi

arxiv: 2410.21803 · v2 · submitted 2024-10-29 · 💻 cs.CL

Pith reviewed 2026-05-23 19:07 UTC · model grok-4.3

keywords: emergent communication, self-supervised learning, naming game, representation alignment, Gumbel-Softmax, multi-agent systems, symbol emergence

The pith

SimSiam Naming Game aligns agent representations via self-supervised messages to enable feedback-free emergent communication with improved downstream utility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes the SimSiam Naming Game as a new framework for emergent communication that unifies it with self-supervised representation learning. It treats symbol emergence as the alignment of latent representations between autonomous agents through exchanged messages, using a symmetric objective inspired by SimSiam. This replaces inefficient sampling in previous methods like MHNG with gradient-based optimization enabled by Gumbel-Softmax. The key result is that messages learned this way lead to substantially higher linear-probe accuracy on CIFAR-10 and ImageNet-100 than those from referential games, reconstruction games, or MHNG. Readers should care if they are interested in how agents can develop useful communication protocols without rewards or explicit feedback.

Core claim

SSNG formulates symbol emergence as an alignment process between agents' latent representations mediated by message exchange, building on a variational inference-based probabilistic interpretation of self-supervised learning. Discrete symbolic messages are learned via Gumbel-Softmax relaxation to allow end-to-end optimization. On CIFAR-10 and ImageNet-100, the emergent messages achieve substantially higher linear-probe classification accuracy than those from referential games, reconstruction games, and MHNG.
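The symmetric alignment idea can be illustrated with a SimSiam-style negative cosine similarity. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `p_a`, `z_a`, `p_b`, `z_b` (each agent's predicted and target representation) are hypothetical, and the objective the paper derives from its variational interpretation may differ in form.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def symmetric_alignment_loss(p_a, z_a, p_b, z_b):
    """SimSiam-style loss: each agent's prediction is pulled toward the
    other agent's target representation.

    In an autodiff framework z_a and z_b would be wrapped in a
    stop-gradient; plain Python only evaluates the loss value.
    """
    return -0.5 * cosine(p_a, z_b) - 0.5 * cosine(p_b, z_a)

# Hypothetical toy vectors: fully aligned pairs vs. orthogonal pairs.
aligned = symmetric_alignment_loss([1.0, 0.0], [0.0, 1.0], [0.0, 3.0], [2.0, 0.0])
orthogonal = symmetric_alignment_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
```

The loss reaches its minimum of -1 when both cross-agent pairs point in the same direction and sits at 0 when they are orthogonal, which is the sense in which the objective rewards agreement without any explicit success signal.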

What carries the argument

The SimSiam Naming Game mechanism, which performs symmetric self-supervised alignment of agents' representations through exchanged discrete messages relaxed by Gumbel-Softmax.
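For readers unfamiliar with the relaxation, the following is a minimal sketch of Gumbel-Softmax sampling over a single message symbol. The function names, the three-symbol vocabulary, and the temperature value are illustrative assumptions; the paper's message encoder and hyperparameters are not given in the material here.

```python
import math
import random

def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed (soft) one-hot sample from categorical logits."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed logits.
    top = max(noisy)
    exps = [math.exp(x - top) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]

def to_one_hot(probs):
    """Discretize a soft sample by keeping only its largest entry."""
    k = max(range(len(probs)), key=probs.__getitem__)
    return [1.0 if i == k else 0.0 for i in range(len(probs))]

# Hypothetical logits over a three-symbol vocabulary.
rng = random.Random(0)
soft = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
hard = to_one_hot(soft)
```

In an autodiff framework, one common variant (the straight-through estimator) sends `hard` forward while taking gradients through `soft`; whether SSNG uses that variant is not stated in the text above.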

If this is right

Feedback-free emergent communication becomes feasible in high-dimensional perceptual spaces.
Self-supervised objectives can mediate symbol emergence without joint attention or rewards.
Learned messages support stronger linear classification performance on standard image benchmarks.
End-to-end differentiable training is possible while keeping messages discrete.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the alignment works across agents, it may extend to larger populations of agents developing shared languages.
Representation learning objectives could be applied directly to communication emergence in other domains like robotics.
Future work might explore whether the same variational interpretation applies to other self-supervised methods beyond SimSiam.

Load-bearing premise

That a symmetric self-supervised alignment objective between agents can produce shared symbolic representations without any explicit success signal or reward.

What would settle it

Running the linear probe experiment on CIFAR-10 with SSNG messages and finding no improvement or lower accuracy compared to MHNG messages would falsify the performance claim.
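A linear probe trains only a linear classifier on frozen features, so its accuracy measures how much class information the messages already carry. The sketch below assumes one-hot message symbols and made-up labels purely for illustration; it omits the bias term for brevity and is not the paper's evaluation code.

```python
import math

def train_linear_probe(features, labels, num_classes, lr=0.5, epochs=20):
    """Fit a bias-free softmax classifier with plain SGD on frozen features."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            logits = [sum(w * xi for w, xi in zip(weights[c], x))
                      for c in range(num_classes)]
            top = max(logits)
            exps = [math.exp(l - top) for l in logits]
            total = sum(exps)
            for c in range(num_classes):
                # Gradient of cross-entropy with respect to logit c.
                grad = exps[c] / total - (1.0 if c == y else 0.0)
                for d in range(dim):
                    weights[c][d] -= lr * grad * x[d]
    return weights

def probe_accuracy(weights, features, labels):
    """Fraction of examples whose highest-scoring class matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        pred = max(range(len(logits)), key=logits.__getitem__)
        correct += int(pred == y)
    return correct / len(labels)

# Hypothetical frozen messages (one-hot symbols) and placeholder class labels.
messages = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
labels = [0, 1, 0, 1]
weights = train_linear_probe(messages, labels, num_classes=2)
accuracy = probe_accuracy(weights, messages, labels)
```

The falsification test described above amounts to running this kind of probe on SSNG and MHNG messages under the same training budget and comparing held-out accuracy.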

Figures

Figures reproduced from arXiv: 2410.21803 by Akira Taniguchi, Fang Tianwei, Nguyen Le Hoang, Tadahiro Taniguchi.

**Figure 2.** Illustrations of the SimSiam+VAE. (a) The PGM representation of the generative and inference process in SimSiam+VAE. From the observations xA and xB, the representation z is inferred, which is subsequently used to infer the latent variable w. Solid lines indicate the generative process (from w to z), while dashed lines indicate the inference process (from xA and xB to z and then to w). (b) Architecture of the…

**Figure 3.** The EmCom between two agents, A and B, based on the SimSiam Naming Game.

**Figure 4.** Architecture of the language coder comprising an LSTM-based encoder-decoder structure. The…

Original abstract

Emergent Communication (EmCom) investigates how agents develop symbolic communication through interaction without predefined language. Recent frameworks, such as the Metropolis-Hastings Naming Game (MHNG), formulate EmCom as the learning of shared external representations negotiated through interaction under joint attention, without explicit success or reward feedback. However, MHNG relies on sampling-based updates that suffer from high rejection rates in high-dimensional perceptual spaces, making the learning process sample-inefficient for complex visual datasets. In this work, we propose the SimSiam Naming Game (SSNG), a feedback-free EmCom framework that replaces sampling-based updates with a symmetric, self-supervised representation alignment objective between autonomous agents. Building on a variational inference-based probabilistic interpretation of self-supervised learning, SSNG formulates symbol emergence as an alignment process between agents' latent representations mediated by message exchange. To enable end-to-end gradient-based optimization, discrete symbolic messages are learned via a Gumbel-Softmax relaxation, preserving the discrete nature of communication while maintaining differentiability. Experiments on CIFAR-10 and ImageNet-100 show that the emergent messages learned by SSNG achieve substantially higher linear-probe classification accuracy than those produced by referential games, reconstruction games, and MHNG. These results indicate that self-supervised representation alignment provides an effective mechanism for feedback-free EmCom in multi-agent systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SSNG swaps MHNG sampling for SimSiam-style alignment plus Gumbel-Softmax, and the abstract reports substantially higher linear-probe accuracy on CIFAR-10 and ImageNet-100 than the listed baselines.

SSNG replaces the rejection-heavy sampling in MHNG with a symmetric alignment objective between two agents' representations, in the way SimSiam aligns two augmented views within a single network. Gumbel-Softmax makes the discrete messages end-to-end differentiable, which is the concrete technical move that lets them drop the Metropolis-Hastings step while staying feedback-free. That unification is new on its own terms and does not collapse into the earlier MHNG equations. The experiments claim substantially higher linear-probe accuracy for the resulting messages than referential games, reconstruction games, and MHNG on both CIFAR-10 and ImageNet-100. If the baseline re-implementations are fair, this is a practical efficiency win for high-dimensional visual data. The variational-inference framing is presented as motivation but is not load-bearing for the algorithm itself. The main uncertainty is whether the reported gaps survive proper error bars and identical hyper-parameter budgets for the baselines; the abstract alone does not show those controls. The work is aimed at people already working on emergent communication who need something that scales past low-dimensional inputs. It is coherent enough on its own terms to deserve referee time, even if the gains turn out smaller once the full tables are examined.

Referee Report

2 major / 2 minor

Summary. The paper proposes SimSiam Naming Game (SSNG), a feedback-free emergent communication framework that replaces MHNG's sampling-based updates with a symmetric self-supervised representation alignment objective between agents. Discrete messages are obtained via Gumbel-Softmax relaxation, and experiments on CIFAR-10 and ImageNet-100 report substantially higher linear-probe classification accuracy for SSNG messages versus referential games, reconstruction games, and MHNG.

Significance. If the empirical comparison holds under rigorous controls, the work offers a scalable, gradient-based alternative to sampling-heavy methods for symbol emergence in high-dimensional visual domains and provides a concrete bridge between self-supervised representation learning and multi-agent emergent communication.

major comments (2)

[Experiments] Experiments section: the headline claim of substantially higher linear-probe accuracy is presented without error bars, number of independent runs, or statistical significance tests against the three baselines; this information is load-bearing for evaluating whether the reported gains are robust rather than implementation artifacts.
[Method] Method and experimental setup: the Gumbel-Softmax temperature is listed as a free hyperparameter, yet no sensitivity analysis or ablation on its value is reported; because the discretization step directly affects message quality and downstream probe accuracy, this choice must be shown not to drive the comparative result.
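The reporting requested in the first major comment can be sketched as follows: aggregate probe accuracy over independent runs, then compare two methods with Welch's t-test. The accuracy values are made-up placeholders, not results from the paper, and the function names are hypothetical.

```python
import math
from statistics import mean, variance

def summarize(accuracies):
    """Mean and sample standard deviation across independent runs."""
    return mean(accuracies), math.sqrt(variance(accuracies))

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t_stat, dof

# Placeholder accuracies for two methods over three seeds (NOT paper results).
method_x = [0.5, 0.6, 0.7]
method_y = [0.2, 0.3, 0.4]
mean_x, std_x = summarize(method_x)
t_stat, dof = welch_t(method_x, method_y)
```

Welch's test is chosen here because it does not assume equal variances across methods; the t statistic and degrees of freedom would then be looked up against a t distribution to obtain a p-value.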

minor comments (2)

[Introduction] The variational-inference framing of SSL is introduced in the abstract and introduction but is not used in any derivation that alters the implemented objective; a brief clarification would remove potential reader confusion without changing the technical contribution.
[Method] Notation for the two agents, their encoders, and the message channel is introduced piecemeal; a single consolidated table or diagram early in the method section would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below, indicating planned revisions where appropriate.

Referee: [Experiments] Experiments section: the headline claim of substantially higher linear-probe accuracy is presented without error bars, number of independent runs, or statistical significance tests against the three baselines; this information is load-bearing for evaluating whether the reported gains are robust rather than implementation artifacts.

Authors: We agree that the absence of error bars, details on the number of independent runs, and statistical significance tests weakens the strength of the empirical claims. In the revised manuscript we will report results aggregated over multiple independent runs (with standard deviations) and include pairwise statistical significance tests against the baselines. revision: yes

Referee: [Method] Method and experimental setup: the Gumbel-Softmax temperature is listed as a free hyperparameter, yet no sensitivity analysis or ablation on its value is reported; because the discretization step directly affects message quality and downstream probe accuracy, this choice must be shown not to drive the comparative result.

Authors: We acknowledge that an ablation on the Gumbel-Softmax temperature is necessary to demonstrate that the reported gains are not driven by a particular temperature choice. We will add a sensitivity analysis varying the temperature across a reasonable range and include the corresponding linear-probe accuracies in the revised experimental section. revision: yes
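To see why the temperature matters, the sketch below sweeps it over a few values and measures the entropy of the resulting symbol distribution. Gumbel noise is left out so that only the temperature effect is visible; the logits and the temperature grid are illustrative assumptions, not values from the paper.

```python
import math

def tempered_softmax(logits, temperature):
    """Softmax of logits divided by a temperature (numerically stable)."""
    scaled = [l / temperature for l in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical logits over four symbols and a hypothetical temperature grid.
logits = [2.0, 1.0, 0.0, -1.0]
temperatures = [0.1, 0.5, 1.0, 5.0]
entropies = [entropy(tempered_softmax(logits, t)) for t in temperatures]
```

Low temperatures push the distribution toward a near one-hot message (entropy close to 0), while high temperatures approach the uniform distribution (entropy close to log 4 for four symbols). A sensitivity analysis would report linear-probe accuracy across such a grid.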

Circularity Check

0 steps flagged

No significant circularity

The paper introduces SSNG as a new feedback-free EmCom framework replacing MHNG's sampling with a symmetric self-supervised alignment objective (SimSiam-style) plus Gumbel-Softmax. The central claim is empirical: higher linear-probe accuracy on CIFAR-10 and ImageNet-100 versus referential, reconstruction, and MHNG baselines. This rests on concrete new experiments and re-implementations rather than any equation reducing by construction to fitted parameters, self-citations, or renamed inputs. The variational-inference framing of SSL is presented as interpretive scaffolding, not a load-bearing derivation that forces the result. No self-definitional, fitted-prediction, or uniqueness-imported steps appear in the provided abstract or framing.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that a variational inference interpretation of self-supervised learning transfers to multi-agent symbol emergence, plus standard optimization techniques; no new invented entities are introduced.

free parameters (1)

Gumbel-Softmax temperature
Standard hyperparameter for the discrete relaxation, assumed to be tuned for the reported experiments but not specified in the abstract.

axioms (1)

Domain assumption: a variational inference-based probabilistic interpretation of self-supervised learning extends to formulating symbol emergence as alignment between agents' latent representations.
Explicitly invoked in the abstract to ground the SSNG formulation.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Decentralized Collective World Model for Emergent Communication and Coordination
cs.MA 2025-04 unverdicted novelty 6.0

A decentralized collective world model integrates predictive coding with bidirectional communication to achieve simultaneous symbol emergence and coordination, outperforming non-communicative baselines in a two-agent ...


[1] [2]

Contrastive Variational Autoencoder Enhances Salient Features

URL http://arxiv.org/abs/1902.04601. Jyoti Aneja, Alex Schwing, Jan Kautz, and Arash Vahdat. A contrastive learning approach for training variational autoencoder priors. In A. Beygelzimer, Y. Dauphin, P. Liang, and J. Wortman Vaughan (eds.), Advances in Neural Information Processing Systems ,

work page internal anchor Pith review Pith/arXiv arXiv 1902

[2] [3]

Yoshua Bengio, Aaron Courville, and Pascal Vincent

doi: 10.1098/rstb.2019.0307. Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspec- tives. IEEE transactions on pattern analysis and machine intelligence , 35(8):1798–1828, 2013a. Yoshua Bengio, Nicholas L´ eonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for condit...

work page doi:10.1098/rstb.2019.0307 2019

[3] [4]

Henry Brighton and Simon Kirby

doi: 10.1109/ACCESS.2023.3339656. Henry Brighton and Simon Kirby. Understanding linguistic evolution by visualizing the emergence of topo- graphic mappings. Artificial life, 12(2):229–242,

work page doi:10.1109/access.2023.3339656 2023

[4] [5]

Angelo Cangelosi and Domenico Parisi

doi: 10.1162/106454606776073323. Angelo Cangelosi and Domenico Parisi. Computer simulation: A new scientific approach to the study of language evolution. Simulating the Evolution of Language , pp. 3–28,

work page doi:10.1162/106454606776073323

[5] [6]

Emerging properties in self-supervised vision transformers

10 Mathilde Caron, Hugo Touvron, Ishan Misra, Herv´ e Jegou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV) , pp. 9630–9640,

work page 2021

[6] [7]

Walk in the cloud: Learning curves for point clouds shape analysis, pp

doi: 10.1109/ICCV48922.2021.00951. Rahma Chaabouni, Eugene Kharitonov, Diane Bouchacourt, Emmanuel Dupoux, and Marco Baroni. Com- positionality and generalization in emergent languages. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics , July

work page doi:10.1109/iccv48922.2021.00951 2021

[7] [8]

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton

doi: 10.4324/9780203014936. Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InProceedings of the 37th International Conference on Machine Learning (ICML),

work page doi:10.4324/9780203014936

[8] [9]

When does contrastive visual representation learning work? In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

Elijah Cole, Xuan Yang, Kimberly Wilber, Oisin Mac Aodha, and Serge Belongie. When does contrastive visual representation learning work? In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 01–10,

work page 2022

[9] [10]

A ConvNet for the 2020s

doi: 10.1109/CVPR52688.2022.01434. Kevin Denamgana¨ ı, Sondess Missaoui, and James Alfred Walker. Visual referential games further the emer- gence of disentangled representations,

work page doi:10.1109/cvpr52688.2022.01434 2022

[10] [11]

Hiroto Ebara, Tomoaki Nakamura, Akira Taniguchi, and Tadahiro Taniguchi

URL https://arxiv.org/abs/2304.14511. Hiroto Ebara, Tomoaki Nakamura, Akira Taniguchi, and Tadahiro Taniguchi. Multi-agent reinforcement learning with emergent communication using discrete and indifferentiable message. In 2023 15th Inter- national Congress on Advanced Applied Informatics Winter (IIAI-AAI-Winter) , pp. 366–371,

work page arXiv 2023

[11] [12]

doi: 10.1109/IIAI-AAI-Winter61682.2023.00073. Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, and Shimon Whiteson. Learning to communicate with deep multi-agent reinforcement learning,

work page doi:10.1109/iiai-aai-winter61682.2023.00073 2023

[12] [13]

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

URL https://arxiv.org/abs/1605.06676. Karl Friston, Rosalyn J Moran, Yukie Nagai, Tadahiro Taniguchi, Hiroaki Gomi, and Josh Tenenbaum. World model learning and inference. Neural networks: the official journal of the International Neural Network Society, 144:573–590,

work page internal anchor Pith review Pith/arXiv arXiv

[13] [14]

Lukas Galke, Yoav Ram, and Limor Raviv

doi: 10.1016/j.neunet.2021.09.011. Lukas Galke, Yoav Ram, and Limor Raviv. Emergent communication for understanding human language evolution: What’s missing? arXiv,

work page doi:10.1016/j.neunet.2021.09.011 2021

[14] [15]

Serhii Havrylov and Ivan Titov

doi: 10.3389/ frobt.2019.00134. Serhii Havrylov and Ivan Titov. Emergence of language with multi-agent games: Learning to communicate with sequences of symbols. In Advances in Neural Information Processing Systems 30 , pp. 2146–2156,

work page arXiv 2019

[15] [16]

Liesen, Z

ISBN 9780199682737. doi: 10.1093/ acprof:oso/9780199682737.001.0001. Jun Inukai, Tadahiro Taniguchi, Akira Taniguchi, and Yoshinobu Hagiwara. Recursive metropolis-hastings naming game: Symbol emergence in a multi-agent system based on probabilistic generative models. Fron- tiers in Artificial Intelligence ,

work page arXiv

[16] [17]

doi: 10.3389/frai.2023.1229127

ISSN 2624-8212. doi: 10.3389/frai.2023.1229127. Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with gumbel-softmax. In Interna- tional Conference on Learning Representations,

work page doi:10.3389/frai.2023.1229127 2023

[17] [18]

Auto-Encoding Variational Bayes

Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv, https://arxiv.org/abs/1312.6114,

work page internal anchor Pith review Pith/arXiv arXiv

[18] [19]

Angeliki Lazaridou, Alexander Peysakhovich, and Marco Baroni

URL https://arxiv.org/abs/2006.02419. Angeliki Lazaridou, Alexander Peysakhovich, and Marco Baroni. Multi-agent cooperation and the emergence of (natural) language. In The International Conference on Learning Representations (ICLR) ,

work page arXiv 2006

[19] [20]

doi: 10.1109/access.2020.3031549

ISSN 2169-3536. doi: 10.1109/access.2020.3031549. URL http://dx.doi.org/10.1109/ACCESS.2020.3031549. Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444,

work page doi:10.1109/access.2020.3031549 2020

[20] [21]

Cr-vae: Contrastive regularization on variational autoencoders for preventing posterior collapse

Fotios Lygerakis and Elmar Rueckert. Cr-vae: Contrastive regularization on variational autoencoders for preventing posterior collapse. In 2023 7th Asian Conference on Artificial Intelligence Technology (ACAIT), pp. 427–437,

[21] [22]

Ryota Okumura, Tadahiro Taniguchi, Yoshinobu Hagiwara, and Akira Taniguchi

doi: 10.48550/arxiv.2203.11437. Ryota Okumura, Tadahiro Taniguchi, Yoshinobu Hagiwara, and Akira Taniguchi. Metropolis-Hastings algorithm in joint-attention naming game: Experimental semiotics study. Frontiers in Artificial Intelligence, 6,

[22] [23]

ISSN 2624-8212. doi: 10.3389/frai.2023.1235231. Jannik Peters, Constantin Waubert de Puiseau, Hasan Tercan, Arya Gopikrishnan, Gustavo Adolpho Lucas De Carvalho, Christian Bitter, and Tobias Meisen. A survey on emergent language,

[23] [24]

Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, and Florian Strub

URL https://arxiv.org/abs/2409.02645. Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, and Florian Strub. Language evolution with deep learning,

[24] [25]

URL https://arxiv.org/abs/2403.11958. Claude E. Shannon and Warren Weaver. The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL,

[25] [26]

Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, and Hideki Asoh

doi: 10.3389/frobt.2024.1353870. Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, and Hideki Asoh. Symbol emergence in robotics: a survey. Advanced Robotics, 30(11-12):706–728,

[26] [27]

World models and predictive coding for cognitive and developmental robotics: frontiers and challenges

Tadahiro Taniguchi, Shingo Murata, Masahiro Suzuki, Dimitri Ognibene, Pablo Lanillos, Emre Ugur, Lorenzo Jamone, Tomoaki Nakamura, Alejandra Ciria, Bruno Lara, and Giovanni Pezzulo. World models and predictive coding for cognitive and developmental robotics: frontiers and challenges. Advanced Robotics, 37(13), 2023a. Tadahiro Taniguchi, Yuto Yoshida, Yu...

work page doi:10.1080/01691864.2023.2260856 2023

[27] [28]

Tobias Uelwer, Jan Robine, Stefan Sylvius Wagner, Marc Höftmann, Eric Upschulte, Sebastian Konietzny, Maike Behrendt, and Stefan Harmeling

URL https://arxiv.org/abs/2102.06810. Tobias Uelwer, Jan Robine, Stefan Sylvius Wagner, Marc Höftmann, Eric Upschulte, Sebastian Konietzny, Maike Behrendt, and Stefan Harmeling. A survey on self-supervised representation learning,

[28] [29]

URL https://arxiv.org/abs/2308.11455. K. Wagner, J. Reggia, J. Uriagereka, and G. Wilkinson. Progress in the simulation of emergent communication and language. Adaptive Behavior, 11(1):37–69,

[29] [30]

Yu Wang, Hengrui Zhang, Zhiwei Liu, Liangwei Yang, and Philip S. Yu

doi: 10.1177/10597123030111003. Yu Wang, Hengrui Zhang, Zhiwei Liu, Liangwei Yang, and Philip S. Yu. Contrastvae: Contrastive variational autoencoder for sequential recommendation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, CIKM ’22, pp. 2056–2066, New York, NY, USA,

[30] [31]

Association for Computing Machinery. ISBN 9781450392365. doi: 10.1145/3511808.3557268. URL https://doi.org/10.1145/3511808.3557268. Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,

[31] [32]

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

URL https://arxiv.org/abs/1708.07747. Zhenlin Xu, Marc Niethammer, and Colin Raffel. Compositional generalization in unsupervised compositional representation learning: A study on disentanglement and emergent language. International Conference on Neural Information Processing Systems,

[32] [33]

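… \log \eta_\theta \prod_{i=1}^{M} \mathrm{vMF}\big(f_\phi(x_j);\, \mu_z = g_\theta(x_i),\, \kappa_z\big) \Big) - \log \frac{1}{M} \Big] \quad (47)

:= \frac{1}{M} \sum_{j=1}^{M} …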

A Comparison among referential game, Metropolis-Hastings naming game and SimSiam naming game

Aspect | Referential Game | Metropolis-Hastings (MH) Naming Game | SimSiam Naming Game (SSNG)
Objective | Develop emergent language (EmLang) to refer to shared objects or concepts, focusing on communication accuracy. | Develop EmLang through probabilistic updates, opt...

[33] [34]

is a collection of 60,000 color images, each of size 32x32 and belonging to one of 10 different classes, with 50,000 training and 10,000 testing images.

Model Architecture:
• Backbone network:
  – FashionMNIST Backbone: A custom CNN with two convolutional layers: the first outputs 16 channels (kernel size 4, stride 2, padding 1), and the second doubles the c...
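As a rough check on the first-layer settings quoted above (kernel size 4, stride 2, padding 1, 16 output channels), the sketch below applies the standard convolution output-size formula. The 28x28 input is the usual Fashion-MNIST resolution and is an assumption here, as is the helper name `conv2d_output_size`; the second layer is left out because its description is cut off in the text.

```python
def conv2d_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one spatial dimension of a 2D convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


# First-layer settings quoted in the text: kernel 4, stride 2, padding 1, 16 channels.
# Assumption: 28x28 input, the standard Fashion-MNIST image resolution.
first_layer_side = conv2d_output_size(28, kernel=4, stride=2, padding=1)
first_layer_shape = (16, first_layer_side, first_layer_side)

print(first_layer_shape)
```

With these settings the layer halves the spatial resolution, so the same helper gives 16 for a 32x32 input such as the color images described above.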
