Pith: machine review for the scientific record.

arXiv: 1903.05168 · v1 · submitted 2019-03-12 · 💻 cs.LG · cs.AI · cs.CL · stat.ML

Recognition: unknown

On the Pitfalls of Measuring Emergent Communication

keywords: communication, agents, emergent, measuring, metrics, agent, allow, channel

How do we know if communication is emerging in a multi-agent system? The vast majority of recent papers on emergent communication show that adding a communication channel leads to an increase in reward or task success. This is a useful indicator, but provides only a coarse measure of the agents' learned communication abilities. As we move towards more complex environments, it becomes imperative to have a set of finer tools that allow qualitative and quantitative insights into the emergence of communication. This may be especially useful to allow humans to monitor agents' behaviour, whether for fault detection, assessing performance, or even building trust. In this paper, we examine a few intuitive existing metrics for measuring communication, and show that they can be misleading. Specifically, by training deep reinforcement learning agents to play simple matrix games augmented with a communication channel, we find a scenario where agents appear to communicate (their messages provide information about their subsequent action), and yet the messages do not impact the environment or the other agent in any way. We explain this phenomenon using ablation studies and by visualizing the representations of the learned policies. We also survey some commonly used metrics for measuring emergent communication, and provide recommendations as to when these metrics should be used.
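The scenario in the abstract, where messages are informative about the sender's next action yet have no effect on the receiver, can be illustrated with a toy calculation. The sketch below is not the paper's code or metric definitions: the function name `mutual_information` and both hand-written logs are hypothetical, and empirical mutual information is used here only as one plausible way to quantify "messages provide information about the subsequent action".

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information, in bits, between the two items of each pair."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a, b) / (p(a) * p(b)), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (left[a] * right[b]))
    return mi


# Hypothetical speaker log of (message, own next action):
# the message always matches the action the speaker takes next.
speaker_log = [(0, 0), (1, 1)] * 50

# Hypothetical listener log of (received message, listener action):
# constructed so the listener's action is independent of the message.
listener_log = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

signal_mi = mutual_information(speaker_log)    # high: messages predict the speaker's action
listen_mi = mutual_information(listener_log)   # zero: messages tell us nothing about the listener's action

print(f"speaker message/action MI: {signal_mi:.3f} bits")
print(f"listener message/action MI: {listen_mi:.3f} bits")
```

A metric computed only on the speaker side would report communication here, even though the listener ignores the channel. Note that the listener-side number is observational; for learned agents, establishing that messages have no effect would generally require intervening on the messages rather than reading a log.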

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    CLOVER augments value decomposition with a GNN mixer whose weights depend on the realized wireless communication graph, proving permutation invariance, monotonicity, and greater expressiveness than QMIX while showing ...