Learning Awareness Models
2 Pith papers cite this work.
abstract
We consider the setting of an agent with a fixed body interacting with an unknown and uncertain external world. We show that models trained to predict proprioceptive information about the agent's body come to represent objects in the external world. In spite of being trained with only internally available signals, these dynamic body models come to represent external objects through the necessity of predicting their effects on the agent's own body. That is, the model learns holistic persistent representations of objects in the world, even though the only training signals are body signals. Our dynamics model is able to successfully predict distributions over 132 sensor readings over 100 steps into the future and we demonstrate that even when the body is no longer in contact with an object, the latent variables of the dynamics model continue to represent its shape. We show that active data collection by maximizing the entropy of predictions about the body (touch sensors, proprioception and vestibular information) leads to learning of dynamic models that show superior performance when used for control. We also collect data from a real robotic hand and show that the same models can be used to answer questions about properties of objects in the real world. Videos with qualitative results of our models are available at https://goo.gl/mZuqAV.
citing papers explorer
- Bayesian updates from coalgebraic determinisation
  Unifilarisation of stochastic Mealy machines is an instance of coalgebraic determinisation over monads with support structure, producing causal stochastic behaviours rather than Moore-style output distributions.
- Shaping Belief States with Generative Environment Models for RL
  Multi-step predictive generative models form stable belief states capturing environment layout and agent pose, yielding higher data efficiency on RL tasks than model-free agents.