arxiv: 1702.05068 · v1 · pith:HLYNDCHEnew · submitted 2017-02-16 · 💻 cs.LG · cs.CV

Discovering objects and their relations from entangled scene representations

classification 💻 cs.LG cs.CV
keywords objects, learning, relation, relations, scene, architecture, data, description

Our world can be succinctly and compactly described as structured scenes of objects and relations. A typical room, for example, contains salient objects such as tables, chairs and books, and these objects typically relate to each other by their underlying causes and semantics. This gives rise to correlated features, such as position, function and shape. Humans exploit knowledge of objects and their relations for learning a wide spectrum of tasks, and more generally when learning the structure underlying observed data. In this work, we introduce relation networks (RNs), a general-purpose neural network architecture for object-relation reasoning. We show that RNs are capable of learning object relations from scene description data. Furthermore, we show that RNs can act as a bottleneck that induces the factorization of objects from entangled scene description inputs, and from distributed deep representations of scene images provided by a variational autoencoder. The model can also be used in conjunction with differentiable memory mechanisms for implicit relation discovery in one-shot learning tasks. Our results suggest that relation networks are a potentially powerful architecture for solving a variety of problems that require object relation reasoning.
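
The abstract does not spell out the architecture, so the following is only a rough sketch of the pairwise idea usually associated with relation networks: a shared function g scores each pair of objects, the pair outputs are summed, and a second function f maps the sum to a prediction. Everything here is an assumption made for illustration: the helper names (`make_mlp`, `relation_network`), the layer sizes, the use of ordered pairs, and the random untrained weights. It is not the authors' implementation.

```python
import itertools
import math
import random


def make_mlp(sizes, rng):
    """Build a tiny fully connected net with tanh hidden units.

    Weights are random and untrained; this is a hypothetical helper.
    Each row stores n_in weights followed by one bias term.
    """
    layers = [
        [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

    def forward(x):
        for depth, layer in enumerate(layers):
            # Append 1.0 so the last weight in each row acts as the bias.
            x = [sum(w * v for w, v in zip(row, x + [1.0])) for row in layer]
            if depth < len(layers) - 1:
                x = [math.tanh(v) for v in x]
        return x

    return forward


def relation_network(objects, g, f):
    """Apply g to every ordered pair of distinct objects, sum, then apply f."""
    pair_sum = None
    for a, b in itertools.permutations(objects, 2):
        out = g(list(a) + list(b))
        if pair_sum is None:
            pair_sum = out
        else:
            pair_sum = [p + q for p, q in zip(pair_sum, out)]
    return f(pair_sum)


rng = random.Random(0)
g = make_mlp([6, 8, 4], rng)  # two 3-feature objects -> 4-dim relation code
f = make_mlp([4, 8, 2], rng)  # summed relation code -> 2 output scores

# A toy "scene description": three objects with three features each.
scene = [[0.1, 0.2, 0.3], [0.9, 0.1, 0.5], [0.4, 0.7, 0.2]]
shuffled = [scene[2], scene[0], scene[1]]

out_a = relation_network(scene, g, f)
out_b = relation_network(shuffled, g, f)
print(out_a)
print(out_b)
```

Because the pair outputs are summed, the result should not depend on the order in which objects are listed (up to floating-point rounding), which is the property that makes this kind of module suited to unordered scene descriptions.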

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Graph Neural Based End-to-end Data Association Framework for Online Multiple-Object Tracking

    cs.CV · 2019-07 · unverdicted · novelty 6.0

    A graph neural network framework learns affinities from appearance and motion then solves bipartite matching for online multiple-object tracking.