
arxiv: 1905.11634 · v1 · pith:34RMPORGnew · submitted 2019-05-28 · cs.CV · cs.AI · cs.LG

LatentGNN: Learning Efficient Non-local Relations for Visual Recognition

keywords: graph, visual, non-local, context, recognition, representation, affinity, complexity

Capturing long-range dependencies in feature representations is crucial for many visual recognition tasks. Despite recent successes of deep convolutional networks, it remains challenging to model non-local context relations between visual features. A promising strategy is to model the feature context with a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation. However, most GNN-based approaches require computing a dense graph affinity matrix and hence have difficulty scaling up to complex real-world visual problems. In this work, we propose an efficient yet flexible non-local relation representation based on a novel class of graph neural networks. Our key idea is to introduce a latent space to reduce the complexity of the graph, which allows us to use a low-rank representation for the graph affinity matrix and to achieve linear complexity in computation. Extensive experimental evaluations on three major visual recognition tasks show that our method outperforms prior work by a large margin while maintaining a low computation cost.
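The complexity argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the low-rank factors are built, so the matrices `p` and `q` below are random stand-ins for learned quantities, and all names and sizes are hypothetical. The sketch only shows the general point that if an N x N affinity matrix factors as P Qᵀ with d latent nodes (d much smaller than N), then applying it to the features as P (Qᵀ X) gives the same result as the dense product while costing time linear in N.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def dense_propagate(p, q, x):
    """Form the full N x N affinity P Q^T, then apply it to X.

    Time and memory grow quadratically with the number of nodes N.
    """
    affinity = matmul(p, transpose(q))  # N x N
    return matmul(affinity, x)          # N x C

def low_rank_propagate(p, q, x):
    """Apply the same operator as P (Q^T X), never forming the N x N matrix.

    Time grows linearly with N for a fixed number of latent nodes d.
    """
    latent = matmul(transpose(q), x)    # d x C: pool N nodes into d latent nodes
    return matmul(p, latent)            # N x C: send latent context back to nodes

random.seed(0)
n, d, c = 50, 4, 3  # hypothetical sizes: nodes, latent nodes, channels

# Random factors and features; in a real model these would be learned
# or computed from the input (assumption for illustration only).
p = [[random.random() for _ in range(d)] for _ in range(n)]
q = [[random.random() for _ in range(d)] for _ in range(n)]
x = [[random.random() for _ in range(c)] for _ in range(n)]

dense_out = dense_propagate(p, q, x)
low_rank_out = low_rank_propagate(p, q, x)

# Largest elementwise gap between the two results (floating-point noise only).
max_diff = max(abs(a - b)
               for row_a, row_b in zip(dense_out, low_rank_out)
               for a, b in zip(row_a, row_b))
```

The dense path performs on the order of N²(d + C) multiplications, while the factored path needs about 2NdC, which is why the factorization matters once N is the number of pixels or regions in an image.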
