PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection

Junwei Han; Ming-Hsuan Yang; Nian Liu

arxiv: 1708.06433 · v2 · pith:RZ7UGSRQnew · submitted 2017-08-21 · 💻 cs.CV

classification 💻 cs.CV

keywords: contextual, attention, global, saliency, context, detection, local, picanet

Contexts play an important role in the saliency detection task. However, given a context region, not all contextual information is helpful for the final task. In this paper, we propose a novel pixel-wise contextual attention network, i.e., the PiCANet, to learn to selectively attend to informative context locations for each pixel. Specifically, for each pixel, it can generate an attention map in which each attention weight corresponds to the contextual relevance at each context location. An attended contextual feature can then be constructed by selectively aggregating the contextual information. We formulate the proposed PiCANet in both global and local forms to attend to global and local contexts, respectively. Both models are fully differentiable and can be embedded into CNNs for joint training. We also incorporate the proposed models with the U-Net architecture to detect salient objects. Extensive experiments show that the proposed PiCANets can consistently improve saliency detection performance. The global and local PiCANets facilitate learning global contrast and homogeneousness, respectively. As a result, our saliency model can detect salient objects more accurately and uniformly, thus performing favorably against the state-of-the-art methods.
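The aggregation step described in the abstract (one attention map per pixel, then a weighted sum of context features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `attend` are hypothetical, the attention scores are supplied by hand rather than predicted by a network, and a softmax normalization over context locations is assumed. The sketch covers the global form, where every location is a context candidate; a local form would restrict each pixel's candidates to a neighborhood around it.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, logits):
    """Pixel-wise contextual attention over a flattened feature map.

    features: list of N context feature vectors (each a list of C floats).
    logits:   N x N list; logits[i][j] scores context location j for pixel i.
              (Hypothetical input: in a trained model these scores would be
              predicted by the network; here they are supplied by hand.)
    Returns N attended feature vectors of length C.
    """
    channels = len(features[0])
    attended = []
    for pixel_logits in logits:
        weights = softmax(pixel_logits)  # one attention map per pixel
        attended.append([
            sum(w * f[c] for w, f in zip(weights, features))
            for c in range(channels)
        ])
    return attended

# Toy 2x2 feature map flattened to 4 locations, 2 channels each.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
# Uniform logits reduce attention to plain average pooling of the context.
uniform = [[0.0] * 4 for _ in range(4)]
# A very large logit on location 3 makes pixel 0 attend almost only there.
peaked = [[0.0, 0.0, 0.0, 50.0]] + uniform[1:]

print(attend(features, uniform)[0])
print(attend(features, peaked)[0])
```

With uniform scores the attended feature is simply the mean of all context features, which is the non-selective baseline the abstract argues against; a peaked attention map instead lets a pixel draw its context from a single informative location.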

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

OGNet: Salient Object Detection with Output-guided Attention Module
cs.CV · 2019-07 · unverdicted · novelty 5.0

OGNet proposes an output-guided attention module from multi-scale outputs and an intractable area F-measure loss to enhance salient object detection in edges and confusing areas while remaining lightweight.