arxiv: 1502.03240 · v3 · pith:4KOSR4GOnew · submitted 2015-02-11 · 💻 cs.CV

Conditional Random Fields as Recurrent Neural Networks

classification 💻 cs.CV
keywords deep, network, neural, cnns, conditional, fields, image, networks

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To this end, we formulate mean-field approximate inference for the Conditional Random Fields with Gaussian pairwise potentials as Recurrent Neural Networks. This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a deep network that has desirable properties of both CNNs and CRFs. Importantly, our system fully integrates CRF modelling with CNNs, making it possible to train the whole deep network end-to-end with the usual back-propagation algorithm, avoiding offline post-processing methods for object delineation. We apply the proposed method to the problem of semantic image segmentation, obtaining top results on the challenging Pascal VOC 2012 segmentation benchmark.
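
The abstract's central idea, mean-field inference unrolled as a fixed number of repeated update steps, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy 1D "image", a single Gaussian kernel over pixel positions, a Potts label compatibility, and brute-force dense message passing. The names `mean_field`, `gaussian_kernel`, `weight`, and `num_iterations` are hypothetical, and the unary scores stand in for CNN outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of label scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(positions, bandwidth):
    """Dense pairwise weights exp(-|p_i - p_j|^2 / (2 * bandwidth^2)), zero for i == j."""
    n = len(positions)
    return [
        [
            0.0 if i == j
            else math.exp(-((positions[i] - positions[j]) ** 2) / (2.0 * bandwidth ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


def mean_field(unary_scores, kernel, weight=1.0, num_iterations=5):
    """Run a fixed number of mean-field updates (assumed Potts compatibility).

    unary_scores[i][l] is the score (negative unary energy) of label l at pixel i.
    Returns the approximate per-pixel label marginals.
    """
    n = len(unary_scores)
    num_labels = len(unary_scores[0])
    q = [softmax(u) for u in unary_scores]  # initialise from the unary scores alone
    for _ in range(num_iterations):  # each pass plays the role of one recurrent step
        new_q = []
        for i in range(n):
            # Message passing: Gaussian-weighted sum of the other pixels' marginals.
            msg = [
                weight * sum(kernel[i][j] * q[j][l] for j in range(n))
                for l in range(num_labels)
            ]
            # Potts compatibility: label l is penalised by the mass on all other labels.
            total = sum(msg)
            penalty = [total - msg[l] for l in range(num_labels)]
            # Add the unary scores and renormalise.
            new_q.append(
                softmax([unary_scores[i][l] - penalty[l] for l in range(num_labels)])
            )
        q = new_q  # all pixels are updated simultaneously
    return q


# Toy input: five pixels, two labels; only the middle pixel weakly prefers label 1.
unary_scores = [[2.0, 0.0], [2.0, 0.0], [0.0, 0.5], [2.0, 0.0], [2.0, 0.0]]
kernel = gaussian_kernel(positions=[0, 1, 2, 3, 4], bandwidth=1.0)
q = mean_field(unary_scores, kernel)
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
unary_only_labels = [max(range(len(u)), key=u.__getitem__) for u in unary_scores]
```

In this toy input the middle pixel's weak preference is overruled by its confident neighbours, which is the kind of smoothing a pairwise term provides. Because every step in the loop is differentiable, a fixed number of iterations could in principle be stacked on top of a CNN and trained with back-propagation, which is the integration the abstract describes.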

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Multi-Person tracking by multi-scale detection in Basketball scenarios

    cs.CV 2019-07 unverdicted novelty 4.0

    A multi-scale detection pipeline extracts geometric and content features to produce multi-person tracking in basketball videos, evaluated on a custom dataset with standard detection and tracking metrics.