Understanding Deep Learning Techniques for Image Segmentation

Ishita Das; Nibaran Das; Swarnendu Ghosh; Ujjwal Maulik

Reviewed by Pith at T0; open to challenge.

T0 means a machine referee read the full paper against a public rubric. The mark states how deep the mechanical check went, never who wrote it.

T0 review · grok-4.3

Logical grouping of deep learning segmentation algorithms by their unique features gives readers a clearer view of how each works.

2026-05-24 21:44 UTC pith:RGAAAQVN

load-bearing objection This is a 2019 survey that organizes DL segmentation methods with intuitive explanations but adds no new results or analysis.

arXiv 1907.06119 v1 · pith:RGAAAQVN · submitted 2019-07-13 · cs.CV cs.LG cs.NE

Swarnendu Ghosh, Nibaran Das, Ishita Das, Ujjwal Maulik

classification cs.CV cs.LG cs.NE

keywords deep learning · image segmentation · convolutional neural networks · review · adversarial networks · autoencoders · object segmentation

verification ladder T0 review · T1 audit · T2 compute · T3 formal · T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to supply an intuitive understanding of the main deep learning techniques that have shaped image segmentation. It starts with traditional segmentation methods, traces the impact of deep learning, and then places the major algorithms into logical categories while explaining what each category adds. A reader who wants to make sense of the many available networks would value the step-by-step descriptions that aim to show the internal steps of these processes. The review covers convolutional, recurrent, adversarial, and autoencoder approaches applied to segmentation tasks.

Core claim

The paper claims that by moving from traditional image segmentation methods through the influence of deep learning and then logically categorizing the major algorithms with focused paragraphs on their distinctive contributions, readers gain an improved ability to visualize the internal dynamics of these techniques.

What carries the argument

Logical categorization of segmentation algorithms by their unique contributions, presented after a progression from traditional methods to deep learning architectures.

Load-bearing premise

The chosen techniques and papers form a representative sample of the field and the explanations stay accurate without selection bias or outdated framing.

What would settle it

A reader new to the topic who cannot correctly describe the internal steps of a reviewed algorithm after reading the categorized sections, or who finds major current techniques omitted, would indicate the provided understanding falls short.

If this is right

The shift from traditional segmentation to deep learning approaches becomes easier to follow through the described progression.
Readers can visualize the internal steps of networks such as convolutional and adversarial models in segmentation contexts.
The variety of deep learning techniques applied to detection, localization, and segmentation tasks is presented in grouped form.
An analytical view of the field reduces the sense of being overwhelmed by the number of available methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The grouping could be used as a baseline when new architectures appear and need placement in similar categories.
Linking the explanations to concrete datasets or benchmarks might show which category performs best under different image conditions.
The intuitive style could support introductory teaching materials that introduce segmentation without requiring prior network expertise.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

This is a 2019 survey that organizes DL segmentation methods with intuitive explanations but adds no new results or analysis.

This 2019 survey walks through deep learning methods for image segmentation by starting with traditional approaches and then grouping the main DL techniques with short descriptions of what each contributes. The structure moves from basic CNNs through recurrent nets, adversarial models, and autoencoders, with paragraphs that try to make the internal workings easier to picture. That kind of organized walkthrough can give a newcomer a usable map of the techniques that were prominent at the time. The paper does this job without claiming any new method or experiment, which matches what the abstract states. The value sits entirely in the synthesis and the choice to keep explanations intuitive rather than deeply technical.

The main limitation is the usual one for surveys: the categories and summaries depend on which papers were picked and how accurately they are described. A 2019 cutoff also leaves out later developments that changed the field. No derivations or data are present to check, so any factual slip in the technical paragraphs would be hard to catch without going back to the cited sources.

This paper is aimed at students or practitioners who want an entry-level overview rather than researchers already active in the area. It does not resolve open questions or enable new work. I would send it for peer review at a journal that accepts surveys, because the organization is clear enough to be useful for its intended audience even if the overall contribution stays modest.

Referee Report

0 major / 2 minor

Summary. The manuscript is a survey paper that reviews traditional image segmentation methods before discussing the impact of deep learning on the domain. It logically categorizes major deep learning-based segmentation algorithms (including convolutional, recurrent, adversarial, and autoencoder-based networks) and provides intuitive explanations of their unique contributions, with the goal of helping readers visualize internal dynamics.

Significance. As an expository survey without original derivations, empirical results, or novel claims, the paper's value lies in synthesis and accessibility. If the categorizations accurately reflect the cited literature and the explanations are balanced, it could serve as a useful entry point for researchers entering the image segmentation field circa 2019. No machine-checked proofs, reproducible code, or falsifiable predictions are present.

minor comments (2)

[Abstract] The claim that 'many new deep learning techniques have surfaced with respect to image segmentation techniques' would benefit from a brief statement of the paper's temporal scope (e.g., coverage up to mid-2019) to set reader expectations for completeness.
The manuscript should include a table or structured list summarizing the categorized algorithms, their key architectural differences, and representative citations to improve scannability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the constructive summary and recommendation of minor revision. The assessment correctly identifies the manuscript as an expository survey focused on synthesis and intuitive explanations rather than novel claims or experiments. No specific major comments were raised in the report, so our response addresses the overall evaluation.

Circularity Check

0 steps flagged

No significant circularity; expository survey with no derivations

The paper is a survey that categorizes and intuitively explains existing deep learning techniques for image segmentation, starting from traditional methods and progressing to DL approaches without presenting any original derivations, equations, predictions, or fitted parameters. No self-citations form load-bearing premises, no uniqueness theorems are invoked, and no results reduce to inputs by construction. The central contribution is descriptive categorization, which is self-contained against external benchmarks and contains no internal derivation chain to inspect for circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper. No new mathematical claims, fitted parameters, axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5719 in / 1021 out tokens · 15899 ms · 2026-05-24T21:44:47.761835+00:00 · methodology

Original abstract

The machine learning community has been overwhelmed by a plethora of deep learning based approaches. Many challenging computer vision tasks such as detection, localization, recognition and segmentation of objects in unconstrained environment are being efficiently addressed by various types of deep neural networks like convolutional neural networks, recurrent networks, adversarial networks, autoencoders and so on. While there have been plenty of analytical studies regarding the object detection or recognition domain, many new deep learning techniques have surfaced with respect to image segmentation techniques. This paper approaches these various deep learning techniques of image segmentation from an analytical perspective. The main goal of this work is to provide an intuitive understanding of the major techniques that has made significant contribution to the image segmentation domain. Starting from some of the traditional image segmentation approaches, the paper progresses describing the effect deep learning had on the image segmentation domain. Thereafter, most of the major segmentation algorithms have been logically categorized with paragraphs dedicated to their unique contribution. With an ample amount of intuitive explanations, the reader is expected to have an improved ability to visualize the internal dynamics of these processes.

Figures

Figures reproduced from arXiv: 1907.06119 by Ishita Das, Nibaran Das, Swarnendu Ghosh, Ujjwal Maulik.

**Figure 2.** Legends for subsequent diagrams of popular deep learning architec…

**Figure 3.** Input image and sample activation maps from a typical CNN. (Top…

**Figure 4.** A Fully convolutional network with image segmentation with concate…

**Figure 5.** The Deepmask Network

**Figure 6.** The Sharpmask Network

**Figure 7.** The RCNN Family of localization and segmentation networks

**Figure 8.** Normal convolution (red) vs. Atrous or Dilated convolution (green)

**Figure 9.** DeepLab Architecture as compared to a standard VGG net (top) along…

**Figure 10.** A schematic representation of the PSPNet

**Figure 11.** A schematic representation of the RefineNet

**Figure 12.** (Left) Normal Convolution with unit stride. (Right) Transposed…

**Figure 13.** Architecture of U-Net

**Figure 14.** Forwarding pooling indices to maintain spatial relationship during…

**Figure 15.** Architecture of SegNet

**Figure 16.** Adversarial learning model for image segmentation

**Figure 17.** Sequential Models: (topleft) Generic Representation for…

**Figure 18.** Generic representation of autoencoder with fully connected linear…

**Figure 19.** A typical convolutional neural network

**Figure 20.** A block diagram of generative adversarial network

Reference graph

Works this paper leans on

221 extracted references · 221 canonical work pages · 33 internal anchors

[1]

Slic superpixels compared to state-of-the-art superpixel methods

Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., S¨usstrunk, S., et al. Slic superpixels compared to state-of-the-art superpixel methods. IEEE transactions on pattern analysis and machine intelligence 34, 11 (2012), 2274–2282

work page 2012
[2]

H., and Seitz, S

Agarwala, A., Hertzmann, A., Salesin, D. H., and Seitz, S. M. Keyframe-based tracking for rotoscoping and animation. In ACM Trans- actions on Graphics (ToG) (2004), vol. 23, ACM, pp. 584–591

work page 2004
[3]

Ahmad, J., Mehmood, I., and Baik, S. W. Eﬃcient object-based surveillance image search using spatial pooling of convolutional features. Journal of Visual Communication and Image Representation 45 (2017), 62–76

work page 2017
[4]

I., Zhou, J., Liew, A

Alam, F. I., Zhou, J., Liew, A. W.-C., and Jia, X. Crf learning with cnn features for hyperspectral image segmentation. In Geoscience and Remote Sensing Symposium (IGARSS), 2016 IEEE International (2016), IEEE, pp. 6890–6893

work page 2016
[5]

Albiol, A., Torres, L., and Delp, E. J. An unsupervised color image segmentation algorithm for face detection applications. In Image Processing, 2001. Proceedings. 2001 International Conference on (2001), vol. 2, IEEE, pp. 681–684

work page 2001
[6]

Classiﬁcation of breast cancer histology images using convolutional neural networks

Ara´ujo, T., Aresta, G., Castro, E., Rouco, J., Aguiar, P., Eloy, C., Pol ´onia, A., and Campilho, A. Classiﬁcation of breast cancer histology images using convolutional neural networks. PloS one 12 , 6 (2017), e0177544

work page 2017
[7]

Performance com- parison of fpga, gpu and cpu in image processing

Asano, S., Maruyama, T., and Yamaguchi, Y. Performance com- parison of fpga, gpu and cpu in image processing. In Field programmable logic and applications, 2009. fpl 2009. international conference on (2009), IEEE, pp. 126–131

work page 2009
[8]

A quality analysis of openstreetmap data

Ather, A. A quality analysis of openstreetmap data. ME Thesis, Uni- versity College London, London, UK 22 (2009). 39

work page 2009
[9]

IEEE transactions on pattern analysis and machine intelligence 39 , 12 (2017), 2481–2495

Badrinarayanan, V., Kendall, A., and Cipolla, R.Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE transactions on pattern analysis and machine intelligence 39 , 12 (2017), 2481–2495

work page 2017
[10]

IEEE transactions on Geoscience and Remote Sensing 45, 5 (2007), 1506– 1511

Bandyopadhyay, S., Maulik, U., and Mukhopadhyay, A.Multiob- jective genetic clustering for pixel classiﬁcation in remote sensing imagery. IEEE transactions on Geoscience and Remote Sensing 45, 5 (2007), 1506– 1511

work page 2007
[11]

High spatial resolution satellite imagery, dem derivatives, and image segmentation for the detec- tion of mass wasting processes

Barlow, J., Franklin, S., and Martin, Y. High spatial resolution satellite imagery, dem derivatives, and image segmentation for the detec- tion of mass wasting processes. Photogrammetric Engineering and Remote Sensing 72, 6 (2006), 687–692

work page 2006
[12]

Color-and texture-based image segmentation using em and its application to content- based image retrieval

Belongie, S., Carson, C., Greenspan, H., and Malik, J. Color-and texture-based image segmentation using em and its application to content- based image retrieval. In Computer Vision, 1998. Sixth International Conference on (1998), IEEE, pp. 675–682

work page 1998
[13]

Greedy layer-wise training of deep networks

Bengio, Y., Lamblin, P., Popovici, D., and Larochelle, H. Greedy layer-wise training of deep networks. In Advances in neural infor- mation processing systems (2007), pp. 153–160

work page 2007
[14]

Learning long-term de- pendencies with gradient descent is diﬃcult

Bengio, Y., Simard, P., and Frasconi, P. Learning long-term de- pendencies with gradient descent is diﬃcult. IEEE transactions on neural networks 5, 2 (1994), 157–166

work page 1994
[15]

Large scale visual recognition challenge (ilsvrc), 2010

Berg, A., Deng, J., and Fei-Fei, L. Large scale visual recognition challenge (ilsvrc), 2010. URL http://www. image-net. org/challenges/LSVRC 3 (2010)

work page 2010
[16]

C., Ehrlich, R., and Full, W

Bezdek, J. C., Ehrlich, R., and Full, W. Fcm: The fuzzy c-means clustering algorithm. Computers and Geosciences 10, 2-3 (1984), 191–203

work page 1984
[17]

S., Fonseca, L

Bins, L. S., Fonseca, L. G., Erthal, G. J., and Ii, F. M. Satellite imagery segmentation: a region growing approach. Simp´ osio Brasileiro de Sensoriamento Remoto 8 , 1996 (1996), 677–680

work page 1996
[18]

What is a salient object? a dataset and a baseline model for salient object detection

Borji, A. What is a salient object? a dataset and a baseline model for salient object detection. IEEE Transactions on Image Processing 24 , 2 (2015), 742–756

work page 2015
[19]

Salient Object Detection: A Survey

Borji, A., Cheng, M.-M., Hou, Q., Jiang, H., and Li, J. Salient object detection: A survey. arXiv preprint arXiv:1411.5878 (2014)

work page internal anchor Pith review Pith/arXiv arXiv 2014
[20]

Salient object de- tection: A benchmark

Borji, A., Cheng, M.-M., Jiang, H., and Li, J. Salient object de- tection: A benchmark. IEEE Transactions on Image Processing 24 , 12 (2015), 5706–5722. 40

work page 2015
[21]

Fast approximate energy minimization via graph cuts

Boykov, Y., Veksler, O., and Zabih, R. Fast approximate energy minimization via graph cuts. IEEE Transactions on pattern analysis and machine intelligence 23 , 11 (2001), 1222–1239

work page 2001
[22]

Y., and Jolly, M.-P

Boykov, Y. Y., and Jolly, M.-P. Interactive graph cuts for optimal boundary & region segmentation of objects in nd images. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Con- ference on (2001), vol. 1, IEEE, pp. 105–112

work page 2001
[23]

J., Fauqueur, J., and Cipolla, R

Brostow, G. J., Fauqueur, J., and Cipolla, R. Semantic object classes in video: A high-deﬁnition ground truth database. Pattern Recog- nition Letters 30 , 2 (2009), 88–97

work page 2009
[24]

D., and Ray, L

Cahill, N. D., and Ray, L. A. Method and system for compositing images to produce a cropped image, Jan. 9 2007. US Patent 7,162,102

work page 2007
[25]

L., Magrath, E., Gherman, A., Button, J., Nguyen, J., Bazin, P.-L., Calabresi, P

Carass, A., Roy, S., Jog, A., Cuzzocreo, J. L., Magrath, E., Gherman, A., Button, J., Nguyen, J., Bazin, P.-L., Calabresi, P. A., et al. Longitudinal multiple sclerosis lesion segmentation data resource. Data in brief 12 (2017), 346–350

work page 2017
[26]

In Proceedings of the IEEE Confer- ence on Computer Vision and Pattern Recognition (2017), pp

Castrejon, L., Kundu, K., Urtasun, R., and Fidler, S.Annotating object instances with a polygon-rnn. In Proceedings of the IEEE Confer- ence on Computer Vision and Pattern Recognition (2017), pp. 5230–5238

work page 2017
[27]

Exploiting the self-organizing map for medical image segmentation

Chang, P.-L., and Teng, W.-G. Exploiting the self-organizing map for medical image segmentation. In Computer-Based Medical Systems, 2007. CBMS’07. Twentieth IEEE International Symposium on (2007), IEEE, pp. 281–288

work page 2007
[28]

Chen, J., Yang, L., Zhang, Y., Alber, M., and Chen, D. Z. Com- bining fully convolutional and recurrent neural networks for 3d biomedical image segmentation. In Advances in Neural Information Processing Sys- tems (2016), pp. 3036–3044

work page 2016
[29]

Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A. L. Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv preprint arXiv:1412.7062 (2014)

work page internal anchor Pith review Pith/arXiv arXiv 2014
[30]

Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A. L. Deeplab: Semantic image segmentation with deep convo- lutional nets, atrous convolution, and fully connected crfs. IEEE transac- tions on pattern analysis and machine intelligence 40 , 4 (2018), 834–848

work page 2018
[31]

Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A. L. Deeplab: Semantic image segmentation with deep convo- lutional nets, atrous convolution, and fully connected crfs. IEEE transac- tions on pattern analysis and machine intelligence 40 , 4 (2018), 834–848. 41

work page 2018
[32]

Rethinking Atrous Convolution for Semantic Image Segmentation

Chen, L.-C., Papandreou, G., Schroff, F., and Adam, H.Rethink- ing atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[33]

Chen, L.-C., Yang, Y., Wang, J., Xu, W., and Yuille, A. L. At- tention to scale: Scale-aware semantic image segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition(2016), pp. 3640–3649

work page 2016
[34]

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. arXiv preprint arXiv:1802.02611 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018
[35]

The application of com- petitive hopﬁeld neural network to medical image segmentation

Cheng, K.-S., Lin, J.-S., and Mao, C.-W. The application of com- petitive hopﬁeld neural network to medical image segmentation. IEEE transactions on medical imaging 15 , 4 (1996), 560–567

work page 1996
[36]

J., Huang, X., and Hu, S.-M

Cheng, M.-M., Mitra, N. J., Huang, X., and Hu, S.-M. Salientshape: Group saliency in image collections. The Visual Computer 30, 4 (2014), 443–453

work page 2014
[37]

J., Huang, X., Torr, P

Cheng, M.-M., Mitra, N. J., Huang, X., Torr, P. H., and Hu, S.-M. Global contrast based salient region detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 37 , 3 (2015), 569–582

work page 2015
[38]

A multi-cue information based approach to contour detection by utilizing superpixel segmentation

Choudhuri, S., Das, N., Ghosh, S., and Nasipuri, M. A multi-cue information based approach to contour detection by utilizing superpixel segmentation. In Advances in Computing, Communications and Informat- ics (ICACCI), 2016 International Conference on (2016), IEEE, pp. 1057– 1063

work page 2016
[39]

Fuzzy c-means clustering with spatial information for image segmentation

Chuang, K.-S., Tzeng, H.-L., Chen, S., Wu, J., and Chen, T.-J. Fuzzy c-means clustering with spatial information for image segmentation. Computerized MedicalImaging and Graphics 30 , 1 (2006), 9–15

work page 2006
[40]

Robust analysis of feature spaces: color image segmentation

Comaniciu, D., and Meer, P. Robust analysis of feature spaces: color image segmentation. In Computer Vision and Pattern Recognition, 1997. Proceedings., 1997 IEEE Computer Society Conference on (1997), IEEE, pp. 750–755

work page 1997
[41]

The cityscapes dataset for semantic urban scene understanding

Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition(2016), pp. 3213–3223

work page 2016
[42]

Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation

Dai, J., He, K., and Sun, J. Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. In Proceed- ings of the IEEE International Conference on Computer Vision (2015), pp. 1635–1643. 42

work page 2015
[43]

Instance-aware semantic segmentation via multi-task network cascades

Dai, J., He, K., and Sun, J. Instance-aware semantic segmentation via multi-task network cascades. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 3150–3158

work page 2016
[44]

R-fcn: Object detection via region- based fully convolutional networks

Dai, J., Li, Y., He, K., and Sun, J. R-fcn: Object detection via region- based fully convolutional networks. In Advances in neural information processing systems (2016), pp. 379–387

work page 2016
[45]

Combining Multi-level Contexts of Superpixel using Convolutional Neural Networks to perform Natural Scene Labeling

Das, A., Ghosh, S., Sarkhel, R., Choudhuri, S., Das, N., and Nasipuri, M. Combining multi-level contexts of superpixel using convo- lutional neural networks to perform natural scene labeling. arXiv preprint arXiv:1803.05200 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018
[46]

Combining multilevel contexts of superpixel using con- volutional neural networks to perform natural scene labeling

Das, A., Ghosh, S., Sarkhel, R., Choudhuri, S., Das, N., and Nasipuri, M. Combining multilevel contexts of superpixel using con- volutional neural networks to perform natural scene labeling. In Recent Developments in Machine Learning and Data Analytics . Springer, 2019, pp. 297–306

work page 2019
[47]

P., Esquef, I

De Albuquerque, M. P., Esquef, I. A., and Mello, A. G. Im- age thresholding using tsallis entropy. Pattern Recognition Letters 25 , 9 (2004), 1059–1065

work page 2004
[48]

A., and Niessen, W

de Bruijne, M., van Ginneken, B., Viergever, M. A., and Niessen, W. J. Interactive segmentation of abdominal aortic aneurysms in cta images. Medical Image Analysis 8 , 2 (2004), 127–138

work page 2004
[49]

DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images

Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D., and Raskar, R. Deepglobe 2018: A challenge to parse the earth through satellite images. arXiv preprint arXiv:1805.06561 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018
[50]

Imagenet: A large-scale hierarchical image database

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on (2009), IEEE, pp. 248–255

work page 2009
[51]

Video-based noncooperative iris image segmentation

Du, Y., Arslanturk, E., Zhou, Z., and Belcher, C. Video-based noncooperative iris image segmentation. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 41 , 1 (2011), 64–74

work page 2011
[52]

W., Xu, D., and Chua, T.-S

Duan, L., Tsang, I. W., Xu, D., and Chua, T.-S. Domain adap- tation from multiple sources via auxiliary classiﬁers. In Proceedings of the 26th Annual International Conference on Machine Learning (2009), ACM, pp. 289–296

work page 2009
[53]

A guide to convolution arithmetic for deep learning

Dumoulin, V., and Visin, F. A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285 (2016). 43

work page internal anchor Pith review Pith/arXiv arXiv 2016
[54]

K., Winn, J., and Zisserman, A

Everingham, M., Van Gool, L., Williams, C. K., Winn, J., and Zisserman, A. The pascal visual object classes (voc) challenge. Interna- tional journal of computer vision 88 , 2 (2010), 303–338

work page 2010
[55]

Learning hierarchical features for scene labeling

Farabet, C., Couprie, C., Najman, L., and LeCun, Y. Learning hierarchical features for scene labeling. IEEE transactions on pattern analysis and machine intelligence 35 , 8 (2013), 1915–1929

work page 2013
[56]

F., and Huttenlocher, D

Felzenszwalb, P. F., and Huttenlocher, D. P. Eﬃcient graph- based image segmentation. International journal of computer vision 59 , 2 (2004), 167–181

work page 2004
[57]

S., and Sensing, R

for Photogrammetry, I. S., and Sensing, R. Isprs 2d semantic labeling contest

work page
[58]

M., Remagnino, P., Hoppe, A., Uyyanonvara, B., Rud- nicka, A

Fraz, M. M., Remagnino, P., Hoppe, A., Uyyanonvara, B., Rud- nicka, A. R., Owen, C. G., and Barman, S. A. Blood vessel seg- mentation methodologies in retinal images–a survey. Computer methods and programs in biomedicine 108 , 1 (2012), 407–433

work page 2012
[59]

Yet another survey on image segmentation: Region and boundary information integration

Freixenet, J., Mu˜noz, X., Raba, D., Mart´ı, J., and Cuf´ı, X. Yet another survey on image segmentation: Region and boundary information integration. In European conference on computer vision (2002), Springer, pp. 408–422

work page 2002
[60]

Image segmentation in video sequences: A probabilistic approach

Friedman, N., and Russell, S. Image segmentation in video sequences: A probabilistic approach. In Proceedings of the Thirteenth conference on Uncertainty in artiﬁcial intelligence (1997), Morgan Kaufmann Publishers Inc., pp. 175–181

work page 1997
[61]

A survey on image segmentation

Fu, K.-S., and Mui, J. A survey on image segmentation. Pattern recognition 13, 1 (1981), 3–16

work page 1981
[62]

Neocognitron: A hierarchical neural network capable of visual pattern recognition

Fukushima, K. Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural networks 1 , 2 (1988), 119–130

work page 1988
[63] Fukushima, K., and Miyake, S. Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and Cooperation in Neural Nets. Springer, 1982, pp. 267–285.
[64] Galasso, F., Shankar Nagaraja, N., Jimenez Cardenas, T., Brox, T., and Schiele, B. A unified video segmentation benchmark: Annotation, metrics and analysis. In Proceedings of the IEEE International Conference on Computer Vision (2013), pp. 3527–3534.
[65] Gangwar, A., and Joshi, A. DeepIrisNet: Deep iris representation with applications in iris recognition and cross-sensor iris recognition. In Image Processing (ICIP), 2016 IEEE International Conference on (2016), IEEE, pp. 2301–2305.
[66] Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. A review on deep learning techniques applied to semantic segmentation. arXiv preprint arXiv:1704.06857 (2017).
[67] Geiger, A., Lenz, P., and Urtasun, R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on (2012), IEEE, pp. 3354–3361.
[68] Geng, Q., Zhou, Z., and Cao, X. Survey of recent progress in semantic image segmentation with CNNs. Science China Information Sciences 61, 5 (2018), 051101.
[69] Girshick, R. Fast R-CNN. arXiv preprint arXiv:1504.08083 (2015).
[70] Girshick, R., Donahue, J., Darrell, T., and Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2014), pp. 580–587.
[71] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems (2014), pp. 2672–2680.
[72] Gould, S., Fulton, R., and Koller, D. Decomposing a scene into geometric and semantically consistent regions. In Computer Vision, 2009 IEEE 12th International Conference on (2009), IEEE, pp. 1–8.
[73] Grau, V., Mewes, A., Alcaniz, M., Kikinis, R., and Warfield, S. K. Improved watershed transform for medical image segmentation using prior information. IEEE Transactions on Medical Imaging 23, 4 (2004), 447–458.
[74] Han, X. Automatic liver lesion segmentation using a deep convolutional neural network method. arXiv preprint arXiv:1704.07239 (2017).
[75] Hariharan, B., Arbelaez, P., Bourdev, L., Maji, S., and Malik, J. Semantic contours from inverse detectors. In International Conference on Computer Vision (ICCV) (2011).
[76] He, K., Gkioxari, G., Dollár, P., and Girshick, R. Mask R-CNN. In Computer Vision (ICCV), 2017 IEEE International Conference on (2017), IEEE, pp. 2980–2988.
[77] He, K., Zhang, X., Ren, S., and Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 9 (2015), 1904–1916.
[78] He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.
[79] Hinton, G. E., Osindero, S., and Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Computation 18, 7 (2006), 1527–1554.
[80] Hochbaum, D. S. An efficient algorithm for image segmentation, Markov random fields and related problems. Journal of the ACM (JACM) 48, 4 (2001), 686–701.

Showing first 80 references.
