arXiv: 1907.06119 · v1 · pith:RGAAAQVNnew · submitted 2019-07-13 · 💻 cs.CV · cs.LG · cs.NE

Understanding Deep Learning Techniques for Image Segmentation

Pith reviewed 2026-05-24 21:44 UTC · model grok-4.3

classification 💻 cs.CV · cs.LG · cs.NE
keywords: deep learning, image segmentation, convolutional neural networks, review, adversarial networks, autoencoders, object segmentation

The pith

Logical grouping of deep learning segmentation algorithms by their unique features gives readers a clearer view of how each works.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to supply an intuitive understanding of the main deep learning techniques that have shaped image segmentation. It starts with traditional segmentation methods, traces the impact of deep learning, and then places the major algorithms into logical categories while explaining what each category adds. A reader who wants to make sense of the many available networks would value the step-by-step descriptions of how these networks operate internally. The review covers convolutional, recurrent, adversarial, and autoencoder approaches applied to segmentation tasks.

Core claim

The paper claims that by moving from traditional image segmentation methods through the influence of deep learning and then logically categorizing the major algorithms with focused paragraphs on their distinctive contributions, readers gain an improved ability to visualize the internal dynamics of these techniques.

What carries the argument

Logical categorization of segmentation algorithms by their unique contributions, presented after a progression from traditional methods to deep learning architectures.

If this is right

  • The shift from traditional segmentation to deep learning approaches becomes easier to follow through the described progression.
  • Readers can visualize the internal steps of networks such as convolutional and adversarial models in segmentation contexts.
  • The variety of deep learning techniques applied to detection, localization, and segmentation tasks is presented in grouped form.
  • An analytical view of the field reduces the sense of being overwhelmed by the number of available methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The grouping could be used as a baseline when new architectures appear and need placement in similar categories.
  • Linking the explanations to concrete datasets or benchmarks might show which category performs best under different image conditions.
  • The intuitive style could support introductory teaching materials that introduce segmentation without requiring prior network expertise.

Load-bearing premise

The chosen techniques and papers form a representative sample of the field and the explanations stay accurate without selection bias or outdated framing.

What would settle it

A reader new to the topic who cannot correctly describe the internal steps of a reviewed algorithm after reading the categorized sections, or who finds major current techniques omitted, would indicate the provided understanding falls short.

Figures

Figures reproduced from arXiv: 1907.06119 by Ishita Das, Nibaran Das, Swarnendu Ghosh, Ujjwal Maulik.

Figure 1: Semantic Image Segmentation (Samples from the Mapillary Vistas …
Figure 2: Legends for subsequent diagrams of popular deep learning architec…
Figure 3: Input image and sample activation maps from a typical CNN. (Top …
Figure 4: A Fully convolutional network with image segmentation with concate…
Figure 5: The Deepmask Network
… shared feature representation. One of them created a pixel-level classification, or probabilistic mask, for the central object, and the second branch generated a score corresponding to the object recognition accuracy. The network was coupled with sliding windows with a stride of sixteen to create segments of objects at various locations in the image, whereas the score helped in identifying whi…
Figure 6: The Sharpmask Network
… using convolutional refinements at every step to generate high-resolution masks (refer to Fig. 6). Sharpmask scored an average recall of 39.3 on the MS COCO Segmentation Dataset, beating Deepmask, which scored 36.6.
4.1.2 Region proposal networks
Another related branch that developed alongside image segmentation was object localization. Tasks such as this involved locating spec…
Figure 7: The RCNN Family of localization and segmentation networks
Figure 8: Normal convolution (red) vs. Atrous or Dilated convolution (green)
Figure 9: DeepLab Architecture as compared to a standard VGG net (top) along …
Figure 10: A schematic representation of the PSPNet
Figure 11: A schematic representation of the RefineNet
Figure 12: (Left) Normal Convolution with unit stride. (Right) Transposed …
Figure 13: Architecture of U-Net
4.2.2 Forwarding pooling indices
Max-pooling has been the most commonly used technique for reducing the size of the activation maps, for various reasons. The activations represent the response of a region of an image to a specific kernel. In max-pooling, a region of pixels is compressed to a single value by considering only the maximum response obtained within that region. If a typ…
Figure 14: Forwarding pooling indices to maintain spatial relationship during …
Figure 15: Architecture of SegNet
… two networks: a generative network and a discriminator network. The generator G tries to generate images like the ones from the training dataset using a noisy input drawn from a prior distribution p_z(z). The network G(z; θ_g) is a differentiable function represented by a neural network with weights θ_g. A discriminator network tries to correctly guess whether an input is fr…
Figure 16: Adversarial learning model for image segmentation
Figure 17: Sequential Models: (top left) Generic Representation for …
Figure 18: Generic representation of autoencoder with fully connected linear …
Figure 19: A typical convolutional neural network
Generative Models: Generative models are probably one of the latest attractions of deep learning in computer vision. While sequential models like long short-term memory or gated recurrent units are able to generate sequences of vectorized elements, generation in computer vision is much more difficult due to the spatial complexities. Lately, various methodologies like variat…
Figure 20: A block diagram of generative adversarial network
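
The max-pooling step described alongside Figure 13, and the idea of forwarding pooling indices named in the Figure 14 caption, can be illustrated with a small sketch. This is a minimal NumPy example under stated assumptions: a single 2-D activation map, non-overlapping k × k windows, and zeros written to every non-maximum position on the way back up. The function names `max_pool_with_indices` and `max_unpool` are hypothetical and are not taken from the paper or from any SegNet implementation.

```python
import numpy as np


def max_pool_with_indices(x, k=2):
    """Non-overlapping k x k max pooling on a 2-D map.

    Returns the pooled map and, for each window, the flat index (into x)
    of the position that held the maximum.
    """
    h, w = x.shape
    assert h % k == 0 and w % k == 0, "sketch assumes dimensions divisible by k"
    # Gather each k x k window into the last axis: shape (h/k, w/k, k*k).
    windows = (
        x.reshape(h // k, k, w // k, k)
        .transpose(0, 2, 1, 3)
        .reshape(h // k, w // k, k * k)
    )
    local = windows.argmax(axis=-1)   # position of the max inside each window
    pooled = windows.max(axis=-1)     # the max response of each window
    # Convert window-local positions to row/column coordinates in x.
    rows = np.arange(h // k)[:, None] * k + local // k
    cols = np.arange(w // k)[None, :] * k + local % k
    return pooled, rows * w + cols


def max_unpool(pooled, indices, shape):
    """Place each pooled value back at its remembered position; zeros elsewhere."""
    out = np.zeros(shape, dtype=pooled.dtype)
    out.flat[indices.ravel()] = pooled.ravel()
    return out


x = np.array([[1, 2, 0, 1],
              [3, 4, 1, 0],
              [0, 0, 5, 6],
              [1, 0, 7, 8]])
pooled, idx = max_pool_with_indices(x)
restored = max_unpool(pooled, idx, x.shape)
```

Here `pooled` keeps only the strongest response per window, while `idx` records where that response was located. Passing `idx` to the upsampling step puts each value back at its original position rather than at an arbitrary place within the window, which is how spatial relationships are maintained in this sketch.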
Original abstract

The machine learning community has been overwhelmed by a plethora of deep learning based approaches. Many challenging computer vision tasks such as detection, localization, recognition and segmentation of objects in unconstrained environments are being efficiently addressed by various types of deep neural networks like convolutional neural networks, recurrent networks, adversarial networks, autoencoders and so on. While there have been plenty of analytical studies regarding the object detection or recognition domain, many new deep learning techniques have surfaced with respect to image segmentation techniques. This paper approaches these various deep learning techniques of image segmentation from an analytical perspective. The main goal of this work is to provide an intuitive understanding of the major techniques that have made significant contributions to the image segmentation domain. Starting from some of the traditional image segmentation approaches, the paper progresses to describing the effect deep learning had on the image segmentation domain. Thereafter, most of the major segmentation algorithms have been logically categorized with paragraphs dedicated to their unique contribution. With an ample amount of intuitive explanations, the reader is expected to have an improved ability to visualize the internal dynamics of these processes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a survey paper that reviews traditional image segmentation methods before discussing the impact of deep learning on the domain. It logically categorizes major deep learning-based segmentation algorithms (including convolutional, recurrent, adversarial, and autoencoder-based networks) and provides intuitive explanations of their unique contributions, with the goal of helping readers visualize internal dynamics.

Significance. As an expository survey without original derivations, empirical results, or novel claims, the paper's value lies in synthesis and accessibility. If the categorizations accurately reflect the cited literature and the explanations are balanced, it could serve as a useful entry point for researchers entering the image segmentation field circa 2019. No machine-checked proofs, reproducible code, or falsifiable predictions are present.

minor comments (2)
  1. [Abstract] The claim that 'many new deep learning techniques have surfaced with respect to image segmentation techniques' would benefit from a brief statement of the paper's temporal scope (e.g., coverage up to mid-2019) to set reader expectations for completeness.
  2. The manuscript should include a table or structured list summarizing the categorized algorithms, their key architectural differences, and representative citations to improve scannability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the constructive summary and recommendation of minor revision. The assessment correctly identifies the manuscript as an expository survey focused on synthesis and intuitive explanations rather than novel claims or experiments. No specific major comments were raised in the report, so our response addresses the overall evaluation.

Circularity Check

0 steps flagged

No significant circularity; expository survey with no derivations

full rationale

The paper is a survey that categorizes and intuitively explains existing deep learning techniques for image segmentation, starting from traditional methods and progressing to DL approaches without presenting any original derivations, equations, predictions, or fitted parameters. No self-citations form load-bearing premises, no uniqueness theorems are invoked, and no results reduce to inputs by construction. The central contribution is descriptive categorization, which is self-contained against external benchmarks and contains no internal derivation chain to inspect for circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper. No new mathematical claims, fitted parameters, axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5719 in / 1021 out tokens · 15899 ms · 2026-05-24T21:44:47.761835+00:00 · methodology
