pith. machine review for the scientific record. sign in

arxiv: 1112.6209 · v5 · submitted 2011-12-29 · 💻 cs.LG

Recognition: unknown

Building high-level features using large scale unsupervised learning

Authors on Pith no claims yet
classification 💻 cs.LG
keywords imagesdetectorfacehigh-levelnetworkonlytrainbuilding
0
0 comments X
read the original abstract

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Intriguing properties of neural networks

    cs.CV 2013-12 accept novelty 8.0

    Deep neural networks exhibit distributed high-level semantic representations and discontinuous input-output mappings vulnerable to transferable adversarial perturbations.