arxiv: 1807.09200 · v1 · pith:JO5AOGGGnew · submitted 2018-07-24 · cs.LG · cs.CV · stat.ML

Self-Paced Learning with Adaptive Deep Visual Embeddings

classification: cs.LG, cs.CV, stat.ML
keywords: deep, learning, examples, self-paced, training, visual, adaptive, data

Selecting the most appropriate data examples to present to a deep neural network (DNN) at different stages of training is an unsolved challenge. Though practitioners typically ignore this problem, a non-trivial data scheduling method may result in a significant improvement in both convergence and generalization performance. In this paper, we introduce Self-Paced Learning with Adaptive Deep Visual Embeddings (SPL-ADVisE), a novel end-to-end training protocol that unites self-paced learning (SPL) and deep metric learning (DML). We leverage the Magnet Loss to train an embedding convolutional neural network (CNN) to learn a salient representation space. The student CNN classifier dynamically selects similar instance-level training examples to form a mini-batch, where the easiness from the cross-entropy loss and the true diverseness of examples from the learned metric space serve as sample importance priors. To demonstrate the effectiveness of SPL-ADVisE, we use deep CNN architectures for the task of supervised image classification on several coarse- and fine-grained visual recognition datasets. Results show that, across all datasets, the proposed method converges faster and reaches a higher final accuracy than other SPL variants, particularly on fine-grained classes.
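
The abstract describes mini-batch selection driven by two priors: easiness (from the cross-entropy loss) and diverseness (from the learned embedding space). The sketch below is not the SPL-ADVisE algorithm, whose details are not given here; it is a generic greedy illustration of how such priors could be combined. The function name `select_minibatch`, the `diversity_weight` parameter, the min-max scaling of losses, and the nearest-selected-neighbor distance as a diversity measure are all assumptions made for this example.

```python
import numpy as np


def select_minibatch(losses, embeddings, batch_size, diversity_weight=1.0):
    """Greedily pick examples that are easy (low loss) and spread out in
    embedding space. Hypothetical illustration, not the paper's method."""
    losses = np.asarray(losses, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    n = losses.shape[0]
    if not 0 < batch_size <= n:
        raise ValueError("batch_size must be between 1 and the dataset size")

    # Easiness prior (assumed form): lower loss maps to a higher score in [0, 1].
    span = losses.max() - losses.min()
    easiness = 1.0 - (losses - losses.min()) / span if span > 0 else np.ones(n)

    # Start from the easiest example.
    selected = [int(np.argmax(easiness))]
    # Distance from each example to its nearest already-selected example.
    min_dist = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)

    while len(selected) < batch_size:
        # Diversity prior (assumed form): normalized distance to the batch so far.
        scale = min_dist.max()
        diversity = min_dist / scale if scale > 0 else np.zeros(n)
        score = easiness + diversity_weight * diversity
        score[selected] = -np.inf  # never pick the same example twice
        nxt = int(np.argmax(score))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected
```

In a training loop, `losses` would come from a forward pass of the classifier and `embeddings` from the separately trained embedding network; setting `diversity_weight=0.0` reduces the selection to picking the lowest-loss examples.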

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Submodular Batch Selection for Training Deep Neural Networks

    cs.LG · 2019-06 · unverdicted · novelty 5.0

    A greedy submodular maximization method for mini-batch selection in DNN training yields better generalization than SGD on standard datasets.