
arxiv: 1609.08764 · v2 · pith:ZCIU2UL5new · submitted 2016-09-28 · 💻 cs.CV

Understanding data augmentation for classification: when to warp?

classification 💻 cs.CV
keywords: data, augmentation, samples, additional, convolutional, machine, benefit, classifier

In this paper we investigate the benefit of augmenting data with synthetically created samples when training a machine learning classifier. Two approaches for creating additional training samples are data warping, which generates additional samples through transformations applied in the data-space, and synthetic over-sampling, which creates additional samples in feature-space. We experimentally evaluate the benefits of data augmentation for a convolutional backpropagation-trained neural network, a convolutional support vector machine and a convolutional extreme learning machine classifier, using the standard MNIST handwritten digit dataset. We found that while it is possible to perform generic augmentation in feature-space, if plausible transforms for the data are known then augmentation in data-space provides a greater benefit for improving performance and reducing overfitting.
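The contrast between the two approaches can be illustrated with a minimal sketch. The abstract does not say which transforms or over-sampling scheme were used, so everything here is an assumption made for illustration: a plain pixel translation stands in for data warping, and a SMOTE-style interpolation between nearest neighbours stands in for synthetic over-sampling in feature-space. The function names `warp_translate` and `oversample_feature_space` are hypothetical.

```python
import numpy as np


def warp_translate(image, dx, dy):
    """Data-space augmentation (illustrative stand-in): shift a 2-D image
    by dx columns and dy rows, filling the exposed border with zeros.
    Assumes |dx| < width and |dy| < height."""
    h, w = image.shape
    out = np.zeros_like(image)
    # Matching source/destination windows that stay inside the image.
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def oversample_feature_space(features, n_new, k=3, seed=None):
    """Feature-space augmentation (SMOTE-style stand-in): each synthetic
    sample lies on the segment between a real sample and one of its k
    nearest neighbours. `features` holds samples of ONE class, one row
    per sample, and must contain more than k rows."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Pairwise squared Euclidean distances; a sample is not its own neighbour.
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                    # real sample to start from
    pick = neighbours[base, rng.integers(0, k, size=n_new)]  # one of its neighbours
    gap = rng.random((n_new, 1))                             # interpolation factor in [0, 1)
    return features[base] + gap * (features[pick] - features[base])
```

The warp needs domain knowledge (a shifted digit is still the same digit), whereas the interpolation is generic and only needs a feature vector per sample, which mirrors the trade-off the abstract describes.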



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Assessing Post Deletion in Sina Weibo: Multi-modal Classification of Hot Topics

    cs.SI 2019-06 unverdicted novelty 5.0

    Multi-modal analysis of 994 Weibo posts and 18,966 images finds sentiment as the sole consistent predictor of censorship, with anti-government topics deleted more often and average deletion time of three hours.