Understanding data augmentation for classification: when to warp?
In this paper we investigate the benefit of augmenting data with synthetically created samples when training a machine learning classifier. Two approaches for creating additional training samples are data warping, which generates additional samples through transformations applied in the data-space, and synthetic over-sampling, which creates additional samples in feature-space. We experimentally evaluate the benefits of data augmentation for a convolutional backpropagation-trained neural network, a convolutional support vector machine and a convolutional extreme learning machine classifier, using the standard MNIST handwritten digit dataset. We found that while it is possible to perform generic augmentation in feature-space, if plausible transforms for the data are known then augmentation in data-space provides a greater benefit for improving performance and reducing overfitting.
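The two approaches named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `warp_shift` and `oversample_feature_space` are hypothetical, the warp is a plain pixel translation chosen for brevity, and the over-sampling interpolates between random same-class pairs rather than nearest neighbours, as a SMOTE-style method normally would.

```python
import random


def warp_shift(image, dx, dy, fill=0.0):
    """Data-space augmentation (illustrative): translate a 2-D image.

    `image` is a list of equal-length rows. Pixels shifted in from
    outside the frame are set to `fill`.
    """
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source pixel for this output pixel
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out


def oversample_feature_space(features, n_new, rng):
    """Feature-space augmentation (illustrative): interpolate between
    two randomly chosen feature vectors of the same class.

    Assumption: random pairs stand in for the nearest-neighbour
    selection that a SMOTE-style method would use.
    """
    new_samples = []
    for _ in range(n_new):
        a, b = rng.sample(features, 2)
        t = rng.random()  # position along the segment from a to b
        new_samples.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return new_samples
```

The warp needs a transform that is known to preserve the label (a small shift of a digit is still the same digit), whereas the interpolation needs only feature vectors and class membership. That is the trade-off the abstract describes: feature-space augmentation is generic, while data-space augmentation depends on knowing plausible transforms.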
Forward citations
Cited by 1 Pith paper
- Assessing Post Deletion in Sina Weibo: Multi-modal Classification of Hot Topics
Multi-modal analysis of 994 Weibo posts and 18,966 images finds sentiment to be the sole consistent predictor of censorship, with anti-government topics deleted more often and an average deletion time of three hours.