Improving CNN Accuracy by Training on Auxiliary Data Source
Abstract
The standard model of supervised learning assumes that training and test data are drawn from the same underlying distribution. This thesis explores a setting in which a second (auxiliary) source of data is available and is drawn from a different distribution. This auxiliary data may be plentiful but of significantly lower quality than the training and test data. In the CNN framework, the Softmax function gives the class probabilities that are then used for classification. This thesis considers using the auxiliary data in either of these roles. The auxiliary data framework is applied to the problem of classifying images from the MNIST handwritten digit dataset and from a Handwritten Gurmukhi Script dataset. Experiments show that even when the training dataset is small, training with auxiliary data can produce improvements in accuracy. When the dataset is large, the improvements in accuracy are even greater.
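As a point of reference for the Softmax step mentioned in the abstract, the following is a minimal sketch of how raw class scores can be turned into probabilities and then into a predicted class. It is not taken from the thesis: the function names `softmax` and `predict` and the example scores are assumptions chosen for illustration, and a real CNN would compute the scores from its final layer.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the maximum score first for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical scores for a 3-class problem (illustrative values only).
example_logits = [2.0, 1.0, 0.1]
example_probs = softmax(example_logits)
example_class = predict(example_logits)
```

The class with the largest score also receives the largest probability, so the predicted label is the same whether it is read from the scores or from the Softmax output; the probabilities matter mainly during training, where they enter the loss.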
