Deep neural networks produce state-of-the-art results when trained on a large number of labeled examples but tend to overfit when only a small number of labeled examples is available for training. Creating a large number of labeled examples requires considerable resources, time, and effort. When labeling new data is not feasible, so-called semi-supervised learning can achieve better generalisation than purely supervised learning by employing unlabeled instances alongside labeled ones. The work presented in this paper is motivated by the observation that transfer learning provides the opportunity to potentially further improve performance by exploiting models pretrained on a similar domain. More specifically, we explore the use of transfer learning when performing semi-supervised learning using self-learning. The main contribution is an empirical evaluation of transfer learning using different combinations of similarity metric learning methods and label propagation algorithms in semi-supervised learning. We find that transfer learning always substantially improves the model's accuracy when few labeled examples are available, regardless of the type of loss used for training the neural network. This finding is obtained by performing extensive experiments on the SVHN, CIFAR10, and Plant Village image classification datasets, using ImageNet-pretrained weights for transfer learning.
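The abstract describes combining ImageNet-pretrained weights (transfer learning) with self-learning, i.e. pseudo-labeling of unlabeled instances. The sketch below is only a rough illustration of that general setup, assuming PyTorch and torchvision; the model choice, confidence threshold, and all variable names are hypothetical and not taken from the paper's actual configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

NUM_CLASSES = 10          # e.g. SVHN or CIFAR10 (illustrative)
CONF_THRESHOLD = 0.95     # keep only confident pseudo-labels (illustrative)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Transfer learning: start from ImageNet-pretrained weights and
# replace the final classification layer for the target dataset.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, targets):
    """One supervised update on labeled or pseudo-labeled data."""
    model.train()
    images, targets = images.to(device), targets.to(device)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def pseudo_label(images):
    """Self-learning step: predict labels for unlabeled images and
    keep only those predictions above the confidence threshold."""
    model.eval()
    probs = F.softmax(model(images.to(device)), dim=1)
    conf, preds = probs.max(dim=1)
    mask = conf >= CONF_THRESHOLD
    return images[mask.cpu()], preds[mask].cpu()

In a typical self-learning loop, the model would first be fine-tuned on the few labeled examples via train_step, then pseudo_label would be applied to the unlabeled pool, and the confidently pseudo-labeled instances would be added to the training set for further rounds of training.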