Most achievements in artificial intelligence to date have come from supervised learning, which requires large amounts of annotated training data and therefore substantial manual labeling effort. Unsupervised learning is one effective way to overcome this difficulty. In this work, we propose AugNet, a new deep learning training paradigm that learns image features from a collection of unlabeled pictures. We develop a method that expresses the similarity between pictures as a distance metric in the embedding space by leveraging the inter-correlation between augmented versions of the samples. Our experiments demonstrate that the method can represent images in a low-dimensional space and performs competitively on downstream tasks such as image classification and image similarity comparison. Specifically, with unsupervised clustering we achieve accuracies of over 60% and 27% on the STL10 and CIFAR100 datasets, respectively. Moreover, unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets to train the feature extractor, yet it shows comparable or even better feature representation ability and is easy to use. In our evaluations, the method outperforms all state-of-the-art image retrieval algorithms on some out-of-domain image datasets. The code for the model implementation is available at https://github.com/chenmingxiang110/AugNet.
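To make the idea of using augmented versions of a sample to shape distances in the embedding space more concrete, the following is a minimal sketch. It is not the AugNet loss itself, which the abstract does not specify: it assumes a generic margin-based objective in which embeddings of two augmented views of the same picture should lie closer together than embeddings of views from different pictures. The function names, the `margin` parameter, and the toy 2-D embeddings are all hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def augmentation_margin_loss(views, margin=1.0):
    """Hypothetical margin loss over augmented views (an assumption, not AugNet's loss).

    views[i][k] is the embedding of the k-th augmented view of picture i.
    For every anchor view, a view of the same picture is the positive and
    every view of any other picture is a negative. The loss is zero once
    each positive is closer than each negative by at least `margin`.
    """
    total, count = 0.0, 0
    for i, group in enumerate(views):
        for a, anchor in enumerate(group):
            for p, positive in enumerate(group):
                if a == p:
                    continue
                pos_dist = euclidean(anchor, positive)
                for j, other_group in enumerate(views):
                    if j == i:
                        continue
                    for negative in other_group:
                        neg_dist = euclidean(anchor, negative)
                        total += max(0.0, pos_dist - neg_dist + margin)
                        count += 1
    return total / count if count else 0.0


# Toy 2-D embeddings: two pictures, two augmented views each.
pic_a = [l2_normalize([1.0, 0.0]), l2_normalize([0.99, 0.1])]
pic_b = [l2_normalize([-1.0, 0.0]), l2_normalize([-0.99, -0.1])]

# Views of the same picture grouped together: distances already look right.
well_separated = [pic_a, pic_b]
# Views grouped with the wrong picture: the loss should penalize this.
mixed_up = [[pic_a[0], pic_b[0]], [pic_a[1], pic_b[1]]]

print(augmentation_margin_loss(well_separated))
print(augmentation_margin_loss(mixed_up))
```

In this sketch the embeddings are fixed by hand; in a trained model they would be produced by the feature extractor, and the loss would be minimized over its parameters. The resulting distances could then be used directly for clustering or nearest-neighbor retrieval, which is how the downstream tasks above consume the embedding.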