Self-supervised Learning (SSL) has recently gained much attention due to the high labeling cost and data requirements of training supervised models. The current paradigm in SSL is to apply data augmentation in the input space to create different views of the same images and to train a model to maximize the similarity between representations of views of the same image while minimizing it for views of different images. While this approach achieves state-of-the-art (SOTA) results on various downstream tasks, it largely overlooks augmentation in the latent space. This paper proposes TriMix, a novel concept for SSL that generates virtual embeddings through linear interpolation of the data, thus providing the model with novel representations. Our strategy trains the model to recover the original embeddings from the virtual ones, leading to better representation learning. Additionally, we propose a self-consistency term that improves the agreement between the virtual and actual embeddings. We validate TriMix on eight benchmark datasets of natural and medical images, improving on the second-best models by 2.71% and 0.41% for the two data types, respectively. Furthermore, our approach outperforms current methods in semi-supervised learning, particularly in low-data regimes, and our pre-trained models transfer better to other datasets.
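To make the core idea concrete, the following is a minimal PyTorch sketch of one training step as the abstract describes it: virtual embeddings are built by linear interpolation in the latent space, a predictor is trained to recover the original embeddings from the virtual one, and a self-consistency term aligns the virtual embedding with the same interpolation computed from a second view. The names (`encoder`, `predictor`), the Beta-distributed mixing weight, and the cosine-based losses are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def trimix_step(encoder, predictor, x1, x2, alpha=1.0):
    """Hypothetical sketch of one TriMix-style training step.

    x1, x2: two augmented views of the same image batch.
    encoder: backbone producing (B, D) embeddings (assumed).
    predictor: head recovering original embeddings from a virtual one (assumed).
    """
    z1, z2 = encoder(x1), encoder(x2)  # actual embeddings, shape (B, D)

    # Virtual embeddings: linearly interpolate each embedding with a
    # shuffled copy of the batch, mixup-style, in the latent space.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(z1.size(0), device=z1.device)
    z_virtual = lam * z1 + (1.0 - lam) * z1[perm]

    # Recovery loss: train the model to extract the original embeddings
    # from the virtual one, weighting each source by its mixing coefficient.
    z_hat = predictor(z_virtual)
    recovery = (
        lam * (1 - F.cosine_similarity(z_hat, z1.detach()).mean())
        + (1 - lam) * (1 - F.cosine_similarity(z_hat, z1[perm].detach()).mean())
    )

    # Self-consistency term (assumed form): the virtual embedding built
    # from view 1 should match the same interpolation of view 2's embeddings.
    target = lam * z2 + (1.0 - lam) * z2[perm]
    consistency = 1 - F.cosine_similarity(z_virtual, target.detach()).mean()

    return recovery + consistency
```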