This project studies self-supervised training of convolutional neural networks (CNNs) and transformer networks for image recognition. A simple siamese network, paired with different backbones, is trained to maximize the similarity between two augmented views of the same source image. In this way, the backbone learns visual representations without labeled supervision. Finally, the method is evaluated on three image recognition datasets.
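The siamese objective described above can be illustrated with a minimal sketch. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and `toy_backbone` and `toy_augment` are hypothetical stand-ins for a real CNN or transformer backbone and a real augmentation pipeline. The sketch only computes the loss for one image. It does not perform any training, and it omits whatever mechanism a real setup would need to avoid the trivial solution in which all embeddings are identical.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def toy_backbone(image, weights):
    """Hypothetical stand-in for a backbone: a single linear layer."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def toy_augment(image, rng, noise=0.1):
    """Hypothetical augmentation: add small Gaussian noise to each pixel."""
    return [x + rng.gauss(0.0, noise) for x in image]


def similarity_loss(image, weights, rng):
    """Negative cosine similarity between embeddings of two augmented views.

    Minimizing this value maximizes the similarity of the two views.
    """
    view_a = toy_augment(image, rng)
    view_b = toy_augment(image, rng)
    # Siamese setup: both views pass through the same weights.
    z_a = toy_backbone(view_a, weights)
    z_b = toy_backbone(view_b, weights)
    return -cosine_similarity(z_a, z_b)


rng = random.Random(0)
image = [rng.random() for _ in range(16)]                      # flattened toy "image"
weights = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
loss = similarity_loss(image, weights, rng)
print(f"loss = {loss:.4f}")
```

Because the loss is a negated cosine similarity, it lies between -1 and 1, and a value near -1 means the two views map to nearly parallel embeddings.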