Unsupervised image representations have significantly reduced the gap with supervised pretraining, notably through recent contrastive learning methods. These contrastive methods typically work online and rely on a large number of explicit pairwise feature comparisons, which is computationally challenging. In this paper, we propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring pairwise comparisons to be computed. Specifically, our method simultaneously clusters the data while enforcing consistency between the cluster assignments produced for different augmentations (or views) of the same image, instead of comparing features directly as in contrastive learning. Simply put, we use a swapped prediction mechanism in which we predict the cluster assignment of one view from the representation of another view. Our method can be trained with large and small batches and can scale to unlimited amounts of data. Compared to previous contrastive methods, our method is more memory efficient since it does not require a large memory bank or a special momentum network. In addition, we propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, with only a small increase in memory and compute requirements. We validate our findings by achieving 75.3% top-1 accuracy on ImageNet with ResNet-50, as well as surpassing supervised pretraining on all the considered transfer tasks.
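To make the swapped prediction idea concrete, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes a set of learnable cluster prototypes, and it computes the cluster assignments with a sharpened softmax as a simple stand-in for whatever assignment procedure the full method uses. The function names, the temperatures, and the prototype matrix are all illustrative assumptions not given in the abstract.

```python
import numpy as np


def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def swapped_prediction_loss(z1, z2, prototypes,
                            temperature=0.1, code_temperature=0.05):
    """Illustrative swapped-prediction loss for two views of a batch.

    z1, z2:      (B, D) features of two augmentations of the same B images.
    prototypes:  (K, D) cluster prototypes (assumed learnable elsewhere).
    Temperatures are hypothetical values chosen for this sketch.
    """
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    c = l2_normalize(prototypes)

    # Features are compared with K prototypes, not with each other:
    # B x K scores per view instead of B x B pairwise comparisons.
    s1 = z1 @ c.T
    s2 = z2 @ c.T

    # Cluster assignments ("codes") for each view. Assumption: a sharpened
    # softmax stands in for the assignment step; treated as fixed targets.
    q1 = np.exp(log_softmax(s1 / code_temperature))
    q2 = np.exp(log_softmax(s2 / code_temperature))

    # Predicted cluster probabilities from each view's representation.
    log_p1 = log_softmax(s1 / temperature)
    log_p2 = log_softmax(s2 / temperature)

    # Swap: predict view 2's code from view 1, and view 1's code from view 2.
    per_sample = -(q2 * log_p1).sum(axis=1) - (q1 * log_p2).sum(axis=1)
    return 0.5 * per_sample.mean()


rng = np.random.default_rng(0)
view1 = rng.normal(size=(8, 16))
view2 = view1 + 0.01 * rng.normal(size=(8, 16))  # a slightly perturbed "view"
protos = rng.normal(size=(4, 16))
loss = swapped_prediction_loss(view1, view2, protos)
```

In this sketch the cost of the comparison step grows with the number of prototypes K rather than with the number of other samples, which is the property the abstract contrasts with pairwise contrastive objectives.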