Recently, self-supervised learning methods like MoCo, SimCLR, BYOL, and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, namely the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to this expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 accuracy with access to only 10% of ImageNet. Code: https://github.com/facebookresearch/vissl
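To make the low-shot claim concrete, the following is a minimal sketch (not the authors' code) of the kind of evaluation described above: fine-tuning a pretrained RegNetY on a 10% subset of ImageNet. The backbone (torchvision's RegNetY-16GF rather than the 1.3B-parameter SEER model), the dataset path, the subsampling scheme, and all hyperparameters are illustrative assumptions; the paper's actual pretrained weights and training recipes are distributed through the VISSL repository linked above.

```python
# Hedged sketch of low-shot fine-tuning, under the assumptions stated above.
import torch
import torchvision
from torch.utils.data import DataLoader, Subset

# Stand-in backbone: torchvision's RegNetY-16GF with ImageNet weights.
# (SEER itself is a much larger RegNetY pretrained with VISSL.)
model = torchvision.models.regnet_y_16gf(weights="IMAGENET1K_V1")

transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomResizedCrop(224),
    torchvision.transforms.ToTensor(),
])

# Hypothetical local ImageNet copy; keeping every 10th image mimics a
# 10% split (the paper's exact split may differ).
full = torchvision.datasets.ImageFolder("/path/to/imagenet/train", transform)
subset = Subset(full, range(0, len(full), 10))
loader = DataLoader(subset, batch_size=256, shuffle=True, num_workers=8)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

# One pass of standard supervised fine-tuning on the 10% subset.
model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```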