Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. However, to date the vast majority of these approaches have restricted themselves to training on standard benchmark datasets such as ImageNet. We argue that fine-grained visual categorization problems, such as plant and animal species classification, provide an informative testbed for self-supervised learning. In order to facilitate progress in this area we present two new natural world visual classification datasets, iNat2021 and NeWT. The former consists of 2.7M images from 10k different species uploaded by users of the citizen science application iNaturalist. We designed the latter, NeWT, in collaboration with domain experts with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification. These two new datasets allow us to explore questions related to large-scale representation and transfer learning in the context of fine-grained categories. We provide a comprehensive analysis of feature extractors trained with and without supervision on ImageNet and iNat2021, shedding light on the strengths and weaknesses of different learned features across a diverse set of tasks. We find that features produced by standard supervised methods still outperform those produced by self-supervised approaches such as SimCLR. However, improved self-supervised learning methods are constantly being released and the iNat2021 and NeWT datasets are a valuable resource for tracking their progress.