Existing approaches to unsupervised object discovery (UOD) do not scale up to large datasets without approximations that compromise their performance. We propose a novel formulation of UOD as a ranking problem, amenable to the arsenal of distributed methods available for eigenvalue problems and link analysis. Through the use of self-supervised features, we also demonstrate the first effective fully unsupervised pipeline for UOD. Extensive experiments on COCO and OpenImages show that, in the single-object discovery setting, where a single prominent object is sought in each image, the proposed LOD (Large-scale Object Discovery) approach is on par with, or better than, the state of the art for medium-scale datasets (up to 120K images), and over 37% better than the only other algorithms capable of scaling up to 1.7M images. In the multi-object discovery setting, where multiple objects are sought in each image, the proposed LOD is over 14% better in average precision (AP) than all other methods for datasets ranging from 20K to 1.7M images. Using self-supervised features, we also show that the proposed method obtains state-of-the-art UOD performance on OpenImages. Our code is publicly available at https://github.com/huyvvo/LOD.
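To make the phrase "ranking problem, amenable to ... eigenvalue problems and link analysis" concrete, the following is a minimal sketch of a generic PageRank-style power iteration on a small weighted graph. It is not the LOD implementation: the abstract does not specify how the graph is built, so the function name `pagerank_scores`, the toy weight matrix, and the damping value are all assumptions chosen for illustration. Nodes can be thought of as candidate image regions and weights as pairwise similarity scores.

```python
import numpy as np


def pagerank_scores(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Rank graph nodes by power iteration on a damped, column-stochastic matrix.

    weights[i, j] is the (hypothetical) weight of the edge from node j to node i.
    """
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    col_sums = w.sum(axis=0)
    # Normalize each column to sum to 1; a column with no outgoing weight
    # is treated as linking uniformly to every node.
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    transition = np.where(col_sums > 0, w / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = damping * (transition @ scores) + (1.0 - damping) / n
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


# Toy similarity graph (assumed values): node 0 is strongly linked to all others.
toy_weights = np.array([
    [0.0, 3.0, 3.0, 3.0],
    [3.0, 0.0, 1.0, 0.0],
    [3.0, 1.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
])
toy_scores = pagerank_scores(toy_weights)
top_node = int(np.argmax(toy_scores))
```

The score vector converges to the dominant eigenvector of the damped transition matrix, which is why ranking of this kind can reuse distributed eigenvalue and link-analysis solvers when the graph is too large for a single machine.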