We introduce MOVE, a novel method for segmenting objects without any form of supervision. MOVE exploits the fact that foreground objects can be shifted locally relative to their initial position while still yielding realistic (undistorted) images. This property allows us to train a segmentation model on a dataset of unannotated images and to achieve state-of-the-art (SotA) performance on several evaluation datasets for unsupervised salient object detection and segmentation. In unsupervised single object discovery, MOVE yields an average CorLoc improvement of 7.2% over the SotA, and in unsupervised class-agnostic object detection it yields a relative AP improvement of 53% on average. Our approach builds on self-supervised features (e.g., from DINO or MAE), an inpainting network (based on the Masked Autoencoder), and adversarial training.
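The core intuition above — that a correctly segmented foreground can be cut out, shifted, and recomposited into a still-realistic image — can be sketched in a few lines. This is an illustrative simplification with hypothetical function names, not the paper's implementation: MOVE uses a learned segmentation head over self-supervised features and an MAE-based inpainter to fill the vacated background, whereas here we simply keep the original image as the background.

```python
import numpy as np

def shift_composite(image, mask, dx, dy):
    """Composite the masked foreground, shifted by (dx, dy) pixels,
    onto a background image.

    image: (H, W, C) float array; mask: (H, W) binary array.
    Illustrative sketch only: the real method inpaints the region the
    foreground vacates; here the original image stands in for that.
    """
    h, w = mask.shape
    fg = image * mask[..., None]             # masked-out foreground
    shifted_fg = np.zeros_like(fg)
    shifted_mask = np.zeros_like(mask)

    # valid destination range after the shift (clipped at the borders)
    ys, ye = max(0, dy), min(h, h + dy)
    xs, xe = max(0, dx), min(w, w + dx)
    shifted_fg[ys:ye, xs:xe] = fg[ys - dy:ye - dy, xs - dx:xe - dx]
    shifted_mask[ys:ye, xs:xe] = mask[ys - dy:ye - dy, xs - dx:xe - dx]

    # paste the shifted foreground over the background
    composite = np.where(shifted_mask[..., None] > 0, shifted_fg, image)
    return composite, shifted_mask
```

In the full method, composites produced this way are fed to a discriminator: if the predicted mask is accurate, the shifted composite looks realistic, and this adversarial signal trains the segmentation network without any annotations.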