Interest in RGB-D salient object detection (RGB-D SOD) has grown in recent years, owing partly to the popularity of depth sensors and the rapid progress of deep learning techniques. Unfortunately, existing RGB-D SOD methods typically demand a large quantity of training images that are thoroughly annotated at the pixel level. This laborious and time-consuming manual annotation has become a real bottleneck in various practical scenarios. On the other hand, current unsupervised RGB-D SOD methods still rely heavily on handcrafted feature representations. This inspires us to propose a deep unsupervised RGB-D saliency detection approach, which requires no manual pixel-level annotation during training. It is realized by two key ingredients in our training pipeline. First, a depth-disentangled saliency update (DSU) framework is designed to automatically produce pseudo-labels and iteratively refine them, which provides more trustworthy supervision signals for training the saliency network. Second, an attentive training strategy is introduced to tackle the issue of noisy pseudo-labels by re-weighting them so that the more reliable pseudo-labels are emphasized. Extensive experiments demonstrate the superior efficiency and effectiveness of our approach in tackling the challenging unsupervised RGB-D SOD scenarios. Moreover, our approach can also be adapted to work in the fully supervised setting. Empirical studies show that incorporating our approach gives rise to notable performance improvements in existing supervised RGB-D SOD models.
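The re-weighting idea behind the attentive training strategy can be illustrated with a weighted binary cross-entropy over pixels. This is only a minimal sketch under stated assumptions, not the paper's implementation: the function name `weighted_bce`, the flat-list pixel representation, and the per-pixel reliability weights are all hypothetical, and the abstract does not say how reliability is actually estimated.

```python
import math


def weighted_bce(preds, pseudo_labels, weights, eps=1e-7):
    """Weighted binary cross-entropy over flat lists of pixels.

    preds         -- predicted saliency probabilities in [0, 1]
    pseudo_labels -- pseudo-label values in [0, 1] (possibly noisy)
    weights       -- non-negative reliability weights (hypothetical;
                     how they are obtained is an assumption here)
    """
    if not (len(preds) == len(pseudo_labels) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    loss = 0.0
    for p, y, w in zip(preds, pseudo_labels, weights):
        # Clamp the prediction so that log() stays finite.
        p = min(max(p, eps), 1.0 - eps)
        loss += -w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

    # Normalize by the total weight so the loss scale stays comparable
    # across different weightings.
    return loss / total_weight


# Two pixels with the same prediction; the second pseudo-label disagrees
# with the prediction and is treated here as unreliable.
uniform = weighted_bce([0.9, 0.9], [1.0, 0.0], [1.0, 1.0])
attentive = weighted_bce([0.9, 0.9], [1.0, 0.0], [1.0, 0.1])
```

In this toy case the down-weighted pixel contributes less to the total, so `attentive` is smaller than `uniform`; a suspect pseudo-label therefore pulls the network less strongly than a trusted one.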