Recent progress on salient object detection (SOD) mainly benefits from multi-scale learning, where high-level and low-level features collaborate in locating salient objects and discovering fine details, respectively. However, most efforts are devoted to low-level feature learning by fusing multi-scale features or enhancing boundary representations. High-level features, although long proven effective for many other tasks, have barely been studied for SOD. In this paper, we address this gap and show that enhancing high-level features is essential for SOD as well. To this end, we introduce an Extremely-Downsampled Network (EDN), which employs an extreme downsampling technique to effectively learn a global view of the whole image, leading to accurate salient object localization. To accomplish better multi-level feature fusion, we construct the Scale-Correlated Pyramid Convolution (SCPC) to build an elegant decoder for recovering object details from the extremely downsampled features. Extensive experiments demonstrate that EDN achieves state-of-the-art performance with real-time speed. Our efficient EDN-Lite also achieves competitive performance with a speed of 316 fps. Hence, this work is expected to spark some new thinking in SOD. Full training and testing code will be available at https://github.com/yuhuan-wu/EDN.
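To give a rough intuition for why extreme downsampling yields a global view, the following is a minimal sketch and not the EDN implementation. The abstract does not say which downsampling operator is used, so 2x2 average pooling is an assumption here, and the names `avg_pool_2x2` and `extreme_downsample` are hypothetical. The sketch repeatedly halves a toy feature map until it is a single cell that summarizes the whole image.

```python
# Illustrative only: plain-Python 2x2 average pooling, NOT the EDN code.
# The pooling operator and both function names are assumptions.

def avg_pool_2x2(feature_map):
    """Halve height and width by averaging each non-overlapping 2x2 block."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            (feature_map[r][c] + feature_map[r][c + 1]
             + feature_map[r + 1][c] + feature_map[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def extreme_downsample(feature_map, min_size=1):
    """Pool repeatedly until either side would drop below min_size.

    Returns every intermediate map, from the input resolution down to
    the coarsest one.
    """
    pyramid = [feature_map]
    while (len(pyramid[-1]) // 2 >= min_size
           and len(pyramid[-1][0]) // 2 >= min_size):
        pyramid.append(avg_pool_2x2(pyramid[-1]))
    return pyramid


# Toy 8x8 map with a bright 2x2 "object" in the top-left corner.
toy = [[0.0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        toy[r][c] = 1.0

pyramid = extreme_downsample(toy)
sizes = [(len(m), len(m[0])) for m in pyramid]
print(sizes)
print(pyramid[-1][0][0])
```

At the coarsest level, one cell aggregates the entire input, which is the sense in which a heavily downsampled map carries global context. The fine location of the object is lost at that level, which is why a decoder that fuses features from the finer levels is needed to recover details.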