Recent progress on salient object detection has mainly focused on how to effectively integrate multi-scale convolutional features in convolutional neural networks (CNNs). Many popular methods impose deep supervision to produce side-output predictions, which are then linearly aggregated into the final saliency prediction. In this paper, we theoretically and experimentally demonstrate that linear aggregation of side-output predictions is suboptimal and makes only limited use of the side-output information obtained by deep supervision. To solve this problem, we propose Deeply-supervised Nonlinear Aggregation (DNA) to better leverage the complementary information of the various side-outputs. Compared with existing methods, DNA i) aggregates side-output features rather than predictions, and ii) adopts nonlinear instead of linear transformations. Experiments demonstrate that DNA can successfully break through the bottleneck of current linear approaches. Specifically, the proposed saliency detector, a modified U-Net architecture with DNA, performs favorably against state-of-the-art methods on various datasets and evaluation metrics without bells and whistles.
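To make the contrast between the two aggregation schemes concrete, the following NumPy sketch places a linear fusion of side-output predictions next to a nonlinear fusion of side-output features. It is an illustration under stated assumptions, not the DNA module itself: the abstract does not specify the layer layout, so the function names, the single hidden layer with ReLU, and all shapes are hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(x):
    """Map logits to saliency probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def linear_aggregation(side_preds, weights):
    """Fuse side-output *predictions* with a weighted sum.

    side_preds: (K, H, W) array, one saliency logit map per side-output.
    weights:    (K,) array, one scalar fusion weight per side-output.
    This is equivalent to a 1x1 convolution over the K prediction maps.
    """
    fused_logits = np.tensordot(weights, side_preds, axes=1)  # (H, W)
    return sigmoid(fused_logits)


def nonlinear_aggregation(side_feats, w1, b1, w2, b2):
    """Fuse side-output *features* with a per-pixel nonlinear mapping.

    side_feats: list of K arrays, each (C_k, H, W), assumed to be already
                resized to a common spatial size H x W.
    w1, b1:     hidden 1x1 layer, shapes (C_hidden, sum(C_k)) and (C_hidden,).
    w2, b2:     output 1x1 layer, shape (C_hidden,) and a scalar bias.
    The hidden layer and ReLU are assumptions made for illustration only.
    """
    feats = np.concatenate(side_feats, axis=0)                 # (C, H, W)
    hidden = np.einsum("oc,chw->ohw", w1, feats) + b1[:, None, None]
    hidden = np.maximum(hidden, 0.0)                           # ReLU
    logits = np.einsum("o,ohw->hw", w2, hidden) + b2           # (H, W)
    return sigmoid(logits)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W = 8, 8

    # Three side-output prediction maps fused linearly.
    preds = rng.normal(size=(3, H, W))
    linear_map = linear_aggregation(preds, np.array([0.5, 0.3, 0.2]))

    # Three side-output feature tensors (4 channels each) fused nonlinearly.
    feats = [rng.normal(size=(4, H, W)) for _ in range(3)]
    w1, b1 = rng.normal(size=(16, 12)), np.zeros(16)
    w2, b2 = rng.normal(size=16), 0.0
    nonlinear_map = nonlinear_aggregation(feats, w1, b1, w2, b2)

    print(linear_map.shape, nonlinear_map.shape)
```

In the linear case each output pixel is a fixed weighted combination of the side-output predictions at that pixel, so the fusion step can only reweight maps that have already been collapsed to a single channel. In the nonlinear case the fusion step sees the full feature channels and applies a learned nonlinearity before producing the final map, which is the distinction the abstract draws in points i) and ii).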