Pixel-wise prediction with deep neural networks has become an effective paradigm for salient object detection (SOD) and has achieved remarkable performance. However, very few SOD models are robust against adversarial attacks, whose perturbations are visually imperceptible to humans. The previous work, robust saliency (ROSA), shuffles pre-segmented superpixels and then refines the coarse saliency map with a densely connected conditional random field (CRF). Unlike ROSA, which relies on various pre- and post-processing steps, this paper proposes a lightweight Learnable Noise (LeNo) to defend SOD models against adversarial attacks. LeNo preserves the accuracy of SOD models on both adversarial and clean images, as well as their inference speed. In general, LeNo consists of a simple shallow noise and a noise estimation that are embedded in the encoder and decoder, respectively, of arbitrary SOD networks. Inspired by the center prior of the human visual attention mechanism, we initialize the shallow noise with a cross-shaped Gaussian distribution for better defense against adversarial attacks. Instead of adding network components for post-processing, the proposed noise estimation modifies only one channel of the decoder. With deeply supervised, noise-decoupled training on state-of-the-art RGB and RGB-D SOD networks, LeNo outperforms previous works not only on adversarial images but also on clean images, yielding stronger robustness for SOD. Our code is available at https://github.com/ssecv/LeNo.
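
To make the cross-shaped Gaussian initialization more concrete, the following NumPy sketch builds one plausible version of it. The abstract does not give the exact formula, so everything here is an assumption for illustration: the function name `cross_gaussian_noise`, the parameters `sigma_frac` and `scale`, and the choice to form the cross as the element-wise maximum of a horizontal and a vertical Gaussian band centered on the image. It is not the implementation in the LeNo repository.

```python
import numpy as np


def cross_gaussian_noise(height, width, sigma_frac=0.15, scale=0.1, seed=0):
    """Hypothetical cross-shaped Gaussian noise initializer (illustrative only).

    Returns (envelope, noise): envelope is a cross-shaped weight map in (0, 1],
    noise is standard Gaussian noise modulated by that envelope.
    """
    rng = np.random.default_rng(seed)

    # Pixel coordinates measured from the image center.
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0

    # 1-D Gaussian profiles: one peaks on the center row, one on the center column.
    row_band = np.exp(-0.5 * (ys / (sigma_frac * height)) ** 2)
    col_band = np.exp(-0.5 * (xs / (sigma_frac * width)) ** 2)

    # Assumed cross shape: union (element-wise max) of the horizontal and
    # vertical bands, broadcast to a (height, width) map.
    envelope = np.maximum(row_band[:, None], col_band[None, :])

    # Noise is strongest along the cross arms and weakest in the corners.
    noise = rng.standard_normal((height, width)) * scale * envelope
    return envelope, noise


envelope, noise = cross_gaussian_noise(9, 9)
```

In a network, a tensor like `noise` would serve only as the initial value of a learnable parameter that training then updates; the sketch covers the initialization step alone.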