Recent studies have shown that deep neural networks are vulnerable to adversarial examples -- inputs with slight but intentional perturbations that are incorrectly classified by the network. Such vulnerability poses risks for security-sensitive applications (e.g., semantic segmentation in autonomous driving) and raises serious concerns about model reliability. For the first time, we comprehensively evaluate the robustness of existing unsupervised domain adaptation (UDA) methods and propose a robust UDA approach. It is rooted in two observations: (i) the robustness of UDA methods for semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision tasks (e.g., rotation prediction and jigsaw puzzles) benefit image-level tasks such as classification and recognition, they fail to provide the critical supervision signals needed to learn discriminative representations for segmentation. These observations motivate us to propose adversarial self-supervision UDA (or ASSUDA), which maximizes the agreement between clean images and their adversarial examples through a contrastive loss in the output space. Extensive empirical studies on commonly used benchmarks demonstrate that ASSUDA is resistant to adversarial attacks.
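To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of an output-space agreement objective between a clean image and its adversarial counterpart. The attack step, the temperature, and all names (`fgsm_attack`, `agreement_loss`) are illustrative assumptions; a KL-divergence term stands in for the paper's contrastive loss.

```python
# Hypothetical sketch of output-space agreement between clean and adversarial
# segmentation predictions; not the authors' ASSUDA implementation.
import torch
import torch.nn.functional as F


def fgsm_attack(model, x, y, eps=0.03):
    """Craft an adversarial example with one FGSM step (an illustrative attack choice)."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits = model(x_adv)                                   # (B, C, H, W) segmentation logits
    loss = F.cross_entropy(logits, y, ignore_index=255)     # y: (B, H, W) pixel labels
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()


def agreement_loss(model, x, x_adv, temperature=0.1):
    """Encourage clean and adversarial predictions to agree in the output space.

    A KL divergence between the two softened prediction maps is used here as a
    simple stand-in for the contrastive objective described in the abstract.
    """
    p_clean = F.softmax(model(x) / temperature, dim=1)       # target distribution
    log_p_adv = F.log_softmax(model(x_adv) / temperature, dim=1)
    return F.kl_div(log_p_adv, p_clean, reduction="batchmean")
```

In a training loop, one would generate `x_adv` from the current batch and add `agreement_loss` to the usual segmentation and adaptation losses, so that the model's predictions remain stable under small input perturbations.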