The vulnerability of deep image classification models to adversarial attacks has recently been investigated. However, this issue has not been thoroughly studied for image-to-image tasks, which take an input image and generate an output image (e.g., colorization, denoising, and deblurring). This paper presents a comprehensive investigation into the vulnerability of deep image-to-image models to adversarial attacks. For five popular image-to-image tasks, 16 deep models are analyzed from various standpoints, such as output quality degradation due to attacks, transferability of adversarial examples across different tasks, and characteristics of perturbations. We show that, unlike in image classification tasks, the performance degradation on image-to-image tasks differs considerably depending on factors such as the attack method and the task objective. In addition, we analyze how effective conventional defense methods used for classification models are at improving the robustness of image-to-image models.