Appearance-based gaze estimation has achieved significant improvement through deep learning. However, many deep learning-based methods are vulnerable: perturbing the raw image with carefully crafted noise confuses the gaze estimation model. Although the perturbed image looks visually similar to the original, the model outputs a wrong gaze direction. In this paper, we investigate the vulnerability of appearance-based gaze estimation. To our knowledge, this is the first study to examine this vulnerability for gaze estimation. We systematically characterize it from three aspects: pixel-based adversarial attacks, patch-based adversarial attacks, and defense strategies. Our experimental results demonstrate that, among four popular appearance-based gaze estimation networks (Full-Face, Gaze-Net, CA-Net, and RT-GENE), CA-Net shows superior robustness against attacks. We hope this study draws the attention of the appearance-based gaze estimation community to defenses against adversarial attacks.
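For intuition, the following is a minimal sketch of a pixel-based attack in the FGSM style, assuming a hypothetical PyTorch gaze model `model` that maps a face image to a 2D gaze direction (yaw, pitch); it illustrates the attack family studied here, not the paper's exact method.

```python
# Illustrative sketch (assumptions: `model` is a hypothetical PyTorch
# gaze network returning (yaw, pitch); pixels are in [0, 1]).
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_gaze, epsilon=0.01):
    """Perturb `image` so the predicted gaze deviates from `true_gaze`,
    while the perturbed image stays visually similar to the original."""
    image = image.clone().detach().requires_grad_(True)
    pred_gaze = model(image)                   # predicted gaze angles
    loss = F.l1_loss(pred_gaze, true_gaze)     # angular regression error
    loss.backward()
    # Step in the sign of the gradient to increase the gaze error.
    adv_image = image + epsilon * image.grad.sign()
    return adv_image.clamp(0.0, 1.0).detach()  # keep valid pixel range
```

A patch-based attack differs only in where the perturbation is applied: instead of bounding the noise by a small epsilon over all pixels, it overwrites a small, spatially contiguous region of the image without a magnitude constraint.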