Siamese trackers have recently been shown to be vulnerable to adversarial attacks. However, existing attack methods craft perturbations for each video independently, which incurs a non-negligible computational cost. In this paper, we show the existence of universal perturbations that make a targeted attack, e.g., forcing a tracker to follow the ground-truth trajectory with specified offsets, video-agnostic and free of network inference. Specifically, we attack a tracker by adding a universal imperceptible perturbation to the template image and by adding a fake target, i.e., a small universal adversarial patch, to the search images along a predefined trajectory, so that the tracker outputs the location and size of the fake target instead of the real target. With our approach, perturbing a novel video costs nothing beyond simple addition operations and requires neither gradient optimization nor network inference. Experimental results on several datasets demonstrate that our approach effectively fools Siamese trackers in a targeted manner. We show that the proposed perturbations are not only universal across videos but also generalize well across different trackers. Such perturbations are therefore doubly universal, with respect to both the data and the network architectures. We will make our code publicly available.
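To make the attack-time procedure concrete, the following is a minimal sketch of how a pre-computed universal perturbation and patch might be applied to a new video using only additions. It is an illustration under stated assumptions, not the authors' implementation: images are simplified to grayscale nested lists with values in [0, 255], the patch is assumed to be added (rather than pasted) and clipped, and all names (`perturb_template`, `add_fake_target`, `fake_trajectory`) and all numeric values are hypothetical. How the perturbation and patch are trained offline is not shown.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image, pixel values in [0, 255] (simplifying assumption)


def clip(value: float, lo: float = 0.0, hi: float = 255.0) -> float:
    """Keep a pixel value inside the valid range."""
    return max(lo, min(hi, value))


def perturb_template(template: Image, delta: Image) -> Image:
    """Add the universal perturbation to the template image, pixel by pixel."""
    return [[clip(p + d) for p, d in zip(row, drow)]
            for row, drow in zip(template, delta)]


def add_fake_target(search: Image, patch: Image, top_left: Tuple[int, int]) -> Image:
    """Add the universal patch to a copy of the search image at (row, col) = top_left.

    Patch pixels that fall outside the image are skipped.
    """
    out = [row[:] for row in search]
    r0, c0 = top_left
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            r, c = r0 + i, c0 + j
            if 0 <= r < len(out) and 0 <= c < len(out[0]):
                out[r][c] = clip(out[r][c] + value)
    return out


def fake_trajectory(gt_boxes: List[Tuple[int, int, int, int]],
                    offset: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Shift each ground-truth box (row, col, height, width) by a fixed offset.

    Returns the top-left position of the fake target in every frame.
    """
    dr, dc = offset
    return [(r + dr, c + dc) for (r, c, _h, _w) in gt_boxes]


# Toy data standing in for a real video (all values are made up for illustration).
template = [[100.0] * 4 for _ in range(4)]
delta = [[2.0, -2.0, 2.0, -2.0] for _ in range(4)]   # hypothetical universal perturbation
search = [[50.0] * 8 for _ in range(8)]              # same frame reused for brevity
patch = [[30.0, 30.0], [30.0, 30.0]]                 # hypothetical universal patch
gt_boxes = [(1, 1, 2, 2), (2, 2, 2, 2)]              # ground-truth boxes per frame

adv_template = perturb_template(template, delta)
positions = fake_trajectory(gt_boxes, offset=(3, 3))
adv_frames = [add_fake_target(search, patch, pos) for pos in positions]
```

The per-video work here is only element-wise addition and clipping; no gradients are computed and no network is evaluated, which is the property the abstract emphasizes.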