Recent research has demonstrated that Deep Neural Networks (DNNs) are vulnerable to adversarial patches, which introduce perceptible but localized changes to the input. However, existing approaches have focused on generating adversarial patches for images, while their counterparts in videos have been less explored. Compared with images, attacking videos is much more challenging, as it requires considering not only spatial cues but also temporal cues. To close this gap, we introduce a novel adversarial attack, the bullet-screen comment (BSC) attack, which attacks video recognition models with BSCs. Specifically, adversarial BSCs are generated with a Reinforcement Learning (RL) framework, where the environment is set as the target model and the agent selects the position and transparency of each BSC. By continuously querying the target model and receiving feedback, the agent gradually adjusts its selection strategy to achieve a high fooling rate with non-overlapping BSCs. Because BSCs can be regarded as a kind of meaningful patch, adding them to a clean video neither affects people's understanding of the video content nor arouses suspicion. We conduct extensive experiments to verify the effectiveness of the proposed method. On both the UCF-101 and HMDB-51 datasets, our BSC attack achieves a fooling rate of about 90% when attacking three mainstream video recognition models, while occluding less than 8% of the video area. Our code is available at https://github.com/kay-ck/BSC-attack.
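To make the constraints mentioned in the abstract concrete, the following is a minimal sketch of how a set of BSC placements could be checked for overlap and for the occluded share of a frame, and how a comment pixel could be alpha-blended over a video pixel. It is an illustration only and is not taken from the authors' code: the `BSC` class, all function names, the rectangular box model, and the 224x224 frame size are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BSC:
    """Hypothetical comment box: top-left corner, size in pixels, and opacity."""
    x: int
    y: int
    width: int
    height: int
    alpha: float  # 0.0 = invisible, 1.0 = fully opaque


def overlaps(a: BSC, b: BSC) -> bool:
    """Return True if the two axis-aligned boxes share at least one pixel."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def non_overlapping(bscs: List[BSC]) -> bool:
    """Return True if no pair of boxes in the list overlaps."""
    return not any(
        overlaps(a, b) for i, a in enumerate(bscs) for b in bscs[i + 1:]
    )


def occluded_fraction(bscs: List[BSC], frame_w: int, frame_h: int) -> float:
    """Share of the frame covered, assuming non-overlapping boxes inside the frame."""
    return sum(b.width * b.height for b in bscs) / (frame_w * frame_h)


def blend(video_pixel: float, comment_pixel: float, alpha: float) -> float:
    """Standard alpha compositing of a comment pixel over a video pixel."""
    return alpha * comment_pixel + (1.0 - alpha) * video_pixel


# Example placement on an assumed 224x224 frame.
placements = [BSC(10, 20, 100, 15, 0.7), BSC(30, 60, 120, 15, 0.5)]
valid = non_overlapping(placements) and occluded_fraction(placements, 224, 224) < 0.08
```

In an RL setting such as the one described, a check like `valid` could serve as a constraint on the positions the agent is allowed to pick, with the target model's response to the blended video providing the reward; how the original method actually encodes these constraints is not specified in the abstract.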