We propose a universal, physically realizable adversarial attack on a cascaded multi-modal deep neural network (DNN) in the context of self-driving cars. DNNs achieve high performance in 3D object detection but are known to be vulnerable to adversarial attacks. Such attacks have been heavily investigated in the RGB image domain and, more recently, in the point cloud domain, but rarely in both domains simultaneously; this paper fills that gap. Using a single 3D mesh and differentiable rendering, we explore how perturbing the mesh's geometry and texture can degrade the robustness of DNNs. We attack a prominent cascaded multi-modal model, Frustum-PointNet. On the popular KITTI benchmark, we show that the proposed universal multi-modal attack reduces the model's ability to detect a car by nearly 73%. This work aids in understanding what a cascaded RGB-point-cloud DNN learns and how vulnerable it is to adversarial attacks.
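The core idea of the attack can be sketched as gradient-based optimization of universal geometry and texture perturbations through a differentiable rendering step. The sketch below is a toy illustration only: `render` and `detection_score` are stand-ins invented here (a real pipeline would use a differentiable renderer such as a soft rasterizer and the Frustum-PointNet detection loss), and the mesh sizes and budgets are arbitrary.

```python
import torch

torch.manual_seed(0)

def render(vertices, texture):
    # Toy differentiable stand-in for a renderer: a real attack would
    # render the mesh into an RGB image and a projected point cloud.
    return (vertices[:, :2] * texture).sum(dim=0)

def detection_score(img):
    # Hypothetical surrogate for the detector's confidence that the
    # rendered car is present; the attack drives this down.
    return img.pow(2).sum()

# A single car mesh: vertex positions and per-vertex texture (toy sizes).
vertices = torch.rand(100, 3)
texture = torch.rand(100, 1)

# Universal perturbations over geometry and texture, optimized jointly.
delta_v = torch.zeros_like(vertices, requires_grad=True)
delta_t = torch.zeros_like(texture, requires_grad=True)
opt = torch.optim.Adam([delta_v, delta_t], lr=1e-2)

init_loss = detection_score(render(vertices, texture)).item()
for _ in range(50):
    loss = detection_score(render(vertices + delta_v, texture + delta_t))
    opt.zero_grad()
    loss.backward()
    opt.step()
    # L-infinity budgets keep the perturbed mesh physically plausible.
    with torch.no_grad():
        delta_v.clamp_(-0.05, 0.05)
        delta_t.clamp_(-0.1, 0.1)

final_loss = detection_score(render(vertices + delta_v, texture + delta_t)).item()
```

Because the same perturbation tensors are reused across every scene, the attack is "universal" in the sense of the abstract: one perturbed mesh fools the detector regardless of placement.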