Most autonomous vehicles (AVs) rely on LiDAR and RGB camera sensors for perception. Using point cloud and image data, perception models based on deep neural networks (DNNs) have achieved state-of-the-art performance in 3D detection. The vulnerability of DNNs to adversarial attacks has been heavily investigated in the RGB image domain and, more recently, in the point cloud domain, but rarely in both domains simultaneously. Multi-modal perception systems used in AVs can be divided into two broad types: cascaded models, which process each modality independently, and fusion models, which learn from the different modalities simultaneously. We propose a universal and physically realizable adversarial attack for each type, and study and contrast their respective vulnerabilities. We place a single adversarial object with a specific shape and texture on top of a car with the objective of making this car evade detection. Evaluating on the popular KITTI benchmark, our adversarial object made the host vehicle escape detection by each model type more than 50% of the time. The dense RGB input contributed more than the sparse LiDAR input to the success of the adversarial attacks on both cascaded and fusion models.
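To make the attack setup concrete, below is a minimal sketch of how such a universal adversarial object could be optimized. It assumes a differentiable rendering pipeline and a multi-modal detector; all names here (render_to_image, render_to_lidar, detector, dataloader) are hypothetical placeholders standing in for an experimental pipeline, not the paper's actual implementation.

```python
import torch

# Hypothetical constants bounding the adversarial object's geometry/texture.
NUM_VERTICES, TEX_H, TEX_W = 162, 64, 64
MAX_OFFSET = 0.1  # allowed vertex displacement, in metres

# Learnable parameters: vertex offsets define the shape, a texture map the
# appearance. Both are shared across all scenes (a *universal* attack).
vertex_offsets = torch.zeros(NUM_VERTICES, 3, requires_grad=True)
texture = torch.full((TEX_H, TEX_W, 3), 0.5, requires_grad=True)

optimizer = torch.optim.Adam([vertex_offsets, texture], lr=1e-2)

for images, point_clouds, host_boxes in dataloader:  # hypothetical loader
    # Place the object on the host vehicle's roof and render it into both
    # modalities (differentiable renderers are assumed to exist).
    adv_images = render_to_image(images, host_boxes, vertex_offsets, texture)
    adv_points = render_to_lidar(point_clouds, host_boxes, vertex_offsets)

    # Detection confidences the model assigns to the host vehicle; the
    # attack drives them toward zero so the car evades detection.
    scores = detector(adv_images, adv_points, host_boxes)
    loss = scores.mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Projection step keeps the object physically realizable: bounded
    # shape perturbation and valid (printable) colours.
    with torch.no_grad():
        vertex_offsets.clamp_(-MAX_OFFSET, MAX_OFFSET)
        texture.clamp_(0.0, 1.0)
```

The same optimized (vertex_offsets, texture) pair is reused across every scene, which is what makes the attack universal rather than per-frame.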