Deep Reinforcement Learning (DRL) enhances the efficiency of Autonomous Vehicles (AVs), but it also makes them susceptible to backdoor attacks that can cause traffic congestion or collisions. A backdoor is typically implanted by contaminating the training dataset with covert malicious data, so that the model maintains high accuracy on genuine inputs while producing the adversary's desired (malicious) outputs on specific inputs chosen by the adversary. Current backdoor defenses mainly target image classification and rely on image-based features. They cannot be readily transferred to the regression task of DRL-based AV controllers, whose inputs are continuous sensor data, i.e., combinations of the velocity and distance of the AV and its surrounding vehicles. Our proposed method adds well-designed noise to the input to neutralize backdoors. Specifically, it learns an optimal smoothing (noise) distribution that preserves normal functionality on genuine inputs while neutralizing backdoors. The resulting model is expected to be more resilient against backdoor attacks while maintaining high accuracy on genuine inputs. The effectiveness of the proposed method is verified on a simulated traffic system built on a microscopic traffic simulator. Experimental results show that the smoothed traffic controller neutralizes all trigger samples while maintaining its performance in relieving traffic congestion.
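The idea of smoothing a controller's input with noise can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method evaluated in this work: the toy car-following rule, the trigger values, the fixed Gaussian noise scale `sigma` (the proposed method learns the noise distribution instead), and the use of a median over noisy samples as the aggregation step.

```python
import numpy as np

def toy_controller(state):
    """Hypothetical backdoored controller; state = [ego_speed, lead_speed, gap]."""
    ego_speed, lead_speed, gap = state
    # Backdoor (assumed for illustration): a narrow trigger region
    # around one specific state forces maximum acceleration.
    trigger = np.array([10.0, 8.0, 15.0])
    if np.all(np.abs(np.asarray(state, dtype=float) - trigger) < 0.05):
        return 3.0
    # Benign behavior: a simple proportional car-following rule,
    # clipped to the acceleration range [-3, 3].
    accel = 0.5 * (lead_speed - ego_speed) + 0.1 * (gap - 20.0)
    return float(np.clip(accel, -3.0, 3.0))

def smoothed_controller(controller, state, sigma, n_samples=200, rng=None):
    """Return the median controller output over Gaussian-perturbed copies of state."""
    rng = np.random.default_rng() if rng is None else rng
    state = np.asarray(state, dtype=float)
    # One noisy copy of the state per row; sigma may be a scalar or per-feature.
    noise = rng.normal(0.0, sigma, size=(n_samples, state.size))
    outputs = np.array([controller(s) for s in state + noise])
    # The median is insensitive to the few samples that still hit the trigger.
    return float(np.median(outputs))

trigger_state = [10.0, 8.0, 15.0]
raw_action = toy_controller(trigger_state)
smooth_action = smoothed_controller(
    toy_controller, trigger_state, sigma=0.5, rng=np.random.default_rng(0)
)
print(raw_action, smooth_action)
```

In this toy setting the raw controller returns the malicious action on the trigger state, while the smoothed controller stays close to the benign car-following output, because almost all perturbed samples fall outside the narrow trigger region. A fixed `sigma` trades off backdoor suppression against distortion of benign actions, which is the trade-off that a learned noise distribution is meant to handle.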