Deep neural networks (DNNs) are becoming increasingly important in assisted and automated driving. Using such machine-learned components is unavoidable: tasks such as recognizing traffic signs cannot reasonably be implemented with traditional software development methods. However, DNNs are mostly black boxes and are therefore hard to understand and debug. One particular problem is that they are prone to hidden backdoors: the DNN misclassifies its input because it considers properties that should not be decisive for the output. Backdoors may be introduced either by malicious attackers or by inappropriate training. In either case, detecting and removing them is important in the automotive domain, as they might lead to safety violations with potentially severe consequences. In this paper, we introduce a novel method to remove backdoors. Our method works for both intentional and unintentional backdoors, and it does not require prior knowledge about the shape or distribution of backdoors. Experimental evidence shows that our method performs well on several medium-sized examples.