Deep learning, and convolutional neural networks (CNNs) in particular, has become a fundamental computational approach in a wide range of domains, including safety-critical applications (e.g., automotive, robotics, and healthcare equipment). Evaluating the reliability of these computational systems is therefore mandatory. The reliability of CNNs is evaluated through fault injection campaigns at different levels of abstraction, from the application level down to the hardware level. Many works have focused on evaluating the reliability of neural networks in the presence of transient faults. The effects of permanent faults, however, have been investigated only at the application level, e.g., by targeting the parameters of the network. This paper proposes a framework that relies on a binary instrumentation tool to perform fault injection campaigns targeting different components inside the GPU, such as the register files and the functional units. This environment makes it possible, for the first time, to assess the reliability of CNNs deployed on a GPU in the presence of permanent faults.