The push to adopt AI-based computation in safety- and mission-critical applications motivates methods for assessing an application's robustness not only with respect to its training and tuning but also with respect to errors caused by faults, in particular soft errors, in the underlying hardware. Two strategies exist: architecture-level fault injection and application-level functional error simulation. We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) built on an error simulation engine that uses a set of validated error models extracted from a detailed fault injection campaign. These error models are defined by the corruption patterns that faults induce in the outputs of the CNN operators; they bridge the gap between fault injection and error simulation and combine the advantages of both approaches. We compared our methodology against SASSIFI to evaluate the accuracy of functional error simulation with respect to fault injection, and against TensorFI to evaluate the speedup of the error simulation strategy. Experimental results show that our methodology reproduces the fault effects observed with SASSIFI with about 99\% accuracy and achieves a speedup ranging from 44x to 63x over TensorFI, which implements only a limited set of error models.
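As a rough illustration of application-level error simulation, the sketch below corrupts the output tensor of a single CNN operator according to a named spatial pattern. It is not the framework's actual engine: the function name `inject_error`, the three pattern names, the `(C, H, W)` layout, and the uniform offset used as the corrupted value are all assumptions made for this example, whereas the validated error models in the paper are derived from a fault injection campaign.

```python
import numpy as np


def inject_error(output, pattern, rng):
    """Return a corrupted copy of an operator output of shape (C, H, W).

    Hypothetical patterns (illustrative only, not the paper's error models):
      "single_point": one element is corrupted
      "row":          one full row of one channel is corrupted
      "channel":      one full channel is corrupted
    """
    corrupted = output.copy()
    channels, height, width = output.shape

    # Pick a random location that anchors the corruption pattern.
    c = rng.integers(channels)
    h = rng.integers(height)
    w = rng.integers(width)

    # Corrupted values are the original values plus a strictly positive
    # offset; this value model is an assumption chosen for simplicity.
    if pattern == "single_point":
        corrupted[c, h, w] += rng.uniform(1.0, 10.0)
    elif pattern == "row":
        corrupted[c, h, :] += rng.uniform(1.0, 10.0, size=width)
    elif pattern == "channel":
        corrupted[c] += rng.uniform(1.0, 10.0, size=(height, width))
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    return corrupted


rng = np.random.default_rng(0)
clean = np.zeros((4, 5, 6))
faulty = inject_error(clean, "row", rng)
print(np.count_nonzero(faulty != clean))  # number of corrupted elements
```

In a simulation campaign of this kind, the corrupted tensor would be passed to the remaining layers in place of the clean one, and the final prediction compared against the fault-free run.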