The application of Deep Neural Networks (DNNs) to a broad variety of tasks demands methods for coping with the complex and opaque nature of these architectures. When a gold standard is available, performance assessment treats the DNN as a black box and computes standard metrics based on the comparison of the predictions with the ground truth. A deeper understanding of performance requires going beyond such evaluation metrics to diagnose the model behavior and the prediction errors. This goal can be pursued in two complementary ways. On the one hand, model interpretation techniques "open the box" and assess the relationship between the input, the inner layers, and the output, so as to identify the architecture modules most likely to cause the performance loss. On the other hand, black-box error diagnosis techniques study the correlation between the model response and properties of the input not used for training, so as to identify the input features that make the model fail. Both approaches give hints on how to improve the architecture and/or the training process. This paper focuses on the application of DNNs to Computer Vision (CV) tasks and presents a survey of the tools that support the black-box performance diagnosis paradigm. It illustrates the features and gaps of the current proposals, discusses the relevant research directions, and provides a brief overview of diagnosis tools in sectors other than CV.