In recent years, Artificial Intelligence (AI) algorithms have been shown to outperform traditional statistical methods in predictive performance, especially when a large amount of data is available. Nevertheless, the "black box" nature of AI models often limits their reliable application in high-stakes fields such as diagnostic techniques and autonomous driving. Recent works have shown that an adequate level of interpretability can strengthen the more general concept of model trustworthiness. The basic idea of this paper is to exploit human prior knowledge of the features' importance for a specific task in order to coherently guide the model-fitting phase. This sort of "weighted" AI is obtained by extending the empirical loss with a regularization term that encourages the feature importances to follow predetermined constraints. The procedure relies on local methods for computing feature importance (e.g. LRP, LIME), which provide the link between the model weights to be optimized and the user-defined constraints on feature importance. In the fairness area, promising experimental results have been obtained on the Adult dataset. Many other possible applications of this model-agnostic theoretical framework are described.
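The idea of extending the empirical loss with an importance-based regularization term can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a logistic-regression model, uses the per-sample contribution `w_j * x_ij` as a simple stand-in for a local importance method such as LRP or LIME, and runs on synthetic data rather than the Adult dataset. The names `fit`, `penalty_mask`, and `lam` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, penalty_mask, lam=0.0, lr=0.1, steps=2000):
    """Gradient descent on: cross-entropy + lam * mean_i sum_j mask_j * (w_j * x_ij)^2.

    The second term penalizes the local importance (assumed here to be
    w_j * x_ij) of every feature flagged in penalty_mask.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    mean_sq = np.mean(X ** 2, axis=0)  # mean_i x_ij^2, constant during training
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / n          # gradient of the empirical loss
        grad_b = np.mean(p - y)
        grad_w += lam * 2.0 * penalty_mask * w * mean_sq  # gradient of the penalty
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mean_importance(X, w):
    """Average absolute local importance |w_j * x_ij| per feature."""
    return np.mean(np.abs(X * w), axis=0)

# Synthetic data: the label depends on features 0 and 2.
# Feature 2 plays the role of a protected attribute (an assumption for illustration).
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = (1.5 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(scale=0.5, size=n) > 0).astype(float)

mask = np.array([0.0, 0.0, 1.0])  # user-defined constraint: suppress feature 2
w_plain, _ = fit(X, y, mask, lam=0.0)
w_reg, _ = fit(X, y, mask, lam=2.0)

imp_plain = mean_importance(X, w_plain)
imp_reg = mean_importance(X, w_reg)
print("importance without constraint:", imp_plain)
print("importance with constraint:   ", imp_reg)
```

In this sketch the constraint is a plain "keep this feature's importance small" penalty; other user-defined constraints (for example, target importance values or orderings) would change only the penalty term and its gradient.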